[00:41] *** JesseW has joined #archiveteam-bs [00:47] *** RichardG has quit IRC (Read error: Operation timed out) [01:09] *** DoomTay has joined #archiveteam-bs [01:38] *** tomwsmf-a has quit IRC (Read error: Operation timed out) [02:05] *** Aranje has joined #archiveteam-bs [02:10] *** Sum has joined #archiveteam-bs [02:10] <Sum> well shit [02:10] <Sum> http://postghost.com/Home/Shutdown/ [02:10] <JesseW> Already heard about it [02:10] <Sum> this is BS [02:10] <Sum> it was a very useful service [02:11] <DoomTay> I didn't even know it existed until now [02:12] <FalconK> neither [02:12] <FalconK> BS indeed though [02:12] <Sum> I think I may have mentioned it a couple days ago when I was following a case [02:12] <Sum> the accountability section of their shutdown statement is so true [02:13] <Sum> without a third-party archive site everyone relies on screenshots of deleted tweets [02:13] <FalconK> there is this european right to be forgotten thing [02:13] <JesseW> I wonder about using hashing as a way to work around this. 
[02:13] <Sum> people with a clue will use archive.is to capture Google cache results but so many are missed [02:13] <JesseW> Don't display the tweets -- display *hashes* of the tweets (and hashes of the account name) [02:14] <xmc> eh [02:14] <FalconK> same problem with it being hard to search as we have now [02:14] <JesseW> that way only the ones that actually otherwise come to public attention are saved -- but they can be *verified* [02:14] <FalconK> makes it very hard to do retrospective analysis [02:15] <xmc> so [02:15] <xmc> if you click the "embed tweet" button [02:15] <JesseW> eh, it's a way to separate verification from publishing [02:15] <xmc> it gives you the full text in html [02:15] <xmc> so you can read it even after it's deleted [02:15] <xmc> it's been this way for years [02:15] <FalconK> there are loads of such archivings of tweets [02:15] <xmc> y e a r s [02:16] <FalconK> I noted a bunch of tweets Cathy Brennan made were archived in this way even after her account is gone, on pages noting the strange sayings she has said [02:16] <FalconK> it's very effective [02:17] <JesseW> it would be ... interesting ... to see twitter try to claim that "something with hash XXX was written by account named hash YYY" is in violation of their agreement. [02:17] <xmc> yeah [02:17] <xmc> JesseW: you're just trying to out-nerd them [02:17] <FalconK> it would [02:17] <xmc> this won't lead anywhere interesting [02:17] <FalconK> but it is odd enough for them to yowl that nobody may keep a record of what they saw [02:17] <FalconK> that is some RIAA-level shit right there [02:17] <xmc> yep [02:18] <JesseW> oh? Why wouldn't it lead to interesting law? 
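JesseW's idea above (publish only hashes of the tweet and of the account name, so a tweet that later surfaces elsewhere can be verified without the archive ever displaying it) can be sketched roughly as follows. This is a minimal illustration and not anything PostGhost or Twitter actually implemented: the function names `commitment` and `verify`, the choice of SHA-256, and the sample account and text are all assumptions made for the example.

```python
import hashlib
import hmac


def commitment(account_name, tweet_text):
    """Return (account_hash, tweet_hash) as hex SHA-256 digests.

    Hypothetical helper: only these digests would be published,
    never the account name or the tweet text itself.
    """
    account_hash = hashlib.sha256(account_name.encode("utf-8")).hexdigest()
    tweet_hash = hashlib.sha256(tweet_text.encode("utf-8")).hexdigest()
    return account_hash, tweet_hash


def verify(account_name, tweet_text, published):
    """Check a claimed (account, text) pair against published digests."""
    account_hash, tweet_hash = commitment(account_name, tweet_text)
    # compare_digest does a constant-time comparison of the hex strings
    return (hmac.compare_digest(account_hash, published[0])
            and hmac.compare_digest(tweet_hash, published[1]))


# Example: the archive publishes only `published`; someone holding a
# screenshot's text can later confirm that it matches.
published = commitment("example_user", "hello world")
print(verify("example_user", "hello world", published))   # matching text
print(verify("example_user", "hello world!", published))  # altered text
```

One caveat with this sketch: account names and short tweets are easy to guess, so unsalted hashes like these could be brute-forced back to the original text. That limits how well the scheme really separates verification from publishing.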
[02:18] <DoomTay> I mean, the POTUS's twitter expressly allows archiving tweets from there [02:18] <FalconK> it would never get to court [02:18] <FalconK> any lawyer worth their scotch would drag that through procedure to avoid that coming up in open court [02:19] <xmc> a smart judge would say "you're trying to be clever but i know what you are doing and it's actually stupid so fuck off" [02:19] <xmc> s/smart/sharp/ [02:19] <FalconK> most judges lack clue [02:19] <FalconK> if it was big enough, though, perhaps the EFF might get interested and write a brief [02:20] <xmc> tru [02:20] <FalconK> that is their thing [02:20] <FalconK> in fact have postghost contacted the EFF I wonder? [02:20] <FalconK> they should [02:20] <JesseW> eh, I'm too hungry right now to continue this argument. So ... conceded, have a nice day. [02:24] <Sum> the thing is they mention another deleted tweets archive, Politwoops, hasn't been hit with a shutdown [02:24] <Sum> they're picking and choosing which sites they want to claim violate their dev agreement [02:24] <DoomTay> I wouldn't take that for granted though [02:25] <xmc> politwoops was shutdown and then reinstated [02:25] <JesseW> (and they mention that, and explain why) [02:26] <Sum> shouldn't there be a loophole to their agreement policy? [02:28] <Sum> that is, what's stopping an archive site like archive.org (or someone else) archiving the deleted tweets *before* the main site officially deletes them [02:29] <Sum> so the main site can say they did 'update' their public feeds to match the originals [02:30] <yipdw> FalconK: probably sending the abort signal is fine. 
another thing I never really figured out with seesaw was how to mark a job as failed [02:30] <yipdw> the warrior system does this occasionally [02:30] <yipdw> but we rarely ever need that in Warrior projects because the better solution is "requeue" [02:31] <yipdw> people have told me to move archivebot away from seesaw and I don't think that's a bad idea, it's just something I can never get myself started on [02:35] *** ravetcofx has quit IRC (Ping timeout: 506 seconds) [02:36] <Aranje> random Q: do any of you know if a kindle paperwhite (perhaps the latest one) will actually display scanned images? Like if one scans a book into a pdf and each page is an image [02:36] <Aranje> or for that usage should some other object be picked [02:38] <Aranje> a friend is trying to figure out what device to take to a sunny sandy place with potentially zero internet :) [02:38] <yipdw> swimsuit [02:39] <Aranje> already packed, but wants to take scanned ebooks as it'll be a long stay [02:39] <yipdw> ah [02:39] <yipdw> http://www.dummies.com/how-to/content/how-to-read-pdf-documents-on-your-kindle-paperwhit.html says the kindle paperwhite is capable of displaying PDFs [02:40] <yipdw> (I have no idea why that was my first search hit; maybe Google is telling me something) [02:40] <Aranje> yeah I know they're supposed to be idiotproof, but it was my understanding that the previous paperwhites could not do what I'm asking [02:41] <Aranje> but perhaps the one just released last month can [02:41] <Aranje> I don't know, nobody says [02:41] <Aranje> and dear lord reading random internet forums full of people with unrelated-to-the-question comments is not how I want to spend my friday [02:41] <Aranje> (lol) [02:44] <yipdw> stack overflow eh [02:47] <Aranje> heh [02:47] <Aranje> I learned this afternoon that kindle has userforums [03:16] *** Sum has quit IRC (Ping timeout: 370 seconds) [03:17] *** Sum has joined #archiveteam-bs [03:29] *** Sum has quit IRC (Ping timeout: 370 seconds) [03:30] 
*** Sum has joined #archiveteam-bs [03:36] <ranma> i like Stack Overflow [03:36] <ranma> or the series of sites [03:36] <Frogging> same [03:36] <ranma> i think StackExchange is the parent site? [03:36] <Frogging> it's like Yahoo Answers but useful [03:38] <ranma> and almost naziistically moderated [03:40] <Frogging> I like that moderators are just high ranking community members that have made a lot of contributions [03:56] <Aranje> I guess we'll order it and find out shortly. fortunately there's time before they leave :) [04:03] *** Sk1d has quit IRC (Ping timeout: 250 seconds) [04:03] <pikhq> Aranje: That's definitely incorrect: as far as I can tell *every* Kindle can display PDFs. [04:04] <pikhq> Though it' [04:04] <pikhq> s a bit less useful for some of the older ones (low res screen makes it not a nice experience) [04:04] <Aranje> Yeah, I know the file format is supported and all that... but non-terrible experience w/image-pages is a different thing [04:05] <Aranje> we're going to see if acrobat's ocr button can help too [04:05] <Aranje> turns out their workplace has their printers set up where you can scan stuff in and it'll autopdf it into an email for you [04:06] <Aranje> so... 
maybe run them through acrobat to see if the filesize can come down [04:06] <Aranje> then put them on the paperwhite and see if it's usable [04:10] *** Sk1d has joined #archiveteam-bs [04:38] *** ravetcofx has joined #archiveteam-bs [04:40] *** Sum has quit IRC (Ping timeout: 370 seconds) [04:41] *** Sum has joined #archiveteam-bs [05:00] *** Sk1d has quit IRC (Ping timeout: 194 seconds) [05:04] *** Sum has quit IRC (Ping timeout: 370 seconds) [05:08] *** Sk1d has joined #archiveteam-bs [05:08] *** Sk1d has quit IRC (Connection closed) [05:10] *** Sk1d has joined #archiveteam-bs [05:17] *** DiscantX has joined #archiveteam-bs [05:31] *** VADemon has joined #archiveteam-bs [05:40] *** DiscantX has quit IRC (Ping timeout: 244 seconds) [06:09] *** vtyl has joined #archiveteam-bs [06:18] *** lytv has quit IRC (Read error: Operation timed out) [06:38] *** JesseW has quit IRC (Ping timeout: 370 seconds) [07:04] *** Sum has joined #archiveteam-bs [07:10] *** DoomTay has quit IRC (Quit: Page closed) [07:22] *** Sum has quit IRC (Ping timeout: 370 seconds) [07:23] *** Sum has joined #archiveteam-bs [07:45] *** Sum has quit IRC (Ping timeout: 370 seconds) [07:46] *** Sum has joined #archiveteam-bs [08:00] *** Sum has quit IRC (Ping timeout: 370 seconds) [08:01] *** Sum has joined #archiveteam-bs [08:14] *** BlueMaxim has joined #archiveteam-bs [08:40] *** Arcai has joined #archiveteam-bs [09:24] *** Sum has quit IRC (Read error: Operation timed out) [09:24] *** Sum has joined #archiveteam-bs [09:32] *** Sum has quit IRC (Ping timeout: 370 seconds) [09:33] *** Sum has joined #archiveteam-bs [10:04] *** Sum has quit IRC (Ping timeout: 370 seconds) [10:05] *** Sum has joined #archiveteam-bs [10:06] *** Arcai has quit IRC (Ping timeout: 268 seconds) [10:16] *** ArgyroNet has joined #archiveteam-bs [10:16] <ArgyroNet> hello there [10:17] <ArgyroNet> any native english-speaking people around ? 
[10:17] <ArgyroNet> or just anyone with a good level [10:19] <ArgyroNet> is "we are still deeply in need of your support" a correct phrase ? [10:20] <dxrt> Yeah, that's fine. [10:21] <ArgyroNet> thanks :) [10:22] *** dxrt- sets mode: +o dxrt [10:26] *** Sum has quit IRC (Ping timeout: 370 seconds) [10:26] *** Sum has joined #archiveteam-bs [10:29] *** VADemon has quit IRC (Quit: left4dead) [10:31] *** RichardG has joined #archiveteam-bs [10:35] *** ArgyroNet has quit IRC (Quit: thanks :)) [10:54] *** Sum has quit IRC (Ping timeout: 370 seconds) [10:55] *** Sum has joined #archiveteam-bs [11:06] *** fie has quit IRC (Ping timeout: 370 seconds) [11:17] *** Sum has quit IRC (Ping timeout: 370 seconds) [11:18] *** Sum has joined #archiveteam-bs [11:33] *** zhongfu has quit IRC (Ping timeout: 260 seconds) [11:38] *** zhongfu has joined #archiveteam-bs [11:44] *** zhongfu has quit IRC (Remote host closed the connection) [11:58] *** zhongfu has joined #archiveteam-bs [12:03] *** zhongfu has quit IRC (Remote host closed the connection) [12:15] *** zhongfu has joined #archiveteam-bs [12:25] *** Sum has quit IRC (Ping timeout: 370 seconds) [12:26] *** Sum has joined #archiveteam-bs [12:45] *** RichardG_ has joined #archiveteam-bs [12:45] *** RichardG has quit IRC (Read error: Connection reset by peer) [12:54] *** zhongfu has quit IRC (Remote host closed the connection) [12:55] *** Sum has quit IRC (Ping timeout: 370 seconds) [12:56] *** zhongfu has joined #archiveteam-bs [12:56] *** mutoso has quit IRC (Read error: Operation timed out) [12:56] *** Sum has joined #archiveteam-bs [12:57] *** mutoso has joined #archiveteam-bs [13:09] *** vitzli has joined #archiveteam-bs [13:21] *** BlueMaxim has quit IRC (Quit: Leaving) [13:28] *** Sum has quit IRC (Ping timeout: 370 seconds) [13:29] *** Sum has joined #archiveteam-bs [13:37] *** Sum has quit IRC (Ping timeout: 370 seconds) [13:37] *** Sum has joined #archiveteam-bs [13:54] *** RichardG_ has quit IRC (Read error: Operation 
timed out) [13:55] *** RichardG has joined #archiveteam-bs [14:00] *** Sum has quit IRC (Ping timeout: 370 seconds) [14:01] *** Sum has joined #archiveteam-bs [14:21] *** Sum has quit IRC (Ping timeout: 370 seconds) [14:22] *** Sum has joined #archiveteam-bs [14:38] *** Sum has quit IRC (Ping timeout: 370 seconds) [14:44] *** Sum has joined #archiveteam-bs [14:56] *** metalcamp has joined #archiveteam-bs [15:06] *** Sum has quit IRC (Ping timeout: 370 seconds) [15:07] *** DoomTay has joined #archiveteam-bs [15:07] *** Sum has joined #archiveteam-bs [15:24] <DoomTay> !ao https://youtu.be/gvuqLylQOoE --youtube-dl [15:25] <dashcloud> ranma: we got a portion of the files section before AOL broke all the links to the files- you can still see descriptions if you go to the libraries, but there's no way to reach the files anymore [15:27] <dashcloud> This search covers the list pretty well: https://archive.org/search.php?query=aol+files [15:36] *** Sum has quit IRC (Ping timeout: 370 seconds) [15:37] *** Sum has joined #archiveteam-bs [15:46] <DoomTay> So, how goes development of that examiner script? 
[15:50] *** Aranje has quit IRC (Quit: Three sheets to the wind) [15:50] *** JesseW has joined #archiveteam-bs [16:09] *** JesseW has quit IRC (Ping timeout: 370 seconds) [16:14] *** SN4T14 has quit IRC (Ping timeout: 370 seconds) [16:38] *** Sum has quit IRC (Read error: Operation timed out) [16:38] *** Sum has joined #archiveteam-bs [16:52] *** RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) [16:53] *** RichardG has joined #archiveteam-bs [16:58] *** mutoso has quit IRC (Read error: Operation timed out) [17:04] *** DiscantX has joined #archiveteam-bs [17:06] *** mutoso has joined #archiveteam-bs [17:06] *** Sum has quit IRC (Ping timeout: 370 seconds) [17:07] *** Sum has joined #archiveteam-bs [17:13] <DoomTay> Oh wow [17:13] <DoomTay> https://github.com/chrislgarry/Apollo-11 [17:15] *** Sum has quit IRC (Ping timeout: 370 seconds) [17:16] *** Sum has joined #archiveteam-bs [17:55] *** Sue_ has quit IRC (Read error: Operation timed out) [18:06] *** Sum has quit IRC (Ping timeout: 370 seconds) [18:08] *** Sue_ has joined #archiveteam-bs [18:09] *** tomwsmf-a has joined #archiveteam-bs [18:18] <bwn> doomtay: agc in js: http://svtsim.com/moonjs/agc.html [18:19] <bwn> also very neat :) [18:27] <MrRadar> Archive Team is currently number 1 on the Hacker News front page: https://news.ycombinator.com/item?id=12062116 [18:27] <MrRadar> About Coursera [18:36] <PurpleSym> Someone should tell them that courses are browseable through the Wayback Machine, aren’t they? [18:36] *** JesseW has joined #archiveteam-bs [18:37] *** RedType has quit IRC (Ping timeout: 260 seconds) [18:37] <joepie91> PurpleSym: people having issues with that, judging from the comments [18:38] <PurpleSym> That’s because the link refers to the IA item, not the Wayback Machine, as far as I understand. 
[18:41] <PurpleSym> Oh, there’s a robots.txt: https://web.archive.org/web/*/https://d396qusza40orc.cloudfront.net/virology/lecture_slides/W010_S001_virology.pdf [18:47] *** DiscantX has quit IRC (Ping timeout: 244 seconds) [18:48] *** RedType has joined #archiveteam-bs [18:49] <PurpleSym> And playback apparently does not work: https://web.archive.org/web/20160627062435/https://class.coursera.org/bigdata-004 ? [19:18] *** DiscantX has joined #archiveteam-bs [19:18] *** tomwsmf-a has quit IRC (Read error: Operation timed out) [19:26] *** vitzli has quit IRC (Leaving) [19:30] <godane> i'm uploading more korea news: https://archive.org/details/koreanet-1_changwon_newsplaza-20030101 [19:58] <arkiver> PurpleSym: all info is saved though, have a look at the source code. [19:59] <arkiver> It is possible though to make this running in the wayback machine, it works in webarchiveplayer [20:00] <PurpleSym> I can’t tell why it is not working in the Wayback Machine, arkiver. [20:01] <arkiver> it might because URLs like https://www.coursera.org/eventing/info?key=page.error&value=%7B%22status%22%3A404%2C%22url%22%3A%22https%3A%2F%2Fwayback-beta.archive.org%2Fweb%2F20160627062435%2Fhttps%3A%2F%2Fclass.coursera.org%2Fbigdata-004%22%7D&user=19902603&session=7842137219-1467217133015&client=spark&url=https%3A%2F%2Fwayback-beta.archive.org%2Fweb%2F20160627062435%2Fhttps%3A%2F%2Fclass.coursera.org%2Fbigdata-004&time=1468094448063&screen=%7B%22he [20:01] <arkiver> 3A1050%2C%22width%22%3A1680%7D [20:01] <arkiver> are still requested through www.coursera.org [20:01] <arkiver> instead of the wayback machine [20:01] <arkiver> along with other files that are not requested through the wayback machine [20:02] <arkiver> webarchiveplayer seems to be requesting everything through webarchiveplayer [20:02] <PurpleSym> I’ve seen those requests and thought they were sent because something failed before that. 
[20:03] <arkiver> might be possible [20:04] <arkiver> the wayback machine shouldn't request the .js from the original location though [20:05] <PurpleSym> Of course. There’s also a few requests to cloudfront.net. [20:05] <PurpleSym> And another one to https://web.archive.org/web/20160627062435/https://class.coursera.org/bigdata-004/data/api/reports/end_of_course_stories.json which does not seem to be in the WARCs we got. [20:05] <arkiver> Right [20:06] <arkiver> Strange we didn't get that, before starting the project I made sure we got everything [20:06] *** Sum has joined #archiveteam-bs [20:06] <arkiver> not having that URL saved shouldn't be much of a problem though, since the projects I tested did work with the webarchiveplayer [20:09] *** anjacks0n has joined #archiveteam-bs [20:21] *** Sum has quit IRC (Ping timeout: 370 seconds) [20:22] *** Sum has joined #archiveteam-bs [20:27] *** Start has quit IRC (Quit: Disconnected.) [20:30] *** Start has joined #archiveteam-bs [20:42] *** Sum has quit IRC (Ping timeout: 370 seconds) [20:48] *** Sum has joined #archiveteam-bs [20:57] *** DiscantX has quit IRC (Ping timeout: 244 seconds) [20:57] *** ArgyroNet has joined #archiveteam-bs [20:57] *** metal_cam has joined #archiveteam-bs [21:00] *** metalcamp has quit IRC (Ping timeout: 244 seconds) [21:05] *** anjacks0n has quit IRC (anjacks0n) [21:07] *** metal_cam has quit IRC (Ping timeout: 244 seconds) [21:24] *** Sum has quit IRC (Ping timeout: 370 seconds) [21:25] *** Sum has joined #archiveteam-bs [21:37] *** anjacks0n has joined #archiveteam-bs [21:47] *** godane has quit IRC (Leaving.) 
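The playback problem arkiver and PurpleSym describe above (the archived page still requesting some resources from www.coursera.org and cloudfront.net instead of through the Wayback Machine) can be illustrated with a small sketch. It only assumes the replay URL shape visible in the links quoted in the log, `https://web.archive.org/web/<timestamp>/<original URL>`. The helper names `to_replay_url` and `leaked_requests` and the shortened request list are hypothetical, and this is not how the Wayback Machine or webarchiveplayer rewrite pages internally.

```python
from urllib.parse import urlsplit

# Replay hosts seen in the URLs quoted in the log (assumed to be the
# only hosts that count as "requested through the Wayback Machine").
REPLAY_HOSTS = {"web.archive.org", "wayback-beta.archive.org"}


def to_replay_url(timestamp, original_url, replay_host="web.archive.org"):
    """Build a replay URL of the form /web/<timestamp>/<original URL>."""
    return f"https://{replay_host}/web/{timestamp}/{original_url}"


def leaked_requests(request_urls):
    """Return the requests that bypass replay and hit the live web."""
    return [u for u in request_urls
            if urlsplit(u).hostname not in REPLAY_HOSTS]


# Example with a shortened, illustrative list of requests a page might make.
requests_made = [
    to_replay_url("20160627062435", "https://class.coursera.org/bigdata-004"),
    "https://www.coursera.org/eventing/info?key=page.error",
]
for url in leaked_requests(requests_made):
    print("not replayed:", url)
```

Listing the requests that escape replay in this way is one quick check on whether a capture can be browsed offline, which is roughly the difference arkiver observed between the Wayback Machine and webarchiveplayer.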
[22:00] *** Sum has quit IRC (Ping timeout: 370 seconds) [22:01] *** Sum has joined #archiveteam-bs [22:16] *** ndiddy has joined #archiveteam-bs [22:18] *** DoomTay has quit IRC (Quit: Page closed) [22:22] *** Jeroen52 has joined #archiveteam-bs [22:27] <luckcolor> here's something interestic for parsung web pages: https://github.com/mozilla/fathom [22:28] <luckcolor> *inteeresting [22:28] <luckcolor> **interesting [22:28] <HCross> parsing as well [22:29] <luckcolor> OMG lol [22:29] * HCross hands luckcolor a dictionary [22:29] <ArgyroNet> that's amazic :) [22:29] * luckcolor definetely has to change this keyboard [22:29] <luckcolor> Sigh [22:31] <ArgyroNet> anyway, it doesn't matter, luckcolor [22:31] <ArgyroNet> thanks for sharung hue hue hue hue [22:31] <luckcolor> np [22:32] <luckcolor> funny thing that will always come to my mind during these times: those irc logs will definetely land on archive.org someday so that my spelling mistakes can be kept indefinetely [22:32] <luckcolor> oh boy [22:32] <luckcolor> :P [22:35] <ArgyroNet> hey, that's not the proper way to think [22:35] <ArgyroNet> the proper way is "I'll have a proof that I bettered my writing !" [22:35] <ArgyroNet> :p [22:39] <ArgyroNet> anyway, ++ [22:39] *** ArgyroNet has quit IRC (Quit: Once you know what cake you want to be true, instinct is a very useful device for enabling you to know that it is) [23:21] *** Sum has quit IRC (Ping timeout: 370 seconds) [23:22] *** Sum has joined #archiveteam-bs [23:22] *** godane has joined #archiveteam-bs [23:24] *** anjacks0n has quit IRC (anjacks0n) [23:29] *** Sum has quit IRC (Ping timeout: 370 seconds) [23:30] *** Sum has joined #archiveteam-bs [23:33] *** anjacks0n has joined #archiveteam-bs [23:47] *** anjacks0n has quit IRC (anjacks0n)