[00:41] *** JesseW has joined #archiveteam-bs
[00:47] *** RichardG has quit IRC (Read error: Operation timed out)
[01:09] *** DoomTay has joined #archiveteam-bs
[01:38] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[02:05] *** Aranje has joined #archiveteam-bs
[02:10] *** Sum has joined #archiveteam-bs
[02:10] <Sum> well shit
[02:10] <Sum> http://postghost.com/Home/Shutdown/
[02:10] <JesseW> Already heard about it
[02:10] <Sum> this is BS
[02:10] <Sum> it was a very useful service
[02:11] <DoomTay> I didn't even know it existed until now
[02:12] <FalconK> neither
[02:12] <FalconK> BS indeed though
[02:12] <Sum> I think I may have mentioned it a couple days ago when I was following a case
[02:12] <Sum> the accountability section of their shutdown statement is so true
[02:13] <Sum> without a third-party archive site everyone relies on screenshots of deleted tweets
[02:13] <FalconK> there is this european right to be forgotten thing
[02:13] <JesseW> I wonder about using hashing as a way to work around this.
[02:13] <Sum> people with a clue will use archive.is to capture Google cache results but so many are missed
[02:13] <JesseW> Don't display the tweets -- display *hashes* of the tweets (and hashes of the account name)
[02:14] <xmc> eh
[02:14] <FalconK> same problem with it being hard to search as we have now
[02:14] <JesseW> that way only the ones that actually otherwise come to public attention are saved -- but they can be *verified*
[02:14] <FalconK> makes it very hard to do retrospective analysis
[02:15] <xmc> so
[02:15] <xmc> if you click the "embed tweet" button
[02:15] <JesseW> eh, it's a way to separate verification from publishing
[02:15] <xmc> it gives you the full text in html
[02:15] <xmc> so you can read it even after it's deleted
[02:15] <xmc> it's been this way for years
[02:15] <FalconK> there are loads of such archivings of tweets
[02:15] <xmc> y e a r s
[02:16] <FalconK> I noted a bunch of tweets Cathy Brennan made were archived in this way even after her account is gone, on pages noting the strange sayings she has said
[02:16] <FalconK> it's very effective
[02:17] <JesseW> it would be ... interesting ... to see twitter try to claim that "something with hash XXX was written by account named hash YYY" is in violation of their agreement.
[02:17] <xmc> yeah
[02:17] <xmc> JesseW: you're just trying to out-nerd them
[02:17] <FalconK> it would
[02:17] <xmc> this won't lead anywhere interesting
[02:17] <FalconK> but it is odd enough for them to yowl that nobody may keep a record of what they saw
[02:17] <FalconK> that is some RIAA-level shit right there
[02:17] <xmc> yep
[02:18] <JesseW> oh? Why wouldn't it lead to interesting law?
[02:18] <DoomTay> I mean, the POTUS's twitter expressly allows archiving tweets from there
[02:18] <FalconK> it would never get to court
[02:18] <FalconK> any lawyer worth their scotch would drag that through procedure to avoid that coming up in open court
[02:19] <xmc> a smart judge would say "you're trying to be clever but i know what you are doing and it's actually stupid so fuck off"
[02:19] <xmc> s/smart/sharp/
[02:19] <FalconK> most judges lack clue
[02:19] <FalconK> if it was big enough, though, perhaps the EFF might get interested and write a brief
[02:20] <xmc> tru
[02:20] <FalconK> that is their thing
[02:20] <FalconK> in fact have postghost contacted the EFF I wonder?
[02:20] <FalconK> they should
[02:20] <JesseW> eh, I'm too hungry right now to continue this argument. So ... conceded, have a nice day.
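
(Editor's sketch of JesseW's hashing idea above: publish only hashes of the tweet text and the account name, so that a copy surfacing elsewhere can be checked later. The function names, the choice of SHA-256, and the sample strings are all assumptions made for illustration; nothing in the channel specifies them.)

```python
import hashlib


def digest(text: str) -> str:
    """Return the SHA-256 hex digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def publish_record(account: str, tweet_text: str) -> dict:
    # Only the two hashes are stored or displayed; the plaintext is not published.
    return {"account_hash": digest(account), "tweet_hash": digest(tweet_text)}


def verify(record: dict, account: str, tweet_text: str) -> bool:
    # Anyone holding a claimed copy of the tweet can recompute and compare.
    return (
        record["account_hash"] == digest(account)
        and record["tweet_hash"] == digest(tweet_text)
    )


# Hypothetical account name and tweet text, for illustration only.
record = publish_record("example_account", "an example tweet")
print(verify(record, "example_account", "an example tweet"))
print(verify(record, "example_account", "an edited tweet"))
```

(The comparison only succeeds on a byte-for-byte identical copy, and unsalted hashes of short, guessable strings such as account names can probably be reversed by brute force, so this likely separates verification from publishing less cleanly than it first appears.)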
[02:24] <Sum> the thing is they mention another deleted tweets archive, Politwoops, hasn't been hit with a shutdown
[02:24] <Sum> they're picking and choosing which sites they want to claim violate their dev agreement
[02:24] <DoomTay> I wouldn't take that for granted though
[02:25] <xmc> politwoops was shut down and then reinstated
[02:25] <JesseW> (and they mention that, and explain why)
[02:26] <Sum> shouldn't there be a loophole to their agreement policy?
[02:28] <Sum> that is, what's stopping an archive site like archive.org (or someone else) archiving the deleted tweets *before* the main site officially deletes them
[02:29] <Sum> so the main site can say they did 'update' their public feeds to match the originals
[02:30] <yipdw> FalconK: probably sending the abort signal is fine.  another thing I never really figured out with seesaw was how to mark a job as failed
[02:30] <yipdw> the warrior system does this occasionally
[02:30] <yipdw> but we rarely ever need that in Warrior projects because the better solution is "requeue"
[02:31] <yipdw> people have told me to move archivebot away from seesaw and I don't think that's a bad idea, it's just something I can never get myself started on
[02:35] *** ravetcofx has quit IRC (Ping timeout: 506 seconds)
[02:36] <Aranje> random Q: do any of you know if a kindle paperwhite (perhaps the latest one) will actually display scanned images? Like if one scans a book into a pdf and each page is an image
[02:36] <Aranje> or for that usage should some other object be picked
[02:38] <Aranje> a friend is trying to figure out what device to take to a sunny sandy place with potentially zero internet :)
[02:38] <yipdw> swimsuit
[02:39] <Aranje> already packed, but wants to take scanned ebooks as it'll be a long stay
[02:39] <yipdw> ah
[02:39] <yipdw> http://www.dummies.com/how-to/content/how-to-read-pdf-documents-on-your-kindle-paperwhit.html says the kindle paperwhite is capable of displaying PDFs
[02:40] <yipdw> (I have no idea why that was my first search hit; maybe Google is telling me something)
[02:40] <Aranje> yeah I know they're supposed to be idiotproof, but it was my understanding that the previous paperwhites could not do what I'm asking
[02:41] <Aranje> but perhaps the one just released last month can
[02:41] <Aranje> I don't know, nobody says
[02:41] <Aranje> and dear lord reading random internet forums full of people with unrelated-to-the-question comments is not how I want to spend my friday
[02:41] <Aranje> (lol)
[02:44] <yipdw> stack overflow eh
[02:47] <Aranje> heh
[02:47] <Aranje> I learned this afternoon that kindle has userforums
[03:16] *** Sum has quit IRC (Ping timeout: 370 seconds)
[03:17] *** Sum has joined #archiveteam-bs
[03:29] *** Sum has quit IRC (Ping timeout: 370 seconds)
[03:30] *** Sum has joined #archiveteam-bs
[03:36] <ranma> i like Stack Overflow
[03:36] <ranma> or the series of sites
[03:36] <Frogging> same
[03:36] <ranma> i think StackExchange is the parent site?
[03:36] <Frogging> it's like Yahoo Answers but useful
[03:38] <ranma> and almost naziistically moderated
[03:40] <Frogging> I like that moderators are just high ranking community members that have made a lot of contributions 
[03:56] <Aranje> I guess we'll order it and find out shortly. fortunately there's time before they leave :)
[04:03] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:03] <pikhq> Aranje: That's definitely incorrect: as far as I can tell *every* Kindle can display PDFs.
[04:04] <pikhq> Though it's a bit less useful for some of the older ones (low res screen makes it not a nice experience)
[04:04] <Aranje> Yeah, I know the file format is supported and all that... but non-terrible experience w/image-pages is a different thing
[04:05] <Aranje> we're going to see if acrobat's ocr button can help too
[04:05] <Aranje> turns out their workplace has their printers set up where you can scan stuff in and it'll autopdf it into an email for you
[04:06] <Aranje> so... maybe run them through acrobat to see if the filesize can come down
[04:06] <Aranje> then put them on the paperwhite and see if it's usable
[04:10] *** Sk1d has joined #archiveteam-bs
[04:38] *** ravetcofx has joined #archiveteam-bs
[04:40] *** Sum has quit IRC (Ping timeout: 370 seconds)
[04:41] *** Sum has joined #archiveteam-bs
[05:00] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[05:04] *** Sum has quit IRC (Ping timeout: 370 seconds)
[05:08] *** Sk1d has joined #archiveteam-bs
[05:08] *** Sk1d has quit IRC (Connection closed)
[05:10] *** Sk1d has joined #archiveteam-bs
[05:17] *** DiscantX has joined #archiveteam-bs
[05:31] *** VADemon has joined #archiveteam-bs
[05:40] *** DiscantX has quit IRC (Ping timeout: 244 seconds)
[06:09] *** vtyl has joined #archiveteam-bs
[06:18] *** lytv has quit IRC (Read error: Operation timed out)
[06:38] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[07:04] *** Sum has joined #archiveteam-bs
[07:10] *** DoomTay has quit IRC (Quit: Page closed)
[07:22] *** Sum has quit IRC (Ping timeout: 370 seconds)
[07:23] *** Sum has joined #archiveteam-bs
[07:45] *** Sum has quit IRC (Ping timeout: 370 seconds)
[07:46] *** Sum has joined #archiveteam-bs
[08:00] *** Sum has quit IRC (Ping timeout: 370 seconds)
[08:01] *** Sum has joined #archiveteam-bs
[08:14] *** BlueMaxim has joined #archiveteam-bs
[08:40] *** Arcai has joined #archiveteam-bs
[09:24] *** Sum has quit IRC (Read error: Operation timed out)
[09:24] *** Sum has joined #archiveteam-bs
[09:32] *** Sum has quit IRC (Ping timeout: 370 seconds)
[09:33] *** Sum has joined #archiveteam-bs
[10:04] *** Sum has quit IRC (Ping timeout: 370 seconds)
[10:05] *** Sum has joined #archiveteam-bs
[10:06] *** Arcai has quit IRC (Ping timeout: 268 seconds)
[10:16] *** ArgyroNet has joined #archiveteam-bs
[10:16] <ArgyroNet> hello there
[10:17] <ArgyroNet> any native english-speaking people around ?
[10:17] <ArgyroNet> or just anyone with a good level
[10:19] <ArgyroNet> is "we are still deeply in need of your support" a correct phrase ?
[10:20] <dxrt> Yeah, that's fine.
[10:21] <ArgyroNet> thanks :)
[10:22] *** dxrt- sets mode: +o dxrt
[10:26] *** Sum has quit IRC (Ping timeout: 370 seconds)
[10:26] *** Sum has joined #archiveteam-bs
[10:29] *** VADemon has quit IRC (Quit: left4dead)
[10:31] *** RichardG has joined #archiveteam-bs
[10:35] *** ArgyroNet has quit IRC (Quit: thanks :))
[10:54] *** Sum has quit IRC (Ping timeout: 370 seconds)
[10:55] *** Sum has joined #archiveteam-bs
[11:06] *** fie has quit IRC (Ping timeout: 370 seconds)
[11:17] *** Sum has quit IRC (Ping timeout: 370 seconds)
[11:18] *** Sum has joined #archiveteam-bs
[11:33] *** zhongfu has quit IRC (Ping timeout: 260 seconds)
[11:38] *** zhongfu has joined #archiveteam-bs
[11:44] *** zhongfu has quit IRC (Remote host closed the connection)
[11:58] *** zhongfu has joined #archiveteam-bs
[12:03] *** zhongfu has quit IRC (Remote host closed the connection)
[12:15] *** zhongfu has joined #archiveteam-bs
[12:25] *** Sum has quit IRC (Ping timeout: 370 seconds)
[12:26] *** Sum has joined #archiveteam-bs
[12:45] *** RichardG_ has joined #archiveteam-bs
[12:45] *** RichardG has quit IRC (Read error: Connection reset by peer)
[12:54] *** zhongfu has quit IRC (Remote host closed the connection)
[12:55] *** Sum has quit IRC (Ping timeout: 370 seconds)
[12:56] *** zhongfu has joined #archiveteam-bs
[12:56] *** mutoso has quit IRC (Read error: Operation timed out)
[12:56] *** Sum has joined #archiveteam-bs
[12:57] *** mutoso has joined #archiveteam-bs
[13:09] *** vitzli has joined #archiveteam-bs
[13:21] *** BlueMaxim has quit IRC (Quit: Leaving)
[13:28] *** Sum has quit IRC (Ping timeout: 370 seconds)
[13:29] *** Sum has joined #archiveteam-bs
[13:37] *** Sum has quit IRC (Ping timeout: 370 seconds)
[13:37] *** Sum has joined #archiveteam-bs
[13:54] *** RichardG_ has quit IRC (Read error: Operation timed out)
[13:55] *** RichardG has joined #archiveteam-bs
[14:00] *** Sum has quit IRC (Ping timeout: 370 seconds)
[14:01] *** Sum has joined #archiveteam-bs
[14:21] *** Sum has quit IRC (Ping timeout: 370 seconds)
[14:22] *** Sum has joined #archiveteam-bs
[14:38] *** Sum has quit IRC (Ping timeout: 370 seconds)
[14:44] *** Sum has joined #archiveteam-bs
[14:56] *** metalcamp has joined #archiveteam-bs
[15:06] *** Sum has quit IRC (Ping timeout: 370 seconds)
[15:07] *** DoomTay has joined #archiveteam-bs
[15:07] *** Sum has joined #archiveteam-bs
[15:24] <DoomTay> !ao https://youtu.be/gvuqLylQOoE --youtube-dl
[15:25] <dashcloud> ranma: we got a portion of the files section before AOL broke all the links to the files- you can still see descriptions if you go to the libraries, but there's no way to reach the files anymore
[15:27] <dashcloud> This search covers the list pretty well: https://archive.org/search.php?query=aol+files
[15:36] *** Sum has quit IRC (Ping timeout: 370 seconds)
[15:37] *** Sum has joined #archiveteam-bs
[15:46] <DoomTay> So, how goes development of that examiner script?
[15:50] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[15:50] *** JesseW has joined #archiveteam-bs
[16:09] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[16:14] *** SN4T14 has quit IRC (Ping timeout: 370 seconds)
[16:38] *** Sum has quit IRC (Read error: Operation timed out)
[16:38] *** Sum has joined #archiveteam-bs
[16:52] *** RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue)
[16:53] *** RichardG has joined #archiveteam-bs
[16:58] *** mutoso has quit IRC (Read error: Operation timed out)
[17:04] *** DiscantX has joined #archiveteam-bs
[17:06] *** mutoso has joined #archiveteam-bs
[17:06] *** Sum has quit IRC (Ping timeout: 370 seconds)
[17:07] *** Sum has joined #archiveteam-bs
[17:13] <DoomTay> Oh wow
[17:13] <DoomTay> https://github.com/chrislgarry/Apollo-11
[17:15] *** Sum has quit IRC (Ping timeout: 370 seconds)
[17:16] *** Sum has joined #archiveteam-bs
[17:55] *** Sue_ has quit IRC (Read error: Operation timed out)
[18:06] *** Sum has quit IRC (Ping timeout: 370 seconds)
[18:08] *** Sue_ has joined #archiveteam-bs
[18:09] *** tomwsmf-a has joined #archiveteam-bs
[18:18] <bwn> doomtay: agc in js: http://svtsim.com/moonjs/agc.html
[18:19] <bwn> also very neat :)
[18:27] <MrRadar> Archive Team is currently number 1 on the Hacker News front page: https://news.ycombinator.com/item?id=12062116
[18:27] <MrRadar> About Coursera
[18:36] <PurpleSym> Someone should tell them that courses are browseable through the Wayback Machine, aren’t they?
[18:36] *** JesseW has joined #archiveteam-bs
[18:37] *** RedType has quit IRC (Ping timeout: 260 seconds)
[18:37] <joepie91> PurpleSym: people having issues with that, judging from the comments
[18:38] <PurpleSym> That’s because the link refers to the IA item, not the Wayback Machine, as far as I understand.
[18:41] <PurpleSym> Oh, there’s a robots.txt: https://web.archive.org/web/*/https://d396qusza40orc.cloudfront.net/virology/lecture_slides/W010_S001_virology.pdf
[18:47] *** DiscantX has quit IRC (Ping timeout: 244 seconds)
[18:48] *** RedType has joined #archiveteam-bs
[18:49] <PurpleSym> And playback apparently does not work: https://web.archive.org/web/20160627062435/https://class.coursera.org/bigdata-004 ?
[19:18] *** DiscantX has joined #archiveteam-bs
[19:18] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[19:26] *** vitzli has quit IRC (Leaving)
[19:30] <godane> i'm uploading more korea news: https://archive.org/details/koreanet-1_changwon_newsplaza-20030101
[19:58] <arkiver> PurpleSym: all info is saved though, have a look at the source code.
[19:59] <arkiver> It is possible though to make this run in the wayback machine, it works in webarchiveplayer
[20:00] <PurpleSym> I can’t tell why it is not working in the Wayback Machine, arkiver.
[20:01] <arkiver> it might be because URLs like https://www.coursera.org/eventing/info?key=page.error&value=%7B%22status%22%3A404%2C%22url%22%3A%22https%3A%2F%2Fwayback-beta.archive.org%2Fweb%2F20160627062435%2Fhttps%3A%2F%2Fclass.coursera.org%2Fbigdata-004%22%7D&user=19902603&session=7842137219-1467217133015&client=spark&url=https%3A%2F%2Fwayback-beta.archive.org%2Fweb%2F20160627062435%2Fhttps%3A%2F%2Fclass.coursera.org%2Fbigdata-004&time=1468094448063&screen=%7B%22he
[20:01] <arkiver> 3A1050%2C%22width%22%3A1680%7D
[20:01] <arkiver> are still requested through www.coursera.org
[20:01] <arkiver> instead of the wayback machine
[20:01] <arkiver> along with other files that are not requested through the wayback machine
[20:02] <arkiver> webarchiveplayer seems to be requesting everything through webarchiveplayer
[20:02] <PurpleSym> I’ve seen those requests and thought they were sent because something failed before that.
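
(Editor's sketch: the `value` parameter of the eventing URL arkiver pasted is percent-encoded JSON, and decoding it shows what the page was reporting. Only that one parameter is used here, copied as it appears in the log, because the full URL was cut off by the IRC line limit.)

```python
import json
from urllib.parse import unquote

# The `value` query parameter, copied verbatim from the logged URL.
value = (
    "%7B%22status%22%3A404%2C%22url%22%3A%22https%3A%2F%2Fwayback-beta.archive.org"
    "%2Fweb%2F20160627062435%2Fhttps%3A%2F%2Fclass.coursera.org%2Fbigdata-004%22%7D"
)

# Percent-decode, then parse the resulting JSON object.
event = json.loads(unquote(value))
print(event["status"])
print(event["url"])
```

(The decoded payload carries a 404 status for the replayed page URL, which seems consistent with PurpleSym's reading that these requests are error reports sent after something else failed. It does not say which request failed.)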
[20:03] <arkiver> might be possible
[20:04] <arkiver> the wayback machine shouldn't request the .js from the original location though
[20:05] <PurpleSym> Of course. There’s also a few requests to cloudfront.net.
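
(Editor's sketch of the check being discussed: given the requests a replayed page makes, which ones stay inside the replay host and which go to the original hosts? The helper names and the sample request list are hypothetical; the URL shape `https://web.archive.org/web/<timestamp>/<original URL>` is taken from the links pasted in the channel, and the second sample URL is shortened from the logged one.)

```python
from urllib.parse import urlsplit

# Replay hosts that appear in the log; treated here as "inside the archive".
ARCHIVE_HOSTS = {"web.archive.org", "wayback-beta.archive.org"}


def to_replay_url(timestamp: str, original_url: str) -> str:
    """Build a replay URL in the same shape as the links pasted above."""
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


def escapes_replay(request_url: str) -> bool:
    """True if the request goes to a host other than the replay hosts."""
    return urlsplit(request_url).hostname not in ARCHIVE_HOSTS


# Illustrative request list, not a real capture of the page's traffic.
requests_seen = [
    to_replay_url("20160627062435", "https://class.coursera.org/bigdata-004"),
    "https://www.coursera.org/eventing/info?key=page.error",
    "https://d396qusza40orc.cloudfront.net/virology/lecture_slides/W010_S001_virology.pdf",
]

leaked = [url for url in requests_seen if escapes_replay(url)]
for url in leaked:
    print("not replayed:", url)
```

(A list like this can be built from the browser's network panel; anything it flags is a request that replay did not rewrite, which is the behaviour arkiver contrasts with webarchiveplayer.)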
[20:05] <PurpleSym> And another one to https://web.archive.org/web/20160627062435/https://class.coursera.org/bigdata-004/data/api/reports/end_of_course_stories.json which does not seem to be in the WARCs we got.
[20:05] <arkiver> Right
[20:06] <arkiver> Strange we didn't get that, before starting the project I made sure we got everything
[20:06] *** Sum has joined #archiveteam-bs
[20:06] <arkiver> not having that URL saved shouldn't be much of a problem though, since the projects I tested did work with the webarchiveplayer
[20:09] *** anjacks0n has joined #archiveteam-bs
[20:21] *** Sum has quit IRC (Ping timeout: 370 seconds)
[20:22] *** Sum has joined #archiveteam-bs
[20:27] *** Start has quit IRC (Quit: Disconnected.)
[20:30] *** Start has joined #archiveteam-bs
[20:42] *** Sum has quit IRC (Ping timeout: 370 seconds)
[20:48] *** Sum has joined #archiveteam-bs
[20:57] *** DiscantX has quit IRC (Ping timeout: 244 seconds)
[20:57] *** ArgyroNet has joined #archiveteam-bs
[20:57] *** metal_cam has joined #archiveteam-bs
[21:00] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[21:05] *** anjacks0n has quit IRC (anjacks0n)
[21:07] *** metal_cam has quit IRC (Ping timeout: 244 seconds)
[21:24] *** Sum has quit IRC (Ping timeout: 370 seconds)
[21:25] *** Sum has joined #archiveteam-bs
[21:37] *** anjacks0n has joined #archiveteam-bs
[21:47] *** godane has quit IRC (Leaving.)
[22:00] *** Sum has quit IRC (Ping timeout: 370 seconds)
[22:01] *** Sum has joined #archiveteam-bs
[22:16] *** ndiddy has joined #archiveteam-bs
[22:18] *** DoomTay has quit IRC (Quit: Page closed)
[22:22] *** Jeroen52 has joined #archiveteam-bs
[22:27] <luckcolor> here's something interestic for parsung web pages: https://github.com/mozilla/fathom
[22:28] <luckcolor> *inteeresting
[22:28] <luckcolor> **interesting
[22:28] <HCross> parsing as well
[22:29] <luckcolor> OMG lol
[22:29] * HCross hands luckcolor a dictionary
[22:29] <ArgyroNet> that's amazic :)
[22:29] * luckcolor definetely has to change this keyboard
[22:29] <luckcolor> Sigh
[22:31] <ArgyroNet> anyway, it doesn't matter, luckcolor
[22:31] <ArgyroNet> thanks for sharung hue hue hue hue
[22:31] <luckcolor> np
[22:32] <luckcolor> funny thing that will always come to my mind during these times: those irc logs will definetely land on archive.org someday so that my spelling mistakes can be kept indefinetely
[22:32] <luckcolor> oh boy
[22:32] <luckcolor> :P
[22:35] <ArgyroNet> hey, that's not the proper way to think
[22:35] <ArgyroNet> the proper way is "I'll have a proof that I bettered my writing !"
[22:35] <ArgyroNet> :p
[22:39] <ArgyroNet> anyway, ++
[22:39] *** ArgyroNet has quit IRC (Quit: Once you know what cake you want to be true, instinct is a very useful device for enabling you to know that it is)
[23:21] *** Sum has quit IRC (Ping timeout: 370 seconds)
[23:22] *** Sum has joined #archiveteam-bs
[23:22] *** godane has joined #archiveteam-bs
[23:24] *** anjacks0n has quit IRC (anjacks0n)
[23:29] *** Sum has quit IRC (Ping timeout: 370 seconds)
[23:30] *** Sum has joined #archiveteam-bs
[23:33] *** anjacks0n has joined #archiveteam-bs
[23:47] *** anjacks0n has quit IRC (anjacks0n)