[00:13] Oh yeah, and fuck archive.org. Only losers add to that service. That service should be under my command. I'd love for your leader, Jason Scott, to get tortured in many pornographic ways. Or "rape torture". I'd take the front seat to watching. DFJustin, Ivan, and XMC will also have terrible genatalia torture. [01:39] hey I finally rate [05:03] well that seemed like a charming guy [05:03] I must surely have some good points [05:03] *he [05:03] damnit.... [07:10] SketchCow: point me where metadata is needdd... [09:20] I am getting "bash: ./get-wget-lua.sh: Permission denied" when i try an clone the halo-grab [09:23] signius: try chmod +x get-wget-lua.sh [09:24] then try again [09:24] (happens sometimes, it's not just you) [09:25] antomatic, lol cheers, i was just changing the permissions manually as you said that :) [09:27] sigh ok no items available for qwiki or halo [11:52] Smiley: you'll want to email /msg SketchCow with your email address so he can give you access to the arcade collection [12:20] i'll write something to archive mapillary.com in winter, their "2048px" images would be around 2TB from a quick glance. [12:36] just 2TB? [13:22] midas: ~5 million photos of ~500 kilobytes each [14:15] damn [14:58] Did I miss why archiving TwitPic has stopped? They will shut down the 25th right? [14:59] twitpic is completed if im not mistaken [15:00] Oh cool [15:00] Expected more than 10 million pictures to be honest [15:01] aren't they actively blocking us? [15:06] I've been running my crawlers on high concurrency and haven't seen any blocking, balrog [15:06] Only some 503 errors [15:06] Compresse: they were earlier [15:06] Yes, I did join late. [15:07] But if I look at they leaderboard now, nothing is happening [15:07] Although it says 0 to go so I guess @balrog is right [15:07] interesting, apparently i got banned from twitpic.com again.. was unbanned for a while :D [15:55] Twitpic blocked us [15:55] We got a set of it. 
[15:56] Hundreds of millions will disappear, hundreds of millions gotten.
[15:56] I am speaking at an archiving conference on Sunday.
[15:56] 90% of my talk will be twitpic
[16:14] i hope there will it be variations of the words ('twit','pic','shit','twat','pig','tit','quit')
[16:20] Do more IPs help?
[16:24] SketchCow: there should be a 'jason gets mad at people on stage' page on the wiki
[16:25] SketchCow: i hope there is a video recording of that talk
[17:42] So, today's day is me basically pushing in descriptions for 1,130 collections on Internet Archive
[17:49] uploaded: https://archive.org/details/aol-file-protocol-4400-3501-to-3600
[18:05] a lot of text files in this batch (and in previous batches)
[18:08] would be nice to see a textfiles.com update someday
[19:46] [21:44] [19:56:17] 90% of my talk will be twitpic
[19:46] this means "90% of my talk will be verbally violating Twitpic and anybody associated with the decision to block us", yes?
[19:47] also, midas: that's the "public speaking" page, I think :)
[20:02] project code is out of date & needs to be upgraded
[20:02] is there a simple way to update the code other than a git clone & recompiling ?
[20:02] git pull
[20:03] any chance of badmouthing imageshack as well
[20:05] I see there was some lack of information about twitpic. Let me summarize what I know.
[20:06] Some days ago twitpic hid the pictures, so now we're getting only the HTML with some of the comments.
[20:06] But that's far not done. We've scraped about ~40% of them, there are ~12 million items left (12m*36 pages), they are loaded in ~1.5 million batches, because of limited memory of the tracker. (We won't finish till 25th October.)
[20:06] Actually it was a memory issue and that the admin was away, why it was stopped for several hours today.
[20:06] Regarding the actual pictures, ~500 million was downloaded by Kenshin some weeks ago. There are about ~300 millions more, which we have no access to.
[20:06] IRC channel #quitpic.
[20:06] End of report
[20:07] bzc6p_: you missed the part where Twitpic are dicks
[20:07] :)
[20:07] I thought everyone knows that :)
[20:07] I seriously propose we change the wiki password to 'noaheverettisanobshiner'
[20:08] or possibly knobshiner, for accuracy
[20:08] yahoosucks is more punchy
[20:08] everettsucks
[20:08] ok i just ran "su -c "cd /home/archiveteam; git clone https://github.com/ArchiveTeam/twitpic-grab.git; cd twitpic-grab; ./get-wget-lua.sh" archiveteam" & its still telling me the project code is out of date when i run it
[20:09] sunshineanddollars
[20:09] that sounds like a misconfigured tracker?
[20:10] signius: paste output of 'pwd' here
[20:10] So, for newcomers: export sucks; Noah Everett is radio silent, except when misleading with shutdown or won't-shut-down notices; several people offered him to pay the bandwidth cost to backup everything but Noah disappeared
[20:11] Of course we'll all have to change our tune when Noah's picture is on the front page of CNN, handing over the hard discs to the library of congress or something.
[20:12] antomatic: I am okay with eating my words if that is the tradeoff
[20:12] signius: you here?
[20:12] joepie91: I think we can be confident that this will never happen, sadly
[20:13] Kazzy, yeah sorry just got multiple screen im switching between
[20:13] :(
[20:13] Well, I wouldn't be surprised on anything after this all... but paranoia is one of our basic traits and this time we have a reson for that
[20:13] paste output of 'pwd' here
[20:14] /home/archiveteam
[20:14] pwd
[20:14] cd twitpic-grab
[20:14] run-pipeline etc
[20:16] hahaha, blocked by Ello
[20:19] Ha ha, he just declared war on me.
[20:19] SketchCow: ?
[20:19] Kazzy, http://fpaste.org/
[20:19] Ello guy.
[20:19] Why would they block you after you were so nice to them? ):
[20:19] I think we consider archiving them.
[20:19] er, :)
[20:20] Started a project channel: #oodbye
[20:20] SketchCow: something publicly readable?
[20:20] I have a spare bucket of popcorn on my desk
[20:20] His twitter is #cacheflowe
[20:21] But he's blocked me. I can't retweet him anymore, or anything.
[20:21] awww
[20:22] cacheflowe: @textfiles If you spoke to me like a respectful adult, instead of starting the conversation calling us a "Roach Motel" it would be different
[20:23] Tattoo on chest
[20:23] Haha!
[20:24] It's really late in the day to be shocked by my tactics in here.
[20:24] (For the record)
[20:25] https://twitter.com/textfiles/status/525377284958855170 is my tweet of the day
[20:25] Devil's advocate - and purely out of interest (I'm not taking either position) - is he right?
[20:26] is he right about what?
[20:27] SketchCow: from channel #quitpic
[20:27] i wonder who's behind https://twitter.com/TwitPicSupport/with_replies
[20:27] if someone could figure that out maybe there would be way negotiate
[20:27] I saw
[20:27] Ok, sorry then. Just wanted to make sure.
[20:27] Is he right in being bruised that the conversation apparently started with disrespect which made him discinclined to co-operate with something which he might otherwise have been receptive to?
[20:28] I am sometimes distracted.
[20:28] Devil's advocate.
[20:28] * antomatic dons fireproof suit.
[20:28] You know what I hate?
[20:28] Advocatingfor the Devil
[20:28] He's the fuckin' Devil for a reason
[20:28] antomatic: L o L
[20:28] Did he run over your dog or something?
[20:28] worse
[20:28] Oh yeah, I always forget this about you.
[20:28] my cat
[20:28] If someone gets a bit butthurt over being called on destroying data, so be it
[20:29] Here is what goes on.
[20:29] I always confuse you with arkiver.
[20:29] As long as someone gets some saving going on, fuck it all
[20:29] Similar names, sleep patters.
[20:29] I like arkiver.
[20:29] I think I'm offended. :)
[20:29] now that you mention it, i also confuse antomatic and arkiver all the time
[20:29] im not sure if he really blocked you, twitter is fucking broken
[20:30] ok delete the /home/archiveteam/twitpic-grab directory & re cloned it and its working now so fuck knows
[20:30] So I always forgive arkiver that, for some reason, he occasionally looks left and right as Archiveteam does another bombing raid and goes "uh, guys? Aren't we... aren't we being a little rough here?"
[20:30] yeah twitter is down for me
[20:30] I have been here at least twice as long as arkiver! :)
[20:30] Arkiver is better to deal with.
[20:30] #######bbbbbbbssssssssss
[20:30] QUICK GET THE DEFIBRILLATOR FOR SCHBIRID
[20:30] I'm genuinely surprised to even be on anyone's radar, though. And as I say, no judgement, I was just interested. No biggie.
[20:30] Also he is right
[20:32] signius: "git clone" must be run from outside, "git pull" from inside the directory
[20:33] bzc6p, yeah i was inside when trying to do a git pull but it was bitching about the branch
[20:33] i got it sorted now though
[20:33] However, for me those commands are a bit complicated. I know those are in the guide, but I don't create a dedicated user
[20:36] The commands are a bit complicated because that's why we made the vm.
[20:39] ok "git pull && ./get-wget-lau.sh" worked fine on the other box so something got its panties in a bunch on the first box its all running fine now, but thanks for the help
[20:41] ello got $5.5mil, man we could archive the internet with that.
[20:45] I love his "we haven't had time"
[20:46] No, I'm just watching him shove his own foot in his mouth and every time someone says something he points at it and says "mmmmggjgjggmmm foootph"
[20:46] ill wait for the heartattack when he finds the bandwidth bill
[20:47] did noaheverett ever responded to your tweets?
[20:47] http://i.imgur.com/CrsoJbL.gifv
[20:47] FOS currently
[20:48] mmm internets
[21:20] SketchCow: you are linking teh evils
[21:20] :)
[21:22] [22:29] now that you mention it, i also confuse antomatic and arkiver all the time
[21:22] wat
[21:22] * joepie91 is not sure what the similarities, and only confuses arkiver with arkhive for tabcomplete reasons
[21:24] I wonder if I should change my name
[21:24] arkomatic?
[21:24] not_arkiver
[21:24] something with a Z, perhaps. Or an X. Xes are cool.
[21:24] change it to arkivermatic
[21:24] :)
[21:24] I think arkomatic would suffice
[21:25] Kazzy: haha, implying efnet would allow such a horrendously long nickname
[21:25] :)
[21:25] i don't even understand why there's a limit of 9 chars, damn
[21:26] because EFNet is Oldnet
[21:26] ~arkiver
[21:27] aaaarkiver
[21:33] woop woop woop off-topic siren
[21:59] So Archive.org will take shovelware/shareware CDs?
[22:00] yes
[22:01] Even if they are still copyrighted
[22:01] Though, I guess some of these compilers may be out of business
[22:01] it'll be fine
[22:01] I noticed there was a full copy of Corel DRAW 1 in one of your archives
[22:01] How does Archive.org see "abandonware" like that
[22:02] it'll be fine
[22:02] I mean it's technically warez, isn't it?
[22:02] TFGBD_: the right holders can file DMCA complaints
[22:02] That sucks
[22:02] Just like YouTube
[22:02] I guess it's best to keep that stuff on the DL
[22:02] i think archive.org will safe-keep the stuff for when the copyright expires
[22:03] That's good too
[22:03] yes. ia does not delete things. i would like to close this topic.
[22:03] lets hope they survive that long
[22:03] we have an faq, isn't this covered?
[22:03] http://archiveteam.org/index.php?title=FAQ
[22:04] TFGBD_: just upload it, don't go shouting "GET YOUR WAREZ HERE!!1!" off the rooftops, and all should be fine - at worst it'll be made publicly unavailable and a darked copy continues to exist
[22:12] DID I MISS ANYTHING
[22:15] free warez. all very hush-hush.
[22:17] Taking Corel Draw 1 from my cold, dead, ink-stained fingers
[22:20] like zero, zero, zero-zero-zero-day software.
[22:20] possibly with a 1 on the front.
[22:29] lol
[22:29] jason check privmsg
[22:30] come out 2nite
[23:20] http://archive.org/editxml/geoworldmagazine
[23:22] http://www.theglobeandmail.com/technology/digital-culture/the-race-to-archive-twitpic-before-800-million-pictures-vanish/article21199755/
[23:26] Archive.org barely seems to screen the stuff.
[23:27] Like, noticed a bunch of fake submissions of "Windows Media Player 10" that were likely just virus laden links to some website
[23:28] IA does not inspect all uploads
[23:33] they must have tons of bandwidth
[23:34] they have several 10gbit links