[04:02] Morning.
[06:09] Archive.org Fund Drive began.
[06:09] 3-1 matching, etc.
[13:33] I see some of you are working on this one here:
[13:33] http://archiveteam.org/index.php?title=Gamespy,_1up,_UGO,_IGN
[13:33] can I help?
[13:34] can I just select a domain, create a WARC and then add finished to it?
[13:51] Is it true that this one hasn't even started?
[13:51] http://archiveteam.org/index.php?title=Warhammer
[13:51] I've started to download it now
[13:51] I can do the website itself I think
[13:52] but we need a bit more power for the forums...
[14:48] acquihire of eBuddy by Booking.com
[14:48] (this time it's in the right channel)
[14:48] service likely to disappear
[15:03] how are you sure booking.com is going to disappear?
[15:04] ebuddy will disappear, not booking.com
[15:10] what's the website link of ebuddy?
[15:11] ah this one right?
[15:11] http://www.ebuddy.com/
[15:11] will put a quick crawl on that website... ;)
[15:14] they also have this website: http://xms.me/
[15:14] will do that one too
[15:14] and this
[15:14] http://www.ebuddyxms.com/
[15:24] -----
[15:24] www.ebuddyxms.com «Finished: FINISHED» 1 launches
[15:24] 161 downloaded + 0 queued = 161 total
[15:24] 2.2 MiB crawled (2.2 MiB novel, 0 B dupByHash, 0 B notModified)
[15:24] -----
[15:24] xms.me «Finished: FINISHED» 1 launches
[15:25] 157 downloaded + 0 queued = 157 total
[15:25] 2.2 MiB crawled (2.2 MiB novel, 0 B dupByHash, 0 B notModified)
[15:25] -----
[17:42] looks like http://www.warhammeronline.com/ might be finished downloading tomorrow
[17:42] I need some help though on the forums!!!
[18:36] arkiver: do you have a script i can use for the forums?
[18:36] there's a couple of wget-warc-lua forum scripts on the archiveteam github
[18:36] preferably a pipeline ;-)
[18:36] not a pipeline
[18:36] nope, that's the problem
[18:36] the forum of warhammer is a subforum of that forum
[18:37] so I have no idea how to only download that subforum...
[18:37] do I need an account to grab forums?
[18:37] i have the storage if needed
[18:37] downloading the whole forum would be too big a job to complete quickly
[18:37] yes, I can crawl it too
[18:37] but I just don't know how to only crawl that subforum
[20:41] SketchCow: Fuck yeah, donated
[21:30] http://emergentseas.tumblr.com/robots.txt there are probably a few million tumblrs that block robots
[21:32] yahoo did that to a lot of tumblrs after the acquisition
[21:33] I could run through every tumblr I know
[21:33] then we can tell archivebot to do all of them ;)
[21:52] http://ge.tt/blog/17 // http://ge.tt/press/gett-acquired-by-economic-accounting
[21:52] fyi
[21:56] seems they have a lot of stuff https://encrypted.google.com/search?q=site%3Age.tt
[21:56] people still use ge.tt
[22:01] ge.tt...
[22:01] that rings a bell..
[22:01] I'm not saying they're going away, just that they got acquired
[22:03] it's a URL shortener, and that should trigger every AT member's "shit on this website" reflex
[22:03] much like yahoo.
[22:03] BlueMax: it's not a shortener, it's more like cloudapp
[22:04] a file upload service
[22:04] cloudapp is cl.ly
[22:04] ah. short URL confuddled me
[22:25] does anyone have a linode in the NJ datacenter? I have a tumblr script for you to run
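As a rough illustration of what a robots.txt-checking script like the one offered above might look like (this is not the actual script from the channel; the input file name, worker count, and timeout are invented for the sketch):

    import concurrent.futures
    import socket
    import urllib.robotparser

    socket.setdefaulttimeout(20)   # don't hang forever on dead blogs

    def blocks_all_robots(host):
        # True if http://<host>/robots.txt forbids user-agent '*' from fetching '/'.
        rp = urllib.robotparser.RobotFileParser()
        rp.set_url("http://%s/robots.txt" % host)
        try:
            rp.read()              # a 404 is treated as "allow everything"
        except Exception:
            return False           # unreachable or unparseable: can't tell, skip it
        return not rp.can_fetch("*", "http://%s/" % host)

    def main():
        # tumblr-hosts.txt is a hypothetical input file, one hostname per line
        with open("tumblr-hosts.txt") as f:
            hosts = [line.strip() for line in f if line.strip()]
        with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
            for host, blocked in zip(hosts, pool.map(blocks_all_robots, hosts)):
                if blocked:
                    print(host)

    if __name__ == "__main__":
        main()

Run against a list of tumblr hostnames, it prints only the blogs whose robots.txt disallows everything for every user agent.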
[22:25] my linode is there but its memory is clogged with wgets
[22:47] okay, checking 21M tumblr robots.txt files, should be done in a week
[22:49] there will be about 1.25M of these that block all robots
[23:40] 2,8G /mnt/archiveteam/wiki/tcrfnet-20131130-wikidump/images
[23:41] backup of The Cutting Room Floor wiki in progress
[23:41] running for 5 days now :)
[23:57] nico_32: awesome, thanks
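Returning to the earlier questions about creating a WARC for a chosen domain ([13:34]) and grabbing only the Warhammer subforum ([18:36]-[18:37]): a minimal sketch using wget's WARC support is below. The URL, forum id, and regex are placeholders rather than the real forum's structure, and whether --no-parent or --accept-regex can isolate a subforum depends entirely on how the forum builds its URLs; when they can't, the wget-warc-lua scripts mentioned at [18:36] are the more robust route.

    import subprocess

    # Placeholder: the real subforum URL / forum id would go here.
    SUBFORUM_URL = "http://forums.example.com/forumdisplay.php?f=123"

    subprocess.run([
        "wget",
        "--recursive", "--level=inf",
        "--no-parent",                           # only helps if the subforum lives under its own path
        "--accept-regex", "[?&]f=123|[?&]t=",    # crude filter: the subforum index plus its threads
        "--warc-file=warhammer-subforum",        # wget writes warhammer-subforum.warc.gz
        "--wait=1", "--random-wait",             # go easy on the server
        SUBFORUM_URL,
    ], check=True)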