#archiveteam 2013-12-04,Wed


Time Nickname Message
04:02 SketchCow Morning.
06:09 SketchCow Archive.org Fund Drive began.
06:09 SketchCow 3-1 matching, etc
13:33 arkiver I see some of you are working on this one here:
13:33 arkiver http://archiveteam.org/index.php?title=Gamespy,_1up,_UGO,_IGN
13:33 arkiver can I help?
13:34 arkiver can I just select a domain, create a WARC and then add finished to it?
13:51 arkiver Is it true that this one hasn't even started?
13:51 arkiver http://archiveteam.org/index.php?title=Warhammer
13:51 arkiver I've started to download it now
13:51 arkiver I can do the website itself I think
13:52 arkiver but we need a bit more power for the forums...
14:48 joepie91 acquihire of eBuddy by Booking.com
14:48 joepie91 (this time it's in the right channel)
14:48 joepie91 service likely to disappear
15:03 arkiver how are you sure booking.com is going to disappear?
15:04 Cameron_D ebuddy will dissapear, not booking.com
15:10 arkiver what's the website link of ebuddy?
15:11 arkiver ah this one right?
15:11 arkiver http://www.ebuddy.com/
15:11 arkiver will put a quick crawl on that webiste... ;)
15:11 arkiver website*
15:14 arkiver they also have this website: http://xms.me/
15:14 arkiver will do that one too
15:14 arkiver and this
15:14 arkiver http://www.ebuddyxms.com/
15:24 arkiver -----
15:24 arkiver www.ebuddyxms.com «Finished: FINISHED» 1 launches
15:24 arkiver 161 downloaded + 0 queued = 161 total
15:24 arkiver 2.2 MiB crawled (2.2 MiB novel, 0 B dupByHash, 0 B notModified)
15:24 arkiver -----
15:24 arkiver xms.me «Finished: FINISHED» 1 launches
15:25 arkiver 157 downloaded + 0 queued = 157 total
15:25 arkiver 2.2 MiB crawled (2.2 MiB novel, 0 B dupByHash, 0 B notModified)
15:25 arkiver -----
17:42 arkiver looks like http://www.warhammeronline.com/ might be finished downloading tomorrow
17:42 arkiver I need some help though on the forums!!!
18:36 m1das arkiver: do you have a script i can use for the forums?
18:36 xmc there's a couple of wget-warc-lua forum scripts on the archiveteam github
18:36 m1das preferrible pipeline ;-)
18:36 xmc not pipeline
18:36 arkiver nope that's the problem
18:36 arkiver the forum of warhammer is a subforum of that forum
18:37 arkiver so I have no idea how to only download that subforum...
18:37 BiggieJon do I need an account to grab forums ?
18:37 m1das i have the storage if needed
18:37 arkiver downloading the whole forum would be a too big job to quickly complete
18:37 arkiver yes, I can crawl it too
18:37 arkiver but I just don't know how to only crawl that subforum
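
[Editor's note] One common way to keep a crawl inside a single subforum is to filter every discovered URL against the subforum's host and path prefix before queueing it. The sketch below is only an illustration of that idea: the host `forums.example.com` and the prefix `/forum/warhammer/` are hypothetical placeholders, not the real forum's layout, and the approach only works if thread URLs actually nest under the subforum's path (many forum packages do not do this, in which case the crawler has to follow links from the subforum index instead).

```python
from urllib.parse import urlsplit


def in_scope(url: str, host: str, path_prefix: str) -> bool:
    """Return True if url is on `host` and its path starts with `path_prefix`.

    `host` and `path_prefix` are supplied by the caller; the values used in
    the example below are hypothetical placeholders.
    """
    parts = urlsplit(url)
    # urlsplit lowercases the hostname, so compare against a lowercase host.
    return parts.hostname == host.lower() and parts.path.startswith(path_prefix)


# Hypothetical example: keep only links under one subforum.
candidates = [
    "http://forums.example.com/forum/warhammer/thread-1",
    "http://forums.example.com/forum/other-game/thread-9",
    "http://www.example.com/forum/warhammer/thread-1",
]
kept = [u for u in candidates if in_scope(u, "forums.example.com", "/forum/warhammer/")]
```

Keeping the trailing slash in the prefix avoids accidentally matching sibling paths that merely share the same leading characters.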
20:41 ersi SketchCow: Fuck yeah, donated
21:30 ivan` http://emergentseas.tumblr.com/robots.txt there are probably a few million tumblrs that block robots
21:32 balrog yahoo did that to a lot of tumblrs after the acquisition
21:33 ivan` I could run through every tumblr I know
21:33 ivan` then we can tell archivebot to do all of them ;)
21:52 balrog http://ge.tt/blog/17 // http://ge.tt/press/gett-acquired-by-economic-accounting
21:52 balrog fyi
21:56 ivan` seems they have a lot of stuff https://encrypted.google.com/search?q=site%3Age.tt
21:56 balrog people still use ge.tt
22:01 joepie91 ge.tt...
22:01 joepie91 that rings a bell..
22:01 balrog I'm not saying they're going away, just that they got acquired
22:03 BlueMax it's a URL shortener, and that should trigger every AT member's "shit on this website" reflex
22:03 BlueMax much like yahoo.
22:03 balrog BlueMax: it's not a shortener, it's more like cloudapp
22:04 balrog a file upload service
22:04 balrog cloudapp is cl.ly
22:04 BlueMax ah. short URL confuddled me
22:25 ivan` does anyone have a linode in the NJ datacenter? I have a tumblr script for you to run
22:25 ivan` my linode is there but its memory is clogged with wgets
22:47 ivan` okay, checking 21M tumblr robots.txt's, should be done in a week
22:49 ivan` there will be about 1.25M of these that block all robots
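
[Editor's note] The log does not show ivan`'s script. As a rough illustration of what "blocks all robots" can mean in practice, the sketch below uses Python's standard `urllib.robotparser` to test whether an already-downloaded robots.txt body forbids a generic crawler from fetching the site root. The function name `blocks_all_robots` and the user agent `ExampleBot` are assumptions made for this example, and fetching the 21M files is left out entirely.

```python
from urllib.robotparser import RobotFileParser


def blocks_all_robots(robots_txt: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt text disallows "/" for user_agent.

    `user_agent` is a hypothetical crawler name; a name that no robots.txt
    mentions explicitly falls through to the wildcard ("*") rules.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


# A robots.txt that shuts out every crawler.
blocked = blocks_all_robots("User-agent: *\nDisallow: /\n")

# An empty Disallow line allows everything.
open_site = blocks_all_robots("User-agent: *\nDisallow:\n")
```

Testing only the root path is a simplification: a site could allow "/" while disallowing everything beneath it, so a fuller check would probe more than one path.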
23:40 nico_32 2,8G /mnt/archiveteam/wiki/tcrfnet-20131130-wikidump/images
23:41 nico_32 backup of the cutting floor wiki in progress
23:41 nico_32 running since 5 days :)
23:57 ex-parrot nico_32: awesome, thanks
