Time |
Nickname |
Message |
04:02
|
SketchCow |
Morning. |
06:09
|
SketchCow |
Archive.org Fund Drive began. |
06:09
|
SketchCow |
3-1 matching, etc |
13:33
|
arkiver |
I see some of you are working on this one here: |
13:33
|
arkiver |
http://archiveteam.org/index.php?title=Gamespy,_1up,_UGO,_IGN |
13:33
|
arkiver |
can I help? |
13:34
|
arkiver |
can I just select a domain, create a WARC and then add finished to it? |
13:51
|
arkiver |
Is it true that this one hasn't even started? |
13:51
|
arkiver |
http://archiveteam.org/index.php?title=Warhammer |
13:51
|
arkiver |
I've started to download it now |
13:51
|
arkiver |
I can do the website itself I think |
13:52
|
arkiver |
but we need a bit more power for the forums... |
14:48
|
joepie91 |
acquihire of eBuddy by Booking.com |
14:48
|
joepie91 |
(this time it's in the right channel) |
14:48
|
joepie91 |
service likely to disappear |
15:03
|
arkiver |
how are you sure booking.com is going to disappear? |
15:04
|
Cameron_D |
ebuddy will dissapear, not booking.com |
15:10
|
arkiver |
what's the website link of ebuddy? |
15:11
|
arkiver |
ah this one right? |
15:11
|
arkiver |
http://www.ebuddy.com/ |
15:11
|
arkiver |
will put a quick crawl on that webiste... ;) |
15:11
|
arkiver |
website* |
15:14
|
arkiver |
they also have this website: http://xms.me/ |
15:14
|
arkiver |
will do that one too |
15:14
|
arkiver |
and this |
15:14
|
arkiver |
http://www.ebuddyxms.com/ |
15:24
|
arkiver |
----- |
15:24
|
arkiver |
www.ebuddyxms.com «Finished: FINISHED» 1 launches |
15:24
|
arkiver |
161 downloaded + 0 queued = 161 total |
15:24
|
arkiver |
2.2 MiB crawled (2.2 MiB novel, 0 B dupByHash, 0 B notModified) |
15:24
|
arkiver |
----- |
15:24
|
arkiver |
xms.me «Finished: FINISHED» 1 launches |
15:25
|
arkiver |
157 downloaded + 0 queued = 157 total |
15:25
|
arkiver |
2.2 MiB crawled (2.2 MiB novel, 0 B dupByHash, 0 B notModified) |
15:25
|
arkiver |
----- |
17:42
|
arkiver |
looks like http://www.warhammeronline.com/ might be finished downloading tomorrow |
17:42
|
arkiver |
I need some help though on the forums!!! |
18:36
|
m1das |
arkiver: do you have a script i can use for the forums? |
18:36
|
xmc |
there's a couple of wget-warc-lua forum scripts on the archiveteam github |
18:36
|
m1das |
preferrible pipeline ;-) |
18:36
|
xmc |
not pipeline |
18:36
|
arkiver |
nope that's the problem |
18:36
|
arkiver |
the forum of warhammer is a subforum of that forum |
18:37
|
arkiver |
so I have no idea how to only download that subforum... |
18:37
|
BiggieJon |
do I need an account to grab forums ? |
18:37
|
m1das |
i have the storage if needed |
18:37
|
arkiver |
downloading the whole forum would be a too big job to quickly complete |
18:37
|
arkiver |
yes, I can crawl it too |
18:37
|
arkiver |
but I just don't know how to only crawl that subforum |
20:41
|
ersi |
SketchCow: Fuck yeah, donated |
21:30
|
ivan` |
http://emergentseas.tumblr.com/robots.txt there are probably a few million tumblrs that block robots |
21:32
|
balrog |
yahoo did that to a lot of tumblrs after the acquisition |
21:33
|
ivan` |
I could run through every tumblr I know |
21:33
|
ivan` |
then we can tell archivebot to do all of them ;) |
21:52
|
balrog |
http://ge.tt/blog/17 // http://ge.tt/press/gett-acquired-by-economic-accounting |
21:52
|
balrog |
fyi |
21:56
|
ivan` |
seems they have a lot of stuff https://encrypted.google.com/search?q=site%3Age.tt |
21:56
|
balrog |
people still use ge.tt |
22:01
|
joepie91 |
ge.tt... |
22:01
|
joepie91 |
that rings a bell.. |
22:01
|
balrog |
I'm not saying they're going away, just that they got acquired |
22:03
|
BlueMax |
it's a URL shortener, and that should trigger every AT member's "shit on this website" reflex |
22:03
|
BlueMax |
much like yahoo. |
22:03
|
balrog |
BlueMax: it's not a shortener, it's more like cloudapp |
22:04
|
balrog |
a file upload service |
22:04
|
balrog |
cloudapp is cl.ly |
22:04
|
BlueMax |
ah. short URL confuddled me |
22:25
|
ivan` |
does anyone have a linode in the NJ datacenter? I have a tumblr script for you to run |
22:25
|
ivan` |
my linode is there but its memory is clogged with wgets |
22:47
|
ivan` |
okay, checking 21M tumblr robots.txt's, should be done in a week |
22:49
|
ivan` |
there will be about 1.25M of these that block all robots |
23:40
|
nico_32 |
2,8G /mnt/archiveteam/wiki/tcrfnet-20131130-wikidump/images |
23:41
|
nico_32 |
backup of the cutting floor wiki in progress |
23:41
|
nico_32 |
running since 5 days :) |
23:57
|
ex-parrot |
nico_32: awesome, thanks |
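
Note on the 22:47 and 22:49 messages: the log does not show the script ivan` used to check tumblr robots.txt files. As a rough sketch only, assuming the robots.txt text has already been downloaded, the "blocks all robots" test could be done with Python's standard `urllib.robotparser`. The function name `blocks_all_robots` and the user agent `ExampleBot` are hypothetical and do not come from the log.

```python
from urllib.robotparser import RobotFileParser


def blocks_all_robots(robots_txt: str) -> bool:
    """Return True if this robots.txt text disallows the site root
    for a crawler that has no rule group of its own."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    # "ExampleBot" is a hypothetical user agent; with no matching group
    # it falls through to the "User-agent: *" rules, if any exist.
    return not parser.can_fetch("ExampleBot", "/")


if __name__ == "__main__":
    print(blocks_all_robots("User-agent: *\nDisallow: /\n"))
    print(blocks_all_robots("User-agent: *\nDisallow:\n"))
```

This only checks the root path against the wildcard group. A file that blocks one named crawler, or only some paths, would not count as blocking all robots under this sketch.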