#archiveteam 2015-03-30,Mon


Time Nickname Message
00:09 🔗 chfoo arkiver: ok, added.
01:00 🔗 kyan DFJustin, also there are groups, with group forums and group comments
01:03 🔗 Ravenloft has joined #archiveteam
01:15 🔗 xtr-201 has joined #archiveteam
01:36 🔗 GLaDOS has quit IRC (Ping timeout: 512 seconds)
01:56 🔗 Kazzy has quit IRC (Read error: Operation timed out)
01:56 🔗 Kazzy has joined #archiveteam
02:07 🔗 signius has quit IRC (Read error: Operation timed out)
02:16 🔗 Ymgve has quit IRC ()
02:18 🔗 SimpBrain has quit IRC (Read error: Connection reset by peer)
02:21 🔗 signius has joined #archiveteam
02:22 🔗 bmcginty Question. Does ArchiveTeam ever use backend APIs to grab content from failing sites, or is it restricted to typical webpages?
02:27 🔗 chfoo for hyves, all the ajax calls were manually made to save all the photos which is why every archived page has the word "deadbeef" in it
02:28 🔗 chfoo for viddler, reverse engineering was done and they got angry.
02:30 🔗 bmcginty chfoo: ah. thanks.
02:42 🔗 SketchCow http://teamarchive1.fnf.archive.org/DELETE-SCREENBIN/alanwood.net-inf-20140114-090506-5jxkt.warc.gz.png
02:42 🔗 SketchCow Delicious
02:43 🔗 SketchCow It's nice when it just works, although it does not often just work.
02:52 🔗 mistym has quit IRC (Remote host closed the connection)
02:53 🔗 mistym has joined #archiveteam
03:15 🔗 khaoohs has joined #archiveteam
03:30 🔗 GLaDOS has joined #archiveteam
03:35 🔗 JMC has joined #archiveteam
03:36 🔗 dashcloud has quit IRC (Read error: Operation timed out)
03:40 🔗 dashcloud has joined #archiveteam
03:42 🔗 GLaDOS has quit IRC (Ping timeout: 260 seconds)
03:45 🔗 GLaDOS has joined #archiveteam
04:08 🔗 aaaaaaaaa has quit IRC (Leaving)
04:26 🔗 wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES)
04:48 🔗 Mayonaise has quit IRC (Ping timeout: 512 seconds)
05:13 🔗 mistym has quit IRC (Remote host closed the connection)
05:14 🔗 mistym has joined #archiveteam
05:27 🔗 mistym has quit IRC (Remote host closed the connection)
05:28 🔗 wp494 has joined #archiveteam
05:54 🔗 Mayonaise has joined #archiveteam
05:55 🔗 primus104 has joined #archiveteam
05:56 🔗 mistym has joined #archiveteam
06:08 🔗 Mayonaise has quit IRC (Ping timeout: 512 seconds)
06:17 🔗 Mayonaise has joined #archiveteam
07:06 🔗 mistym has quit IRC (Read error: Operation timed out)
07:17 🔗 [Beta]_ has joined #archiveteam
07:18 🔗 [Beta] has quit IRC (Ping timeout: 240 seconds)
07:19 🔗 robink has quit IRC (Quit: No Ping reply in 180 seconds.)
07:20 🔗 robink has joined #archiveteam
07:39 🔗 [Beta]_ is now known as [Beta]
07:51 🔗 schbirid has joined #archiveteam
07:53 🔗 SimpBrain has joined #archiveteam
08:03 🔗 dashcloud has quit IRC (Read error: Operation timed out)
08:06 🔗 dashcloud has joined #archiveteam
09:10 🔗 primus104 has quit IRC (Leaving.)
09:23 🔗 BlueMaxim hey, I just learnt that the Neverwinter Vault (an IGN hosted site that hosted game addons for Neverwinter Nights 1 & 2) went down a few months ago and never came back, but some people not only got a full copy of the website but actually put up a mirror of it at ftp://neverwintervault.org/rolovault/ - it might be worth feeding it into the Archive somehow. don't have the hardware to do it myself so I'd just put it here to
09:23 🔗 BlueMaxim see if someone would like to do it
09:47 🔗 Start has quit IRC (Ping timeout: 740 seconds)
10:06 🔗 Start has joined #archiveteam
10:11 🔗 sky__ has joined #archiveteam
10:11 🔗 sky__ has left
10:44 🔗 habi has joined #archiveteam
10:49 🔗 svchfoo2 has quit IRC (Remote host closed the connection)
10:52 🔗 svchfoo2 has joined #archiveteam
10:53 🔗 svchfoo1 sets mode: +o svchfoo2
10:56 🔗 habi has left
11:04 🔗 BlueMaxim has quit IRC (Quit: Leaving)
11:42 🔗 Ymgve has joined #archiveteam
11:58 🔗 Wolfie has joined #archiveteam
12:42 🔗 primus104 has joined #archiveteam
12:43 🔗 sankin has joined #archiveteam
12:54 🔗 johtso is anyone able to upload to IA with the s3-like api at the moment?
12:56 🔗 johtso I'm worried it's my API keys that have been blocked, unless everyone is getting 503
13:03 🔗 brayden has quit IRC (Quit: Leaving)
13:11 🔗 johtso Don't seem to be able to upload through the web interface either.
13:30 🔗 Nemo_bis johtso: what sort of requests are you making?
13:31 🔗 johtso Nemo_bis, just normal upload requests, using the ia tool, and using the upload tool on the website
13:31 🔗 johtso unless there's some kind of site wide outage I assume my account got blocked when I was making a bunch of parallel uploads..
13:31 🔗 brayden has joined #archiveteam
13:31 🔗 Nemo_bis ...which is not what I call normal upload requests
13:32 🔗 Nemo_bis Yes, it's possible to get uploads refused if you do many at once. Even with a single upload thread doing many small uploads.
13:32 🔗 johtso right, I'm but now I'm just making single requests
13:33 🔗 Nemo_bis In my experience it happens when I have hundreds of green rows in https://archive.org/catalog.php?history=1&justme=1
13:33 🔗 johtso Nemo_bis, that's exactly my situation!
13:34 🔗 johtso that's a hard disk issue
13:34 🔗 johtso unfortunate that it's stops you from doing any more uploads until it's sorted..
13:35 🔗 primus104 has quit IRC (Leaving.)
13:35 🔗 johtso wow, what's happened to my typing today..
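Editor's sketch, not part of the log: the exchange above describes uploads being refused with 503 after many requests in a short time. One common client-side response is to retry with exponential backoff, sketched below. The `send` callable is hypothetical: it stands in for whatever performs a single upload request and returns the HTTP status code. It is not a function of the `ia` tool or of the archive.org API, and nothing in the log confirms that backing off clears the refusal any sooner than waiting for the queued tasks to finish.

```python
import time
from typing import Callable


def upload_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns something other than 503.

    send is a hypothetical callable that performs one upload request
    and returns the HTTP status code. The delay doubles after each
    refused attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
    Returns the last status seen.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    status = 503
    for attempt in range(max_attempts):
        status = send()
        if status != 503:
            return status
        # Do not sleep after the final attempt; just report the failure.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status
```

Passing `sleep` as a parameter keeps the function testable without real waiting. In practice a much larger `base_delay` (minutes rather than seconds) may suit the situation described in the log better.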
14:14 🔗 Melissa_ has joined #archiveteam
14:14 🔗 Melissa_ Hi guys
14:14 🔗 Melissa_ Can I request a website for archiving?
14:17 🔗 johtso Melissa_, sure! what's the site?
14:17 🔗 Melissa_ dbtropes.org
14:17 🔗 Melissa_ It's a Linked Data wrapper for TV Tropes
14:18 🔗 Melissa_ Which allows one to perform way better searches
14:18 🔗 Melissa_ I contacted the creator of it, asking a question, but he has not responded and ever since access to the website has been revoked. Just try accessing it now
14:19 🔗 Melissa_ I'm starting to think he didn't want it to be public
14:19 🔗 Melissa_ I do have another URL which can still be accessed, and maybe it's just a matter of time before he revokes access to that URL as well
14:19 🔗 Melissa_ I'm not sure of this, though. Maybe this issue is just temporary
14:20 🔗 Melissa_ You can read more about DBTropes here: https://web.archive.org/web/20140722081335/http://skipforward.opendfki.de/wiki/DBTropes
14:21 🔗 johtso Melissa_, if you give me the URL I can try pointing archivebot at it.
14:21 🔗 Melissa_ Yeah. There's just one problem: dbtropes isn't just the pages, but also a search engine, which is a back-end thing
14:21 🔗 Melissa_ Anything's better than nothing, though
14:23 🔗 toad1 has quit IRC (Read error: Operation timed out)
14:24 🔗 Melissa_ Actually, nevermind
14:24 🔗 Melissa_ Just found a zip of the whole database
14:30 🔗 johtso nice :)
14:30 🔗 johtso Melissa_, want me to archivebot the database dump?
14:39 🔗 mistym has joined #archiveteam
14:40 🔗 godane SketchCow: i'm uploading CD3WD dvds to your ftp
14:41 🔗 godane i figure it would be easier to upload to an ftp since it's very big and i can resume if my wifi acts up
14:41 🔗 SketchCow Great
14:47 🔗 mistym has quit IRC (Remote host closed the connection)
14:53 🔗 wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES)
14:54 🔗 balrog Melissa_: it works for me
14:54 🔗 balrog and they provide a database snapshot
14:54 🔗 balrog it's weird that you can't access it...
14:55 🔗 Melissa_ balrog yeah, looks like I've been worrying about nothing. The issue was client-side
14:57 🔗 Melissa_ Very strange but the issue is gone now that I've removed cookies, site prefs, etc.
14:57 🔗 Wolfie has quit IRC (Quit: Leaving.)
14:58 🔗 johtso how can you tell if archivebot has properly scraped something? where can you see the actual total data downloaded?
14:58 🔗 johtso would it be this bit at the end of the job? "sent 325 bytes received 30 bytes 236.67 bytes/sec"
14:58 🔗 johtso and if so, how would you "try again"?
15:12 🔗 Wolfie has joined #archiveteam
15:28 🔗 Melissa_ has quit IRC (Quit: Leaving)
15:36 🔗 Start rapidshare dies tomorrow, we need more people doing rapidshare discovery
15:36 🔗 Start #rapidscare
16:08 🔗 mistym has joined #archiveteam
16:08 🔗 mistym_ has joined #archiveteam
16:12 🔗 mistym has quit IRC (Client Quit)
16:13 🔗 Emcy has quit IRC (Read error: Connection reset by peer)
16:24 🔗 mistym has joined #archiveteam
16:26 🔗 mistym has quit IRC (Client Quit)
16:30 🔗 mistym has joined #archiveteam
16:32 🔗 mistym_ has quit IRC (Leaving...)
16:35 🔗 schbirid Start: ask again when the tracker is actually serving jobs :P
16:35 🔗 Start it was when i asked
16:36 🔗 schbirid :)
16:40 🔗 patricko- is now known as patrickod
16:43 🔗 john has quit IRC (Remote host closed the connection)
16:44 🔗 philpem has joined #archiveteam
16:48 🔗 philpem_ has joined #archiveteam
16:48 🔗 philpem_ has quit IRC (Remote host closed the connection)
16:49 🔗 yipdw johtso: search here -> http://archive.fart.website/archivebot/viewer/
16:49 🔗 yipdw job results are typically uploaded in 5 GB chunks; that index updates periodically
16:53 🔗 aaaaaaaaa has joined #archiveteam
17:07 🔗 patrickod is now known as patricko-
17:09 🔗 primus104 has joined #archiveteam
17:14 🔗 Wolfie Is there a good way to pull sites out of the wayback machine?
17:15 🔗 yipdw Wolfie: if the site was injected via ArchiveBot then the WARCs can be located via http://archive.fart.website/archivebot/viewer/
17:15 🔗 yipdw if it came from the Archive-It group then you may be able to ask them
17:15 🔗 yipdw for other cases no public access presently exists
17:17 🔗 Wolfie Well goddamnit.
17:22 🔗 johtso yipdw, do you know what that output I pasted from the logs refers to?
17:23 🔗 kyan I always assumed that was the json file
17:23 🔗 kyan I think those are around that big
17:24 🔗 johtso ah, that's not so useful :)
17:25 🔗 johtso is there any way to bypass the delay before being able to re-archive?
17:25 🔗 kyan !expire ident
17:25 🔗 johtso ah, nice
17:25 🔗 johtso does !abort do anything if the job is already done?
17:25 🔗 kyan not as far as I know
17:26 🔗 kyan but yeah I think the json file is the only thing that gets rsynced at the end of a job now, all the warcs I think get sent in the background while the job is running (or something like that?)
17:26 🔗 kyan I'm no expert
17:27 🔗 johtso any way to nuke a completed archive?
17:27 🔗 SketchCow https://archive.org/details/archiveteam_madden/v2
17:28 🔗 kyan maybe by nuking the IA data centers. Please don't do that...
17:31 🔗 yipdw johtso: it's the JSON file
17:31 🔗 johtso makes sense
17:31 🔗 yipdw RsyncUpload exists as a catch-all for materials not delegated to the uploader
17:32 🔗 yipdw further discussion of archivebot behavior in #archivebot
17:32 🔗 * johtso enters missile launch codes
17:32 🔗 johtso okay, I'll keep the discussion to #archivebot .. just that it gets kind of swamped by the bot output
17:46 🔗 SketchCow johtso, you talk wayyyy too much
17:46 🔗 SketchCow But I think you struck something there.
17:46 🔗 SketchCow Hence: #archivebot-bs
17:46 🔗 johtso SketchCow, hahaha
17:51 🔗 caber has quit IRC (Quit: Doei Doei!!!)
17:57 🔗 caber has joined #archiveteam
18:08 🔗 Stiletto this was posted on one of my forums, not sure if it's a repack of something already on the Archive or instead something that should be preserved in the Magazines collection: http://www.pixsoriginadventures.co.uk/PCZone/
18:08 🔗 Stiletto there's some overlap with https://archive.org/details/pczonemagazine?sort=-date but it has some issues that aren't on the archive...
18:16 🔗 SketchCow Does someone want to run comparisons or should I do it.
18:16 🔗 SketchCow I suppose I could do it quick (download all, compare)
19:14 🔗 schbirid gwern subpoenaed http://www.reddit.com/r/DarkNetMarkets/comments/30tudk/psa_5_reddit_accounts_subpoenaed_by_ice/
19:14 🔗 schbirid is http://www.gwern.net/ well archived?
19:21 🔗 Emcy has joined #archiveteam
19:29 🔗 mistym has quit IRC (Leaving)
19:43 🔗 mistym has joined #archiveteam
19:45 🔗 midas not sure, just to be sure archivebotted
19:48 🔗 SN4T14_ has joined #archiveteam
19:49 🔗 mistym has quit IRC (Remote host closed the connection)
19:53 🔗 mistym has joined #archiveteam
19:56 🔗 SN4T14__ has quit IRC (Ping timeout: 512 seconds)
20:03 🔗 Emcy_ has joined #archiveteam
20:07 🔗 Emcy has quit IRC (Ping timeout: 512 seconds)
20:26 🔗 mistym has quit IRC (Remote host closed the connection)
20:31 🔗 * schbirid archivebots midas' certain harddisk
20:31 🔗 midas LALALALA
20:39 🔗 mistym has joined #archiveteam
20:50 🔗 Emcy_ has quit IRC (Read error: Connection reset by peer)
20:52 🔗 signius has quit IRC (Read error: Operation timed out)
20:58 🔗 sankin has quit IRC (Leaving.)
21:01 🔗 mistym has quit IRC (Quit: Leaving)
21:04 🔗 Ravenloft has quit IRC (Ping timeout: 240 seconds)
21:06 🔗 signius has joined #archiveteam
21:18 🔗 SimpBrain has quit IRC (Read error: Connection reset by peer)
21:19 🔗 Emcy has joined #archiveteam
21:39 🔗 wp494 has joined #archiveteam
21:51 🔗 Stiletto has quit IRC (Ping timeout: 186 seconds)
21:51 🔗 BlueMaxim has joined #archiveteam
21:51 🔗 appledash has joined #archiveteam
21:52 🔗 appledash Hey, is there an easy way / a guide for getting the warrior image running under ESXi?
21:54 🔗 Emcy_ has joined #archiveteam
21:55 🔗 Emcy has quit IRC (Ping timeout: 512 seconds)
21:58 🔗 appledash The issue I mainly foresee is the fact that I need to assign a static IP
21:59 🔗 khaoohs_ has joined #archiveteam
21:59 🔗 khaoohs has quit IRC (Read error: Connection reset by peer)
22:11 🔗 mistym has joined #archiveteam
22:15 🔗 khaoohs_ has quit IRC (Read error: Connection reset by peer)
22:15 🔗 khaoohs has joined #archiveteam
22:16 🔗 dashcloud here you go: http://edvoncken.net/2014/08/archiveteam-warrior-on-esxi/
22:16 🔗 dashcloud I haven't tried it, but if it works, please add it to the wiki
22:29 🔗 philpem has quit IRC (Ping timeout: 260 seconds)
22:46 🔗 wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES)
22:50 🔗 NovaKing_ has quit IRC (Read error: Operation timed out)
22:51 🔗 NovaKing_ has joined #archiveteam
22:53 🔗 wp494 has joined #archiveteam
23:01 🔗 NovaKing_ has quit IRC (Read error: Operation timed out)
23:02 🔗 NovaKing_ has joined #archiveteam
23:11 🔗 dashcloud has quit IRC (Read error: Operation timed out)
23:22 🔗 aschmitz has quit IRC (Read error: Connection reset by peer)
23:25 🔗 dashcloud has joined #archiveteam
23:27 🔗 Wolfie has left
23:44 🔗 Stiletto has joined #archiveteam
23:52 🔗 BlueMaxim has quit IRC (Quit: Leaving)
23:52 🔗 toad has joined #archiveteam
