[00:09] arkiver: ok, added.
[01:00] DFJustin, also there are groups, with group forums and group comments
[01:03] *** Ravenloft has joined #archiveteam
[01:15] *** xtr-201 has joined #archiveteam
[01:36] *** GLaDOS has quit IRC (Ping timeout: 512 seconds)
[01:56] *** Kazzy has quit IRC (Read error: Operation timed out)
[01:56] *** Kazzy has joined #archiveteam
[02:07] *** signius has quit IRC (Read error: Operation timed out)
[02:16] *** Ymgve has quit IRC ()
[02:18] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[02:21] *** signius has joined #archiveteam
[02:22] Question. Does ArchiveTeam ever use backend apis to grab content from failing sites, or is it restricted to typical webpages?
[02:27] for hyves, all the ajax calls were manually made to save all the photos which is why every archived page has the word "deadbeef" in it
[02:28] for viddler, reverse engineering was done and they got angry.
[02:30] chfoo: ah. thanks.
[02:42] http://teamarchive1.fnf.archive.org/DELETE-SCREENBIN/alanwood.net-inf-20140114-090506-5jxkt.warc.gz.png
[02:42] Delicious
[02:43] It's nice when it just works, although it does not often just work.
[02:52] *** mistym has quit IRC (Remote host closed the connection)
[02:53] *** mistym has joined #archiveteam
[03:15] *** khaoohs has joined #archiveteam
[03:30] *** GLaDOS has joined #archiveteam
[03:35] *** JMC has joined #archiveteam
[03:36] *** dashcloud has quit IRC (Read error: Operation timed out)
[03:40] *** dashcloud has joined #archiveteam
[03:42] *** GLaDOS has quit IRC (Ping timeout: 260 seconds)
[03:45] *** GLaDOS has joined #archiveteam
[04:08] *** aaaaaaaaa has quit IRC (Leaving)
[04:26] *** wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES)
[04:48] *** Mayonaise has quit IRC (Ping timeout: 512 seconds)
[05:13] *** mistym has quit IRC (Remote host closed the connection)
[05:14] *** mistym has joined #archiveteam
[05:27] *** mistym has quit IRC (Remote host closed the connection)
[05:28] *** wp494 has joined #archiveteam
[05:54] *** Mayonaise has joined #archiveteam
[05:55] *** primus104 has joined #archiveteam
[05:56] *** mistym has joined #archiveteam
[06:08] *** Mayonaise has quit IRC (Ping timeout: 512 seconds)
[06:17] *** Mayonaise has joined #archiveteam
[07:06] *** mistym has quit IRC (Read error: Operation timed out)
[07:17] *** [Beta]_ has joined #archiveteam
[07:18] *** [Beta] has quit IRC (Ping timeout: 240 seconds)
[07:19] *** robink has quit IRC (Quit: No Ping reply in 180 seconds.)
[07:20] *** robink has joined #archiveteam
[07:39] *** [Beta]_ is now known as [Beta]
[07:51] *** schbirid has joined #archiveteam
[07:53] *** SimpBrain has joined #archiveteam
[08:03] *** dashcloud has quit IRC (Read error: Operation timed out)
[08:06] *** dashcloud has joined #archiveteam
[09:10] *** primus104 has quit IRC (Leaving.)
[09:23] hey, I just learnt that the Neverwinter Vault (an IGN hosted site that hosted game addons for Neverwinter Nights 1 & 2) went down a few months ago and never came back, but some people not only got a full copy of the website but actually put up a mirror of it at ftp://neverwintervault.org/rolovault/ - it might be worth feeding it into the Archive somehow. don't have the hardware to do it myself so I'd just put it here to
[09:23] see if someone would like to do it
[09:47] *** Start has quit IRC (Ping timeout: 740 seconds)
[10:06] *** Start has joined #archiveteam
[10:11] *** sky__ has joined #archiveteam
[10:11] *** sky__ has left
[10:44] *** habi has joined #archiveteam
[10:49] *** svchfoo2 has quit IRC (Remote host closed the connection)
[10:52] *** svchfoo2 has joined #archiveteam
[10:53] *** svchfoo1 sets mode: +o svchfoo2
[10:56] *** habi has left
[11:04] *** BlueMaxim has quit IRC (Quit: Leaving)
[11:42] *** Ymgve has joined #archiveteam
[11:58] *** Wolfie has joined #archiveteam
[12:42] *** primus104 has joined #archiveteam
[12:43] *** sankin has joined #archiveteam
[12:54] is anyone able to upload to IA with the s3-like api at the moment?
[12:56] I worried it's my api keys that have been blocked, unless everyone is getting 503
[13:03] *** brayden has quit IRC (Quit: Leaving)
[13:11] Don't seem to be able to upload through the web interface either.
[13:30] johtso: what sort of requests are you making?
[13:31] Nemo_bis, just normal upload requests, using the ia tool, and using the upload tool on the website
[13:31] unless there's some kind of site wide outage I assume my account got blocked when a was making a bunch of parallel uploads..
[13:31] *** brayden has joined #archiveteam
[13:31] ...which is not what I call normal upload requests
[13:32] Yes, it's possible to get uploads refused if you do many at once. Even with a single upload thread doing many small uploads.
[13:32] right, I'm but now I'm just making single requests
[13:33] In my experience it happens when I have hundreds of green rows in https://archive.org/catalog.php?history=1&justme=1
[13:33] Nemo_bis, that's exactly my situation!
[13:34] that's a hard disk issue
[13:34] unfortunate that it's stops you from doing any more uploads until it's sorted..
[13:35] *** primus104 has quit IRC (Leaving.)
[13:35] wow, what's happened to my typing today..
[14:14] *** Melissa_ has joined #archiveteam
[14:14] Hi guys
[14:14] Can I request a website for archiving?
[14:17] Melissa_, sure! what's the site?
[14:17] dbtropes.org
[14:17] It's a Linked Data wrapper for TV Tropes
[14:18] Which allows one to perform way better searches
[14:18] I contacted the creator of it, asking a question, but he has not responded and ever since access to the website has been revoked. Just try accessing it now
[14:19] I'm starting to think he didn't want it to be public
[14:19] I do have another URL which can still be accessed, and maybe it's just a matter of time before he revokes access to that URL as well
[14:19] I'm not sure of this, though. Maybe this issue is just temporary
[14:20] You can read more about dbtopres here: https://web.archive.org/web/20140722081335/http://skipforward.opendfki.de/wiki/DBTropes
[14:21] Melissa_, if you give me the URL I can try pointing archivebot at it.
[14:21] Yeah. There's just one problem: dbtropes isn't just the pages, but also a search engine, which is a back-end thing
[14:21] Anything's better than nothing, though
[14:23] *** toad1 has quit IRC (Read error: Operation timed out)
[14:24] Actually, nevermind
[14:24] Just found a zip of the whole database
[14:30] nice :)
[14:30] Melissa_, want me to archivebot the database dump?
[14:39] *** mistym has joined #archiveteam
[14:40] SketchCow: i'm uploading CD3WD dvds to your ftp
[14:41] i figure it would be easier to upload to a ftp since its very big and i can resume if my wifi acts up
[14:41] Great
[14:47] *** mistym has quit IRC (Remote host closed the connection)
[14:53] *** wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES)
[14:54] Melissa_: it works for me
[14:54] and they provide a database snapshot
[14:54] it's weird that you can't access it...
[14:55] balrog yeah, looks like I've been worrying about nothing. The issue was client-sided
[14:57] Very strange but the issue is gone now that I've removed cookies, site prefs, etc.
[14:57] *** Wolfie has quit IRC (Quit: Leaving.)
[14:58] how can you tell if archivebot has properly scraped something? where can you see the actual total data downloaded?
[14:58] would it be this bit at the end of the job? "sent 325 bytes received 30 bytes 236.67 bytes/sec"
[14:58] and if so, how would you "try again"?
[15:12] *** Wolfie has joined #archiveteam
[15:28] *** Melissa_ has quit IRC (Quit: Leaving)
[15:36] rapidshare dies tomorrow, we need more people doing rapidshare discovery
[15:36] #rapidscare
[16:08] *** mistym has joined #archiveteam
[16:08] *** mistym_ has joined #archiveteam
[16:12] *** mistym has quit IRC (Client Quit)
[16:13] *** Emcy has quit IRC (Read error: Connection reset by peer)
[16:24] *** mistym has joined #archiveteam
[16:26] *** mistym has quit IRC (Client Quit)
[16:30] *** mistym has joined #archiveteam
[16:32] *** mistym_ has quit IRC (Leaving...)
[16:35] Start: ask again when the tracker is actually serving jobs :P
[16:35] it was when i asked
[16:36] :)
[16:40] *** patricko- is now known as patrickod
[16:43] *** john has quit IRC (Remote host closed the connection)
[16:44] *** philpem has joined #archiveteam
[16:48] *** philpem_ has joined #archiveteam
[16:48] *** philpem_ has quit IRC (Remote host closed the connection)
[16:49] johtso: search here -> http://archive.fart.website/archivebot/viewer/
[16:49] job results are typically uploaded in 5 GB chunks; that index updates periodically
[16:53] *** aaaaaaaaa has joined #archiveteam
[17:07] *** patrickod is now known as patricko-
[17:09] *** primus104 has joined #archiveteam
[17:14] Is there a good way to pull sites out of the wayback machine?
[17:15] Wolfie: if the site was injected via ArchiveBot then the WARCs can be located via http://archive.fart.website/archivebot/viewer/
[17:15] if it came from the Archive-It group then you may be able to ask them
[17:15] for other cases no public access presently exists
[17:17] Well goddamnit.
[17:22] yipdw, do you know what that output I pasted from the logs refers to?
[17:23] I always assumed that was the json file
[17:23] I think those are around that big
[17:24] ah, that's not so useful :)
[17:25] is there any way to bypass the delay before being able to re-archive?
[17:25] !expire ident
[17:25] ah, nice
[17:25] does !abort do anything if the job is already done?
[17:25] not as far as I know
[17:26] but yeah I think the json file is the only thing that gets rsynced at the end of a job now, all the warcs I think get sent in the background while the job is running (or something like that?)
[17:26] I'm no expert
[17:27] any way to nuke a completed archive?
[17:27] https://archive.org/details/archiveteam_madden/v2
[17:28] maybe by nuking the IA data centers. Please don't do that...
[17:31] johtso: it's the JSON file
[17:31] makes sense
[17:31] RsyncUpload exists as a catch-all for materials not delegated to the uploader
[17:32] further discussion of archivebot behavior in #archivebot
[17:32] * johtso enters missile launch codes
[17:32] I okay, i'll keep the discussion to #archivebot .. just that it gets kind of swamped by the bot output
[17:46] johtso, you talk wayyyy too much
[17:46] But I think you struck something there.
[17:46] Hence: #archivebot-bs
[17:46] SketchCow, hahaha
[17:51] *** caber has quit IRC (Quit: Doei Doei!!!)
[17:57] *** caber has joined #archiveteam
[18:08] this was posted on one of my forums, not sure if it's a repack of something already on the Archive or instead something that should be preserved in the Magazines collection: http://www.pixsoriginadventures.co.uk/PCZone/
[18:08] there's some overlap with https://archive.org/details/pczonemagazine?sort=-date but it has some issues that aren't on the archive...
[18:16] Does someone want to run comparisons or should I do it.
[18:16] I suppose I could do it quick (download all, compare)
[19:14] gwern subpoenad http://www.reddit.com/r/DarkNetMarkets/comments/30tudk/psa_5_reddit_accounts_subpoenaed_by_ice/
[19:14] is http://www.gwern.net/ well archived?
[19:21] *** Emcy has joined #archiveteam
[19:29] *** mistym has quit IRC (Leaving)
[19:43] *** mistym has joined #archiveteam
[19:45] not sure, just to be sure archivebotted
[19:48] *** SN4T14_ has joined #archiveteam
[19:49] *** mistym has quit IRC (Remote host closed the connection)
[19:53] *** mistym has joined #archiveteam
[19:56] *** SN4T14__ has quit IRC (Ping timeout: 512 seconds)
[20:03] *** Emcy_ has joined #archiveteam
[20:07] *** Emcy has quit IRC (Ping timeout: 512 seconds)
[20:26] *** mistym has quit IRC (Remote host closed the connection)
[20:31] * schbirid archivebots midas' certain harddisk
[20:31] LALALALA
[20:39] *** mistym has joined #archiveteam
[20:50] *** Emcy_ has quit IRC (Read error: Connection reset by peer)
[20:52] *** signius has quit IRC (Read error: Operation timed out)
[20:58] *** sankin has quit IRC (Leaving.)
[21:01] *** mistym has quit IRC (Quit: Leaving)
[21:04] *** Ravenloft has quit IRC (Ping timeout: 240 seconds)
[21:06] *** signius has joined #archiveteam
[21:18] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[21:19] *** Emcy has joined #archiveteam
[21:39] *** wp494 has joined #archiveteam
[21:51] *** Stiletto has quit IRC (Ping timeout: 186 seconds)
[21:51] *** BlueMaxim has joined #archiveteam
[21:51] *** appledash has joined #archiveteam
[21:52] Hey, is there an easy way / a guide for getting the warrior image running under ESXi?
[21:54] *** Emcy_ has joined #archiveteam
[21:55] *** Emcy has quit IRC (Ping timeout: 512 seconds)
[21:58] The issue I mainly foresee is the fact that I need to assign a static IP
[21:59] *** khaoohs_ has joined #archiveteam
[21:59] *** khaoohs has quit IRC (Read error: Connection reset by peer)
[22:11] *** mistym has joined #archiveteam
[22:15] *** khaoohs_ has quit IRC (Read error: Connection reset by peer)
[22:15] *** khaoohs has joined #archiveteam
[22:16] here you go: http://edvoncken.net/2014/08/archiveteam-warrior-on-esxi/
[22:16] I haven't tried it, but if it works, please add it to the wiki
[22:29] *** philpem has quit IRC (Ping timeout: 260 seconds)
[22:46] *** wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES)
[22:50] *** NovaKing_ has quit IRC (Read error: Operation timed out)
[22:51] *** NovaKing_ has joined #archiveteam
[22:53] *** wp494 has joined #archiveteam
[23:01] *** NovaKing_ has quit IRC (Read error: Operation timed out)
[23:02] *** NovaKing_ has joined #archiveteam
[23:11] *** dashcloud has quit IRC (Read error: Operation timed out)
[23:22] *** aschmitz has quit IRC (Read error: Connection reset by peer)
[23:25] *** dashcloud has joined #archiveteam
[23:27] *** Wolfie has left
[23:44] *** Stiletto has joined #archiveteam
[23:52] *** BlueMaxim has quit IRC (Quit: Leaving)
[23:52] *** toad has joined #archiveteam