[00:06] *** Coderjoe has quit IRC (Ping timeout: 600 seconds)
[00:06] *** wp494 has quit IRC (Ping timeout: 335 seconds)
[00:06] *** nyu has quit IRC (Quit: leaving)
[00:07] *** Coderjoe has joined #archiveteam
[00:13] *** ete_ has quit IRC (Read error: Connection reset by peer)
[00:13] *** wp494 has joined #archiveteam
[00:13] *** ete_ has joined #archiveteam
[00:17] *** brayden has joined #archiveteam
[01:20] installing a new nginx+passenger seems to have made the problem go away
[01:21] for continued tracker discussion, join #warrior
[01:24] *** mistym has quit IRC (Remote host closed the connection)
[01:38] *** Kazzy has quit IRC (Quit: ZNC - http://znc.in)
[01:46] *** Kazzy has joined #archiveteam
[01:58] *** primus104 has quit IRC (Leaving.)
[02:11] *** nyu has joined #archiveteam
[02:19] *** Ymgve has quit IRC ()
[02:19] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
[02:21] *** dashcloud has joined #archiveteam
[02:27] *** Kazzy has quit IRC (Quit: ZNC - http://znc.in)
[02:36] *** Kazzy has joined #archiveteam
[02:57] *** db48x has joined #archiveteam
[03:02] *** BiggieJon has joined #archiveteam
[03:08] *** khaoohs_ has joined #archiveteam
[03:08] *** khaoohs has quit IRC (Read error: Connection reset by peer)
[03:23] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[03:24] *** dashcloud has joined #archiveteam
[03:43] *** xk_id has quit IRC (Ping timeout: 480 seconds)
[04:18] *** nyu has quit IRC (leaving)
[04:29] Did we grab ivillage?
[04:36] *** ete_ has quit IRC (Remote host closed the connection)
[04:43] *** mistym has joined #archiveteam
[04:44] SketchCow: still in progress, ~119,000 URLs to go
[04:44] Thanks.
[04:57] *** Nertsy has quit IRC (Quit: Nertsy)
[05:03] *** Nertsy has joined #archiveteam
[05:05] *** aaaaaaaaa has quit IRC (Leaving)
[05:11] *** brayden has quit IRC (Read error: Operation timed out)
[05:16] *** brayden has joined #archiveteam
[05:27] SketchCow: what's yer beef with Binstock?
[05:28] ha ha HA ha ha ha
[05:28] Imagine there's a limpwriting machine
[05:28] imagine he sat under it, on high, for most of the day
[05:28] and then wrote that editorial
[07:28] *** primus104 has joined #archiveteam
[07:48] *** mistym has quit IRC (Remote host closed the connection)
[07:59] *** primus104 has quit IRC (Leaving.)
[08:00] *** signius has quit IRC (Read error: Operation timed out)
[08:07] VonScoot: It was an idiot editorial. Someone showing how completely idiotic and clueless he was.
[08:07] Which is fine, one does not expect the online-only endgame of a long-standing magazine to have a winner at the helm.
[08:13] *** signius has joined #archiveteam
[08:43] *** ersi has quit IRC (Read error: Operation timed out)
[08:45] *** ersi has joined #archiveteam
[08:45] *** swebb sets mode: +o ersi
[08:55] *** schbirid has joined #archiveteam
[09:30] http://imgur.com/gallery/bpkHSif
[09:55] *** wp494 has quit IRC (Ping timeout: 272 seconds)
[10:11] *** primus104 has joined #archiveteam
[10:19] *** APerti has quit IRC ()
[10:20] is it normal for the warrior to restart itself?
[10:24] yep
[10:25] *** wp494 has joined #archiveteam
[10:39] *** BlueMaxim has quit IRC (Quit: Leaving)
[10:45] *** xk_id has joined #archiveteam
[11:02] *** fluff is now known as fluff_
[11:32] *** ruukasu has quit IRC (Ping timeout: 265 seconds)
[11:34] *** dashcloud has quit IRC (Read error: Operation timed out)
[11:35] cadbury_: Yeah, it's to make sure it's updated and that the project code is updated.
[11:38] *** dashcloud has joined #archiveteam
[11:45] *** Ymgve has joined #archiveteam
[11:49] *** Selanda has quit IRC (Ping timeout: 252 seconds)
[11:50] nice, i shan't worry about it then
[11:50] presumably i can start as many warriors as i like?
[12:43] *** primus has quit IRC (Read error: Connection reset by peer)
[12:49] *** MorbusIff has quit IRC (Quit: http://www.disobey.com/)
[12:52] *** Morbus has joined #archiveteam
[12:52] *** brayden_ has joined #archiveteam
[12:58] *** brayden has quit IRC (Read error: Operation timed out)
[13:00] *** brayden has joined #archiveteam
[13:01] *** Sellyme_ has quit IRC (Ping timeout: 246 seconds)
[13:04] *** brayden_ has quit IRC (Read error: Operation timed out)
[13:06] *** Sellyme has joined #archiveteam
[13:07] *** brayden has quit IRC (Read error: Operation timed out)
[13:11] "Activist investors are pushing for a Yahoo-AOL merge"???
[13:14] cadbury_: in theory, sure. however, running too many things on one IP address can get that address banned
[13:15] cadbury_: that said, if you want to get a bit more involved you can run the software outside of the warrior VM, where you'll have a lot more flexibility
[13:20] *** ruukasu has joined #archiveteam
[13:24] *** ruukasu has quit IRC (Client Quit)
[13:25] *** ruukasu has joined #archiveteam
[13:25] *** ruukasu has quit IRC (Client Quit)
[13:26] *** ruukasu has joined #archiveteam
[13:26] *** ruukasu has quit IRC (Client Quit)
[13:27] balrog: oh god
[13:27] that can only go wrong
[13:27] horribly, horribly wrong
[13:29] *** ruukasu has joined #archiveteam
[13:37] *** sankin has joined #archiveteam
[13:42] how could that go wr.
[13:43] gone.
[13:43] all gone.
[13:55] *** brayden has joined #archiveteam
[14:13] *** brayden has quit IRC (Ping timeout: 606 seconds)
[14:14] God help us all.
[14:16] *** brayden has joined #archiveteam
[14:20] *** xk_id has quit IRC (Read error: Operation timed out)
[14:32] db48x: what does running outside of the warrior change/do?
[14:34] gives you more control over what you run, essentially
[14:34] instead of running 5 VMs, just run the scripts 5 times, less overhead
[14:35] oh, well that makes sense
[14:35] presumably you can have each script running on a different port for the webui or is that a separate process?
[14:35] you can yes, or just disable it when you run the script
[14:39] yea, you can ditch the web ui, and run more concurrent downloaders (and uploaders) than the web ui limits you to
[14:39] is there much advantage to running more?
[14:39] it depends
[14:40] some projects are really banhappy
[14:40] occasionally we've been able to download so fast that we filled up our staging area
[14:41] in both cases we have the tracker apply really strong rate limits
[14:42] which means that running more workers won't really get the work done faster, although you might be able to steal a larger slice of the work
[14:43] i suppose one advantage would be being able to run 1 worker per project available
[14:44] yea, you could do that
[14:45] although I think only twitpic and urlteam are currently in progress
[14:45] multi-URL team scrapers would probably work without a problem
[14:46] yea, urlteam is an interesting case
[14:46] with those they're often scraping multiple shorteners at the same time, and you can have one work unit assigned to you for each of them
[14:47] i don't have enough spare hardware left over for more VMs
[14:47] ooh, looks like they're doing a bunch of shorteners right now, so you can go nuts
[14:49] you can run the script directly: https://github.com/ArchiveTeam/terroroftinytown-client-grab
[14:50] *** Froggypwn has quit IRC (Read error: Connection reset by peer)
[14:51] the amount of code that actually makes that work is surprisingly small
[14:52] *** ruukasu has quit IRC (Ping timeout: 265 seconds)
[14:52] *** Froggypwn has joined #archiveteam
[14:55] yep
[14:56] pipeline.py contains the code that defines what steps are necessary to process a work unit
[14:56] it's leaning heavily on Seesaw to provide most of the heavy lifting of running processes and managing concurrency and so on
[14:57] the program that actually interrogates the url shortener comes from a different git repository, but it's not very long either
[15:01] twitpic is here: https://github.com/ArchiveTeam/twitpic-grab2
[15:03] you can see that its pipeline is a bit more complex
[15:28] *** Emcy_ has quit IRC (Read error: Connection reset by peer)
[15:34] *** mistym has joined #archiveteam
[15:40] *** mistym has quit IRC (Remote host closed the connection)
[15:47] *** khaoohs_ has quit IRC (Read error: Connection reset by peer)
[15:52] *** khaoohs has joined #archiveteam
[16:01] *** fluff_ is now known as fluff
[16:02] *** mistym has joined #archiveteam
[16:03] *** Emcy has joined #archiveteam
[16:14] *** xk_id has joined #archiveteam
[16:17] *** Nemo_bis has joined #archiveteam
[16:20] *** SPF|Cloud has joined #archiveteam
[16:26] *** SPF|Cloud is now known as Southpark
[16:26] *** Southpark is now known as SPF|Cloud
[16:26] The torrent of https://archive.org/details/URLTeamTorrentRelease2013July doesn't include any file
[16:27] SketchCow: can you regenerate the torrent?
[16:27] I just set it off.
[16:28] It's been a hell of an emscripten-DOSBOX bender this week
[16:36] https://news.ycombinator.com/item?id=8767909
[16:49] *** primus104 has quit IRC (Leaving.)
[16:51] *** aaaaaaaaa has joined #archiveteam
[16:54] *** mistym has quit IRC (Remote host closed the connection)
[17:11] *** mistym has joined #archiveteam
[17:29] *** primus104 has joined #archiveteam
[18:22] *** K4k has joined #archiveteam
[18:33] SketchCow: you're porting dosbox to the browser?
[19:16] Someone already has done it, and it's done.
[19:16] Now I'm just trying to make it work with the archive.org structure, which has some unusual aspects.
[19:39] *** BlueMaxim has joined #archiveteam
[20:17] *** APerti has joined #archiveteam
[20:29] *** Ravenloft has joined #archiveteam
[20:35] *** mistym has quit IRC (Remote host closed the connection)
[20:53] *** Start has joined #archiveteam
[21:01] *** mistym has joined #archiveteam
[21:06] *** primus104 has quit IRC (Leaving.)
[21:10] no data lost, but another Yahoo acquisition apparently
[21:10] https://peercdn.com/
[21:10] PeerCDN Acquired by Yahoo!
[21:13] *** K4k has quit IRC (WeeChat 1.0.1)
[21:22] *** Start has quit IRC (Ping timeout: 365 seconds)
[21:23] *** mistym has quit IRC (Remote host closed the connection)
[21:27] *** Start has joined #archiveteam
[21:29] that's since quite a few months actually
[21:32] one of the founders started webtorrent I think right after. BitTorrent just using a browser
[21:40] *** Start has quit IRC (Quit: Leaving)
[21:40] i'm starting to upload more funny or die videos
[21:42] *** schbirid has quit IRC (Read error: Operation timed out)
[21:46] *** mistym has joined #archiveteam
[21:52] *** schbirid has joined #archiveteam
[21:53] *** sankin has quit IRC (Leaving.)
[21:58] *** primus104 has joined #archiveteam
[22:03] *** ruukasu has joined #archiveteam
[22:54] *** schbirid has quit IRC (Leaving)
[23:11] so nine to noon show on radionz is about 11gb a year
[23:21] good news is at this rate i will have the backlog of that show in the archive soon
[23:22] and then i just have to wait for christmas eve to start downloading the index of 2014 urls for that show
[23:22] they end on christmas eve and don't start back until jan ~20
[23:39] *** rejon has joined #archiveteam
[23:43] *** APerti has quit IRC (Ping timeout: 370 seconds)