#archiveteam 2014-12-18,Thu

Time Nickname Message
00:06 🔗 Coderjoe has quit IRC (Ping timeout: 600 seconds)
00:06 🔗 wp494 has quit IRC (Ping timeout: 335 seconds)
00:06 🔗 nyu has quit IRC (Quit: leaving)
00:07 🔗 Coderjoe has joined #archiveteam
00:13 🔗 ete_ has quit IRC (Read error: Connection reset by peer)
00:13 🔗 wp494 has joined #archiveteam
00:13 🔗 ete_ has joined #archiveteam
00:17 🔗 brayden has joined #archiveteam
01:20 🔗 chfoo installing a new nginx+passenger seems to have made the problem go away
01:21 🔗 chfoo for continued tracker discussion, join #warrior
01:24 🔗 mistym has quit IRC (Remote host closed the connection)
01:38 🔗 Kazzy has quit IRC (Quit: ZNC - http://znc.in)
01:46 🔗 Kazzy has joined #archiveteam
01:58 🔗 primus104 has quit IRC (Leaving.)
02:11 🔗 nyu has joined #archiveteam
02:19 🔗 Ymgve has quit IRC ()
02:19 🔗 dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
02:21 🔗 dashcloud has joined #archiveteam
02:27 🔗 Kazzy has quit IRC (Quit: ZNC - http://znc.in)
02:36 🔗 Kazzy has joined #archiveteam
02:57 🔗 db48x has joined #archiveteam
03:02 🔗 BiggieJon has joined #archiveteam
03:08 🔗 khaoohs_ has joined #archiveteam
03:08 🔗 khaoohs has quit IRC (Read error: Connection reset by peer)
03:23 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
03:24 🔗 dashcloud has joined #archiveteam
03:43 🔗 xk_id has quit IRC (Ping timeout: 480 seconds)
04:18 🔗 nyu has quit IRC (leaving)
04:29 🔗 SketchCow Did we grab ivillage?
04:36 🔗 ete_ has quit IRC (Remote host closed the connection)
04:43 🔗 mistym has joined #archiveteam
04:44 🔗 yipdw SketchCow: still in progress, ~119,000 URLs to go
04:44 🔗 SketchCow Thanks.
04:57 🔗 Nertsy has quit IRC (Quit: Nertsy)
05:03 🔗 Nertsy has joined #archiveteam
05:05 🔗 aaaaaaaaa has quit IRC (Leaving)
05:11 🔗 brayden has quit IRC (Read error: Operation timed out)
05:16 🔗 brayden has joined #archiveteam
05:27 🔗 VonScoot SketchCow: what's yer beef with Binstock?
05:28 🔗 SketchCow ha ha HA ha ha ha
05:28 🔗 SketchCow Imagine there's a limpwriting machine
05:28 🔗 SketchCow imagine he sat under it, on high, for most of the day
05:28 🔗 SketchCow and then wrote that editorial
07:28 🔗 primus104 has joined #archiveteam
07:48 🔗 mistym has quit IRC (Remote host closed the connection)
07:59 🔗 primus104 has quit IRC (Leaving.)
08:00 🔗 signius has quit IRC (Read error: Operation timed out)
08:07 🔗 SketchCow VonScoot: It was an idiot editorial. Someone showing how completely idiot and clueless he was.
08:07 🔗 SketchCow Which is fine, one does not expect the online-only endgame of a long-standing magazine to have a winner at the helm.
08:13 🔗 signius has joined #archiveteam
08:43 🔗 ersi has quit IRC (Read error: Operation timed out)
08:45 🔗 ersi has joined #archiveteam
08:45 🔗 swebb sets mode: +o ersi
08:55 🔗 schbirid has joined #archiveteam
09:30 🔗 SketchCow http://imgur.com/gallery/bpkHSif
09:55 🔗 wp494 has quit IRC (Ping timeout: 272 seconds)
10:11 🔗 primus104 has joined #archiveteam
10:19 🔗 APerti has quit IRC ()
10:20 🔗 cadbury_ is it normal for the warrior to restart itself?
10:24 🔗 midas yep
10:25 🔗 wp494 has joined #archiveteam
10:39 🔗 BlueMaxim has quit IRC (Quit: Leaving)
10:45 🔗 xk_id has joined #archiveteam
11:02 🔗 fluff is now known as fluff_
11:32 🔗 ruukasu has quit IRC (Ping timeout: 265 seconds)
11:34 🔗 dashcloud has quit IRC (Read error: Operation timed out)
11:35 🔗 ersi cadbury_: Yeah, it's to make sure it's updated and that the project code is updated.
11:38 🔗 dashcloud has joined #archiveteam
11:45 🔗 Ymgve has joined #archiveteam
11:49 🔗 Selanda has quit IRC (Ping timeout: 252 seconds)
11:50 🔗 cadbury_ nice, i shan't worry about it then
11:50 🔗 cadbury_ presumably i can start as many warriors as i like?
12:43 🔗 primus has quit IRC (Read error: Connection reset by peer)
12:49 🔗 MorbusIff has quit IRC (Quit: http://www.disobey.com/)
12:52 🔗 Morbus has joined #archiveteam
12:52 🔗 brayden_ has joined #archiveteam
12:58 🔗 brayden has quit IRC (Read error: Operation timed out)
13:00 🔗 brayden has joined #archiveteam
13:01 🔗 Sellyme_ has quit IRC (Ping timeout: 246 seconds)
13:04 🔗 brayden_ has quit IRC (Read error: Operation timed out)
13:06 🔗 Sellyme has joined #archiveteam
13:07 🔗 brayden has quit IRC (Read error: Operation timed out)
13:11 🔗 balrog "Activist investors are pushing for a Yahoo-AOL merge"???
13:14 🔗 db48x cadbury_: in theory, sure. however, running too many things on one IP address can get that address banned
13:15 🔗 db48x cadbury_: that said, if you want to get a bit more involved you can run the software outside of the warrior VM, where you'll have a lot more flexibility
13:20 🔗 ruukasu has joined #archiveteam
13:24 🔗 ruukasu has quit IRC (Client Quit)
13:25 🔗 ruukasu has joined #archiveteam
13:25 🔗 ruukasu has quit IRC (Client Quit)
13:26 🔗 ruukasu has joined #archiveteam
13:26 🔗 ruukasu has quit IRC (Client Quit)
13:27 🔗 joepie91 balrog: oh god
13:27 🔗 joepie91 that can only go wrong
13:27 🔗 joepie91 horribly, horribly wrong
13:29 🔗 ruukasu has joined #archiveteam
13:37 🔗 sankin has joined #archiveteam
13:42 🔗 midas how could that go wr.
13:43 🔗 midas gone.
13:43 🔗 midas all gone.
13:55 🔗 brayden has joined #archiveteam
14:13 🔗 brayden has quit IRC (Ping timeout: 606 seconds)
14:14 🔗 w0rp God help us all.
14:16 🔗 brayden has joined #archiveteam
14:20 🔗 xk_id has quit IRC (Read error: Operation timed out)
14:32 🔗 cadbury_ db48x: what does running outside of the warrior change/do?
14:34 🔗 Kazzy gives you more control over what you run, essentially
14:34 🔗 Kazzy instead of running 5 VMs, just run the scripts 5 times, less overhead
14:35 🔗 cadbury_ oh, well that makes sense
14:35 🔗 cadbury_ presumably you can have each script running on a different port for the webui or is that a separate process?
14:35 🔗 Kazzy you can yes, or just disable it when you run the script
14:39 🔗 db48x yea, you can ditch the web ui, and run more concurrent downloaders (and uploaders) than the web ui limits you to
14:39 🔗 cadbury_ is there much advantage to running more?
14:39 🔗 db48x it depends
14:40 🔗 db48x some projects are really banhappy
14:40 🔗 db48x occasionally we've been able to download so fast that we filled up our staging area
14:41 🔗 db48x in both cases we have the tracker apply really strong rate limits
14:42 🔗 db48x which means that running more workers won't really get the work done faster, although you might be able to steal a larger slice of the work
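The point about rate limits can be put as rough arithmetic. This is a minimal sketch, not how the tracker actually works: the function names, the rates, and the assumption that work is handed out evenly across all workers are hypothetical and chosen only for illustration.

```python
def project_throughput(total_workers, per_worker_rate, tracker_limit):
    """Items per minute the whole project finishes.

    Assumption: the tracker caps the total rate at tracker_limit,
    no matter how many workers are asking for work.
    """
    return min(total_workers * per_worker_rate, tracker_limit)


def my_share(my_workers, other_workers):
    """Fraction of the work one person gets.

    Assumption: items are spread evenly over all running workers.
    """
    return my_workers / (my_workers + other_workers)


# Doubling the workers does not raise the capped project rate...
capped_before = project_throughput(10, 5, 20)
capped_after = project_throughput(20, 5, 20)

# ...but one person running more workers does get a larger slice.
share_one = my_share(1, 9)
share_five = my_share(5, 9)
```

Under these assumptions the project rate stays pinned at the limit, while an individual's slice grows with the number of workers they run.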
14:43 🔗 cadbury_ i suppose one advantage would be being able to run 1 worker per project available
14:44 🔗 db48x yea, you could do that
14:45 🔗 db48x although I think only twitpic and urlteam are currently in progress
14:45 🔗 cadbury_ multiple urlteam scrapers would probably work without a problem
14:46 🔗 db48x yea, urlteam is an interesting case
14:46 🔗 db48x with those they're often scraping multiple shorteners at the same time, and you can have one work unit assigned to you for each of them
14:47 🔗 cadbury_ i don't have enough spare hardware left over for more VMs
14:47 🔗 db48x ooh, looks like they're doing a bunch of shorteners right now, so you can go nuts
14:49 🔗 db48x you can run the script directly: https://github.com/ArchiveTeam/terroroftinytown-client-grab
14:50 🔗 Froggypwn has quit IRC (Read error: Connection reset by peer)
14:51 🔗 cadbury_ the amount of code that actually makes that work is surprisingly small
14:52 🔗 ruukasu has quit IRC (Ping timeout: 265 seconds)
14:52 🔗 Froggypwn has joined #archiveteam
14:55 🔗 db48x yep
14:56 🔗 db48x pipeline.py contains the code that defines what steps are necessary to process a work unit
14:56 🔗 db48x it's leaning heavily on Seesaw to provide most of the heavy lifting of running processes and managing concurrency and so on
14:57 🔗 db48x the program that actually interrogates the url shortener comes from a different git repository, but it's not very long either
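The idea of a pipeline as an ordered list of steps applied to each work unit, with several units in flight at once, can be sketched in plain Python. This is not Seesaw's API and not the contents of pipeline.py: every name below (get_item, scrape, report_done, run_pipeline, run_concurrently) and the example.invalid URLs are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def get_item(item):
    # Stand-in for asking the tracker for a work unit.
    item["urls"] = [
        "http://example.invalid/%s%d" % (item["name"], i) for i in range(3)
    ]
    return item


def scrape(item):
    # Stand-in for the separate program that queries the shortener.
    # No network access here; each short URL maps to a placeholder.
    item["results"] = {url: None for url in item["urls"]}
    return item


def report_done(item):
    # Stand-in for telling the tracker the work unit is finished.
    item["done"] = True
    return item


# The "pipeline": the steps needed to process one work unit, in order.
PIPELINE = [get_item, scrape, report_done]


def run_pipeline(item):
    for step in PIPELINE:
        item = step(item)
    return item


def run_concurrently(names, concurrent=2):
    # Process several work units at once, at most `concurrent` at a time.
    with ThreadPoolExecutor(max_workers=concurrent) as pool:
        return list(pool.map(run_pipeline, ({"name": n} for n in names)))


finished = run_concurrently(["a", "b", "c"], concurrent=2)
```

Keeping the steps in a list is what keeps a project's own code short: the project only declares the steps, and a shared library handles running them and managing concurrency.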
15:01 🔗 db48x twitpic is here: https://github.com/ArchiveTeam/twitpic-grab2
15:03 🔗 db48x you can see that its pipeline is a bit more complex
15:28 🔗 Emcy_ has quit IRC (Read error: Connection reset by peer)
15:34 🔗 mistym has joined #archiveteam
15:40 🔗 mistym has quit IRC (Remote host closed the connection)
15:47 🔗 khaoohs_ has quit IRC (Read error: Connection reset by peer)
15:52 🔗 khaoohs has joined #archiveteam
16:01 🔗 fluff_ is now known as fluff
16:02 🔗 mistym has joined #archiveteam
16:03 🔗 Emcy has joined #archiveteam
16:14 🔗 xk_id has joined #archiveteam
16:17 🔗 Nemo_bis has joined #archiveteam
16:20 🔗 SPF|Cloud has joined #archiveteam
16:26 🔗 SPF|Cloud is now known as Southpark
16:26 🔗 Southpark is now known as SPF|Cloud
16:26 🔗 Nemo_bis The torrent of https://archive.org/details/URLTeamTorrentRelease2013July doesn't include any file
16:27 🔗 Nemo_bis SketchCow: can you regenerate the torrent?
16:27 🔗 SketchCow I just set it off.
16:28 🔗 SketchCow It's been a hell of an emscripten-DOSBOX bender this week
16:36 🔗 schbirid https://news.ycombinator.com/item?id=8767909
16:49 🔗 primus104 has quit IRC (Leaving.)
16:51 🔗 aaaaaaaaa has joined #archiveteam
16:54 🔗 mistym has quit IRC (Remote host closed the connection)
17:11 🔗 mistym has joined #archiveteam
17:29 🔗 primus104 has joined #archiveteam
18:22 🔗 K4k has joined #archiveteam
18:33 🔗 raylee SketchCow: you're porting dosbox to the browser?
19:16 🔗 SketchCow Someone already has done it, and it's done.
19:16 🔗 SketchCow Now I'm just trying to make it work with the archive.org structure, which has some unusual aspects.
19:39 🔗 BlueMaxim has joined #archiveteam
20:17 🔗 APerti has joined #archiveteam
20:29 🔗 Ravenloft has joined #archiveteam
20:35 🔗 mistym has quit IRC (Remote host closed the connection)
20:53 🔗 Start has joined #archiveteam
21:01 🔗 mistym has joined #archiveteam
21:06 🔗 primus104 has quit IRC (Leaving.)
21:10 🔗 joepie91 no data lost, but another Yahoo acquisition apparently
21:10 🔗 joepie91 https://peercdn.com/
21:10 🔗 joepie91 PeerCDN Acquired by Yahoo!
21:13 🔗 K4k has quit IRC (WeeChat 1.0.1)
21:22 🔗 Start has quit IRC (Ping timeout: 365 seconds)
21:23 🔗 mistym has quit IRC (Remote host closed the connection)
21:27 🔗 Start has joined #archiveteam
21:29 🔗 deathy that was quite a few months ago actually
21:32 🔗 deathy one of the founders started webtorrent I think right after. BitTorrent just using a browser
21:40 🔗 Start has quit IRC (Quit: Leaving)
21:40 🔗 godane i'm starting to upload more funny or die videos
21:42 🔗 schbirid has quit IRC (Read error: Operation timed out)
21:46 🔗 mistym has joined #archiveteam
21:52 🔗 schbirid has joined #archiveteam
21:53 🔗 sankin has quit IRC (Leaving.)
21:58 🔗 primus104 has joined #archiveteam
22:03 🔗 ruukasu has joined #archiveteam
22:54 🔗 schbirid has quit IRC (Leaving)
23:11 🔗 godane so nine to noon show on radionz is about 11gb a year
23:21 🔗 godane good news is at this rate i will have the backlog of that show in the archive soon
23:22 🔗 godane and then i just have to wait for christmas eve to start downloading the index of 2014 urls for that show
23:22 🔗 godane they end on christmas eve and don't start back until jan ~20
23:39 🔗 rejon has joined #archiveteam
23:43 🔗 APerti has quit IRC (Ping timeout: 370 seconds)
