| Time |
Nickname |
Message |
|
00:06
🔗
|
|
Coderjoe has quit IRC (Ping timeout: 600 seconds) |
|
00:06
🔗
|
|
wp494 has quit IRC (Ping timeout: 335 seconds) |
|
00:06
🔗
|
|
nyu has quit IRC (Quit: leaving) |
|
00:07
🔗
|
|
Coderjoe has joined #archiveteam |
|
00:13
🔗
|
|
ete_ has quit IRC (Read error: Connection reset by peer) |
|
00:13
🔗
|
|
wp494 has joined #archiveteam |
|
00:13
🔗
|
|
ete_ has joined #archiveteam |
|
00:17
🔗
|
|
brayden has joined #archiveteam |
|
01:20
🔗
|
chfoo |
installing a new nginx+passenger seems to have made the problem go away |
|
01:21
🔗
|
chfoo |
for continued tracker discussion, join #warrior |
|
01:24
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
01:38
🔗
|
|
Kazzy has quit IRC (Quit: ZNC - http://znc.in) |
|
01:46
🔗
|
|
Kazzy has joined #archiveteam |
|
01:58
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
02:11
🔗
|
|
nyu has joined #archiveteam |
|
02:19
🔗
|
|
Ymgve has quit IRC () |
|
02:19
🔗
|
|
dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.) |
|
02:21
🔗
|
|
dashcloud has joined #archiveteam |
|
02:27
🔗
|
|
Kazzy has quit IRC (Quit: ZNC - http://znc.in) |
|
02:36
🔗
|
|
Kazzy has joined #archiveteam |
|
02:57
🔗
|
|
db48x has joined #archiveteam |
|
03:02
🔗
|
|
BiggieJon has joined #archiveteam |
|
03:08
🔗
|
|
khaoohs_ has joined #archiveteam |
|
03:08
🔗
|
|
khaoohs has quit IRC (Read error: Connection reset by peer) |
|
03:23
🔗
|
|
dashcloud has quit IRC (Read error: Connection reset by peer) |
|
03:24
🔗
|
|
dashcloud has joined #archiveteam |
|
03:43
🔗
|
|
xk_id has quit IRC (Ping timeout: 480 seconds) |
|
04:18
🔗
|
|
nyu has quit IRC (leaving) |
|
04:29
🔗
|
SketchCow |
Did we grab ivillage? |
|
04:36
🔗
|
|
ete_ has quit IRC (Remote host closed the connection) |
|
04:43
🔗
|
|
mistym has joined #archiveteam |
|
04:44
🔗
|
yipdw |
SketchCow: still in progress, ~119,000 URLs to go |
|
04:44
🔗
|
SketchCow |
Thanks. |
|
04:57
🔗
|
|
Nertsy has quit IRC (Quit: Nertsy) |
|
05:03
🔗
|
|
Nertsy has joined #archiveteam |
|
05:05
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
|
05:11
🔗
|
|
brayden has quit IRC (Read error: Operation timed out) |
|
05:16
🔗
|
|
brayden has joined #archiveteam |
|
05:27
🔗
|
VonScoot |
SketchCow: what's yer beef with Binstock? |
|
05:28
🔗
|
SketchCow |
ha ha HA ha ha ha |
|
05:28
🔗
|
SketchCow |
Imagine there's a limpwriting machine |
|
05:28
🔗
|
SketchCow |
imagine he sat under it, on high, for most of the day |
|
05:28
🔗
|
SketchCow |
and then wrote that editorial |
|
07:28
🔗
|
|
primus104 has joined #archiveteam |
|
07:48
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
07:59
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
08:00
🔗
|
|
signius has quit IRC (Read error: Operation timed out) |
|
08:07
🔗
|
SketchCow |
VonScoot: It was an idiot editorial. Someone showing how completely idiot and clueless he was. |
|
08:07
🔗
|
SketchCow |
Which is fine, one does not expect the online-only endgame of a long-standing magazine to have a winner at the helm. |
|
08:13
🔗
|
|
signius has joined #archiveteam |
|
08:43
🔗
|
|
ersi has quit IRC (Read error: Operation timed out) |
|
08:45
🔗
|
|
ersi has joined #archiveteam |
|
08:45
🔗
|
|
swebb sets mode: +o ersi |
|
08:55
🔗
|
|
schbirid has joined #archiveteam |
|
09:30
🔗
|
SketchCow |
http://imgur.com/gallery/bpkHSif |
|
09:55
🔗
|
|
wp494 has quit IRC (Ping timeout: 272 seconds) |
|
10:11
🔗
|
|
primus104 has joined #archiveteam |
|
10:19
🔗
|
|
APerti has quit IRC () |
|
10:20
🔗
|
cadbury_ |
is it normal for the warrior to restart itself? |
|
10:24
🔗
|
midas |
yep |
|
10:25
🔗
|
|
wp494 has joined #archiveteam |
|
10:39
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
10:45
🔗
|
|
xk_id has joined #archiveteam |
|
11:02
🔗
|
|
fluff is now known as fluff_ |
|
11:32
🔗
|
|
ruukasu has quit IRC (Ping timeout: 265 seconds) |
|
11:34
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
11:35
🔗
|
ersi |
cadbury_: Yeah, it's to make sure it's updated and that the project code is updated. |
|
11:38
🔗
|
|
dashcloud has joined #archiveteam |
|
11:45
🔗
|
|
Ymgve has joined #archiveteam |
|
11:49
🔗
|
|
Selanda has quit IRC (Ping timeout: 252 seconds) |
|
11:50
🔗
|
cadbury_ |
nice, i shan't worry about it then |
|
11:50
🔗
|
cadbury_ |
presumably i can start as many warriors as i like? |
|
12:43
🔗
|
|
primus has quit IRC (Read error: Connection reset by peer) |
|
12:49
🔗
|
|
MorbusIff has quit IRC (Quit: http://www.disobey.com/) |
|
12:52
🔗
|
|
Morbus has joined #archiveteam |
|
12:52
🔗
|
|
brayden_ has joined #archiveteam |
|
12:58
🔗
|
|
brayden has quit IRC (Read error: Operation timed out) |
|
13:00
🔗
|
|
brayden has joined #archiveteam |
|
13:01
🔗
|
|
Sellyme_ has quit IRC (Ping timeout: 246 seconds) |
|
13:04
🔗
|
|
brayden_ has quit IRC (Read error: Operation timed out) |
|
13:06
🔗
|
|
Sellyme has joined #archiveteam |
|
13:07
🔗
|
|
brayden has quit IRC (Read error: Operation timed out) |
|
13:11
🔗
|
balrog |
"Activist investors are pushing for a Yahoo-AOL merge"??? |
|
13:14
🔗
|
db48x |
cadbury_: in theory, sure. however, running too many things on one IP address can get that address banned |
|
13:15
🔗
|
db48x |
cadbury_: that said, if you want to get a bit more involved you can run the software outside of the warrior VM, where you'll have a lot more flexibility |
|
13:20
🔗
|
|
ruukasu has joined #archiveteam |
|
13:24
🔗
|
|
ruukasu has quit IRC (Client Quit) |
|
13:25
🔗
|
|
ruukasu has joined #archiveteam |
|
13:25
🔗
|
|
ruukasu has quit IRC (Client Quit) |
|
13:26
🔗
|
|
ruukasu has joined #archiveteam |
|
13:26
🔗
|
|
ruukasu has quit IRC (Client Quit) |
|
13:27
🔗
|
joepie91 |
balrog: oh god |
|
13:27
🔗
|
joepie91 |
that can only go wrong |
|
13:27
🔗
|
joepie91 |
horribly, horribly wrong |
|
13:29
🔗
|
|
ruukasu has joined #archiveteam |
|
13:37
🔗
|
|
sankin has joined #archiveteam |
|
13:42
🔗
|
midas |
how could that go wr. |
|
13:43
🔗
|
midas |
gone. |
|
13:43
🔗
|
midas |
all gone. |
|
13:55
🔗
|
|
brayden has joined #archiveteam |
|
14:13
🔗
|
|
brayden has quit IRC (Ping timeout: 606 seconds) |
|
14:14
🔗
|
w0rp |
God help us all. |
|
14:16
🔗
|
|
brayden has joined #archiveteam |
|
14:20
🔗
|
|
xk_id has quit IRC (Read error: Operation timed out) |
|
14:32
🔗
|
cadbury_ |
db48x: what does running outside of the warrior change/do? |
|
14:34
🔗
|
Kazzy |
gives you more control over what you run, essentially |
|
14:34
🔗
|
Kazzy |
instead of running 5 VMs, just run the scripts 5 times, less overhead |
|
14:35
🔗
|
cadbury_ |
oh, well that makes sense |
|
14:35
🔗
|
cadbury_ |
presumably you can have each script running on a different port for the webui or is that a separate process? |
|
14:35
🔗
|
Kazzy |
you can yes, or just disable it when you run the script |
|
14:39
🔗
|
db48x |
yea, you can ditch the web ui, and run more concurrent downloaders (and uploaders) than the web ui limits you to |
|
14:39
🔗
|
cadbury_ |
is there much advantage to running more? |
|
14:39
🔗
|
db48x |
it depends |
|
14:40
🔗
|
db48x |
some projects are really banhappy |
|
14:40
🔗
|
db48x |
occasionally we've been able to download so fast that we filled up our staging area |
|
14:41
🔗
|
db48x |
in both cases we have the tracker apply really strong rate limits |
|
14:42
🔗
|
db48x |
which means that running more workers won't really get the work done faster, although you might be able to steal a larger slice of the work |
|
14:43
🔗
|
cadbury_ |
i suppose one advantage would be being able to run 1 worker per project available |
|
14:44
🔗
|
db48x |
yea, you could do that |
|
14:45
🔗
|
db48x |
although I think only twitpic and urlteam are currently in progress |
|
14:45
🔗
|
cadbury_ |
multiple urlteam scrapers would probably work without a problem |
|
14:46
🔗
|
db48x |
yea, urlteam is an interesting case |
|
14:46
🔗
|
db48x |
with those they're often scraping multiple shorteners at the same time, and you can have one work unit assigned to you for each of them |
|
14:47
🔗
|
cadbury_ |
i don't have enough spare hardware left over for more VMs |
|
14:47
🔗
|
db48x |
ooh, looks like they're doing a bunch of shorteners right now, so you can go nuts |
|
14:49
🔗
|
db48x |
you can run the script directly: https://github.com/ArchiveTeam/terroroftinytown-client-grab |
|
14:50
🔗
|
|
Froggypwn has quit IRC (Read error: Connection reset by peer) |
|
14:51
🔗
|
cadbury_ |
the amount of code that actually makes that work is surprisingly small |
|
14:52
🔗
|
|
ruukasu has quit IRC (Ping timeout: 265 seconds) |
|
14:52
🔗
|
|
Froggypwn has joined #archiveteam |
|
14:55
🔗
|
db48x |
yep |
|
14:56
🔗
|
db48x |
pipeline.py contains the code that defines what steps are necessary to process a work unit |
|
14:56
🔗
|
db48x |
it's leaning heavily on Seesaw to provide most of the heavy lifting of running processes and managing concurrency and so on |
|
14:57
🔗
|
db48x |
the program that actually interrogates the url shortener comes from a different git repository, but it's not very long either |
|
15:01
🔗
|
db48x |
twitpic is here: https://github.com/ArchiveTeam/twitpic-grab2 |
|
15:03
🔗
|
db48x |
you can see that its pipeline is a bit more complex |
|
15:28
🔗
|
|
Emcy_ has quit IRC (Read error: Connection reset by peer) |
|
15:34
🔗
|
|
mistym has joined #archiveteam |
|
15:40
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
15:47
🔗
|
|
khaoohs_ has quit IRC (Read error: Connection reset by peer) |
|
15:52
🔗
|
|
khaoohs has joined #archiveteam |
|
16:01
🔗
|
|
fluff_ is now known as fluff |
|
16:02
🔗
|
|
mistym has joined #archiveteam |
|
16:03
🔗
|
|
Emcy has joined #archiveteam |
|
16:14
🔗
|
|
xk_id has joined #archiveteam |
|
16:17
🔗
|
|
Nemo_bis has joined #archiveteam |
|
16:20
🔗
|
|
SPF|Cloud has joined #archiveteam |
|
16:26
🔗
|
|
SPF|Cloud is now known as Southpark |
|
16:26
🔗
|
|
Southpark is now known as SPF|Cloud |
|
16:26
🔗
|
Nemo_bis |
The torrent of https://archive.org/details/URLTeamTorrentRelease2013July doesn't include any file |
|
16:27
🔗
|
Nemo_bis |
SketchCow: can you regenerate the torrent? |
|
16:27
🔗
|
SketchCow |
I just set it off. |
|
16:28
🔗
|
SketchCow |
It's been a hell of an emscripten-DOSBOX Bender this week |
|
16:36
🔗
|
schbirid |
https://news.ycombinator.com/item?id=8767909 |
|
16:49
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
16:51
🔗
|
|
aaaaaaaaa has joined #archiveteam |
|
16:54
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
17:11
🔗
|
|
mistym has joined #archiveteam |
|
17:29
🔗
|
|
primus104 has joined #archiveteam |
|
18:22
🔗
|
|
K4k has joined #archiveteam |
|
18:33
🔗
|
raylee |
SketchCow: you're porting dosbox to the browser? |
|
19:16
🔗
|
SketchCow |
Someone already has done it, and it's done. |
|
19:16
🔗
|
SketchCow |
Now I'm just trying to make it work with the archive.org structure, which has some unusual aspects. |
|
19:39
🔗
|
|
BlueMaxim has joined #archiveteam |
|
20:17
🔗
|
|
APerti has joined #archiveteam |
|
20:29
🔗
|
|
Ravenloft has joined #archiveteam |
|
20:35
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
20:53
🔗
|
|
Start has joined #archiveteam |
|
21:01
🔗
|
|
mistym has joined #archiveteam |
|
21:06
🔗
|
|
primus104 has quit IRC (Leaving.) |
|
21:10
🔗
|
joepie91 |
no data lost, but another Yahoo acquisition apparently |
|
21:10
🔗
|
joepie91 |
https://peercdn.com/ |
|
21:10
🔗
|
joepie91 |
PeerCDN Acquired by Yahoo! |
|
21:13
🔗
|
|
K4k has quit IRC (WeeChat 1.0.1) |
|
21:22
🔗
|
|
Start has quit IRC (Ping timeout: 365 seconds) |
|
21:23
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
|
21:27
🔗
|
|
Start has joined #archiveteam |
|
21:29
🔗
|
deathy |
that's been the case for quite a few months actually |
|
21:32
🔗
|
deathy |
one of the founders started webtorrent right after, I think. it's BitTorrent using just a browser |
|
21:40
🔗
|
|
Start has quit IRC (Quit: Leaving) |
|
21:40
🔗
|
godane |
i'm starting to upload more funny or die videos |
|
21:42
🔗
|
|
schbirid has quit IRC (Read error: Operation timed out) |
|
21:46
🔗
|
|
mistym has joined #archiveteam |
|
21:52
🔗
|
|
schbirid has joined #archiveteam |
|
21:53
🔗
|
|
sankin has quit IRC (Leaving.) |
|
21:58
🔗
|
|
primus104 has joined #archiveteam |
|
22:03
🔗
|
|
ruukasu has joined #archiveteam |
|
22:54
🔗
|
|
schbirid has quit IRC (Leaving) |
|
23:11
🔗
|
godane |
so the nine to noon show on radionz is about 11gb a year |
|
23:21
🔗
|
godane |
good news is at this rate i will have the backlog of that show in the archive soon |
|
23:22
🔗
|
godane |
and then i just have to wait for christmas eve to start downloading the index of 2014 urls for that show |
|
23:22
🔗
|
godane |
they end on christmas eve and don't start back until jan ~20 |
|
23:39
🔗
|
|
rejon has joined #archiveteam |
|
23:43
🔗
|
|
APerti has quit IRC (Ping timeout: 370 seconds) |