Time |
Nickname |
Message |
00:07
🔗
|
JesseW |
X done: 3.5-> 1.3 |
00:12
🔗
|
|
JetBalsa has quit IRC (Read error: Operation timed out) |
00:13
🔗
|
|
JetBalsa has joined #archiveteam-bs |
00:22
🔗
|
HCross |
Anyone got an rsync target we could use for the livejournal discovery? Only 5GB or so at max I think, mainly because my target is dying |
00:35
🔗
|
JesseW |
Y done: 6 -> 2.2 |
00:35
🔗
|
bsmith093 |
HCross yo |
00:36
🔗
|
bsmith093 |
HCross: i have plenty of space, i'm in |
00:37
🔗
|
HCross |
bsmith093, pm your target to arkiver please |
00:38
🔗
|
JesseW |
Z done: 0.333 -> 0.121 |
00:38
🔗
|
JesseW |
Total (excluding H, N, T and misc) is only 75G -- zip compression works well, apparently. |
00:42
🔗
|
JesseW |
Now pushing them up to FOS |
00:43
🔗
|
bsmith093 |
JesseW: damn, that much better |
00:43
🔗
|
bsmith093 |
JesseW: how do i set myself up as an rsync target? |
00:44
🔗
|
JesseW |
well, don't forget that there's about 100GB remaining in those last 3 letters |
00:44
🔗
|
JesseW |
bsmith093: IDK -- I haven't done this. Probably ask arkiver or HCross |
00:45
🔗
|
HCross |
http://www.archiveteam.org/index.php?title=Dev/Staging everything until Megawarc factory |
00:47
🔗
|
bsmith093 |
HCross thanks |
00:53
🔗
|
bsmith093 |
HCross arkiver ready |
00:54
🔗
|
HCross |
I think hes asleep now, but will get it tomorrow |
00:54
🔗
|
bsmith093 |
HCross can you test if it works? |
00:55
🔗
|
HCross |
Sure |
00:56
🔗
|
HCross |
PM me the info |
01:26
🔗
|
|
balrog has quit IRC (Bye) |
01:37
🔗
|
|
balrog has joined #archiveteam-bs |
01:37
🔗
|
|
swebb sets mode: +o balrog |
02:25
🔗
|
JesseW |
OK, now extracting the metadata from the whole grab |
02:29
🔗
|
yipdw |
is there anyone here using debian sid, and if so, how unstable is it |
02:29
🔗
|
yipdw |
Kubuntu's inability to reliably reboot or restore touchpad settings has finally pissed me off |
02:40
🔗
|
|
Microguru has joined #archiveteam-bs |
02:47
🔗
|
bsmith093 |
HCross i could just pull the data using rsync, you dont have to push |
03:08
🔗
|
|
bwn has quit IRC (Ping timeout: 492 seconds) |
03:17
🔗
|
|
ppsym has joined #archiveteam-bs |
03:18
🔗
|
|
altlabel has quit IRC (Ping timeout: 258 seconds) |
03:21
🔗
|
|
PurpleSym has quit IRC (Ping timeout: 506 seconds) |
03:23
🔗
|
|
ppsym is now known as PurpleSym |
03:36
🔗
|
JesseW |
Microguru: yeah, #archiveteam uses IRC slightly unusually, I think. |
03:37
🔗
|
Microguru |
We kinda need to, considering that this is where most of the coordination happens. |
03:46
🔗
|
|
PotcFdk has quit IRC (Remote host closed the connection) |
03:49
🔗
|
Microguru |
there's a post on the AT wiki about how to use statistics to estimate the number of pages on a site using repeated clicks of the random button or something like that. I have a site I'm starting to gather data on for archival, and I want to verify my previous estimate. I can't find that page. anyone know what it was again |
03:49
🔗
|
|
PotcFdk has joined #archiveteam-bs |
03:50
🔗
|
xmc |
huh interesting |
03:50
🔗
|
xmc |
sounds like a straightforward application of the math used to solve the German Tank Problem |
03:50
🔗
|
Microguru |
thank you. that's what I was thinking of. |
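A minimal sketch of the German Tank Problem estimate xmc refers to. It assumes the site numbers its pages sequentially from 1 and that the random button picks uniformly among them; neither is confirmed in the log. The function name and the sample IDs are hypothetical.

```python
def estimate_total(observed_ids):
    """Estimate the highest page ID from a sample of observed IDs.

    Uses the classic frequentist estimate N ~ m + m/k - 1, where m is the
    largest ID seen and k is the number of distinct IDs seen.
    Assumption: IDs run 1..N with no gaps.
    """
    distinct = set(observed_ids)  # repeated clicks may return the same page
    if not distinct:
        raise ValueError("need at least one observed ID")
    k = len(distinct)
    m = max(distinct)
    return m + m / k - 1


# Hypothetical IDs collected by clicking the random button five times.
sample = [19, 40, 42, 60, 19]
print(estimate_total(sample))  # 74.0
```

If the site's IDs have gaps (deleted pages, for instance), the result is closer to an estimate of the highest ID than of the page count.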
03:55
🔗
|
JesseW |
well, I'll be uploading stuff to FOS for a while -- I have about 75GB to upload, and I'm getting ~ 0.3MB/s. :-/ |
04:01
🔗
|
|
PurpleSym has quit IRC (*) |
04:01
🔗
|
|
ppsym has joined #archiveteam-bs |
04:01
🔗
|
|
ppsym is now known as PurpleSym |
04:11
🔗
|
|
bwn has joined #archiveteam-bs |
04:41
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
04:44
🔗
|
godane |
SketchCow: i'm up to 2012-05-15 with kpfa |
04:44
🔗
|
godane |
i think i went through 2 months in one day |
04:45
🔗
|
godane |
and you may get a third before i go to bed |
05:07
🔗
|
bsmith093 |
JesseW 75GB/300KBps = 250 000 seconds or about 2.894 days |
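The arithmetic above can be checked with a short sketch. It assumes decimal units (1 GB = 1,000,000 KB) and a constant transfer rate; the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def transfer_time(size_gb, rate_kb_per_s):
    """Return (seconds, days) needed to move size_gb at rate_kb_per_s.

    Assumption: decimal units, so 1 GB = 1,000,000 KB.
    """
    seconds = size_gb * 1_000_000 / rate_kb_per_s
    return seconds, seconds / SECONDS_PER_DAY


seconds, days = transfer_time(75, 300)
print(f"{seconds:,.0f} s, about {days:.3f} days")  # 250,000 s, about 2.894 days
```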
05:10
🔗
|
yipdw |
maybe I've just been watching a lot of Star Trek lately, but a lot of the recent chatter sounds Vulcan |
05:10
🔗
|
yipdw |
and I don't mean that in a positive way |
05:13
🔗
|
Frogging |
wat |
05:15
🔗
|
yipdw |
it's probably just the Star Trek, carry on |
05:32
🔗
|
|
JesseW has joined #archiveteam-bs |
05:54
🔗
|
bsmith093 |
JesseW: how goes the csv script? |
05:56
🔗
|
|
Sk1d has quit IRC (Ping timeout: 194 seconds) |
05:57
🔗
|
JesseW |
bsmith093: currently crunching through the 187,361 files in Naruto |
05:57
🔗
|
JesseW |
seems to be working well |
05:57
🔗
|
JesseW |
Fanfiction_B.zip has an ETA of about 3 more hours. |
05:58
🔗
|
bsmith093 |
jesus, that's almost athrid as poopular as harry potter with 600K |
05:58
🔗
|
bsmith093 |
wow, typos :P |
05:59
🔗
|
bsmith093 |
JesseW: so all the zips together , how big? |
06:01
🔗
|
JesseW |
75GB |
06:01
🔗
|
JesseW |
(and remember, I still don't have the last big 3, which are about ~100GB (uncompressed)) |
06:02
🔗
|
|
Sk1d has joined #archiveteam-bs |
06:04
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
06:04
🔗
|
bsmith093 |
oh, right |
06:16
🔗
|
Frogging |
JesseW needs a bouncer :p |
06:16
🔗
|
godane |
you guys may be getting a sean hannity collection at some point |
06:32
🔗
|
|
bwn has quit IRC (Ping timeout: 492 seconds) |
06:51
🔗
|
|
metalcamp has joined #archiveteam-bs |
07:31
🔗
|
|
bwn has joined #archiveteam-bs |
07:48
🔗
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
08:04
🔗
|
|
bsmith093 has quit IRC (Ping timeout: 370 seconds) |
08:19
🔗
|
|
bsmith093 has joined #archiveteam-bs |
08:30
🔗
|
godane |
SketchCow: looks like i found more Premiere Interactive radio shows |
08:31
🔗
|
godane |
one called Jim Rome and it goes back to 2005 |
08:31
🔗
|
godane |
and is open |
08:40
🔗
|
godane |
so Jim Rome Show is a sports show |
08:40
🔗
|
|
DFJustin has quit IRC (Read error: Connection reset by peer) |
08:40
🔗
|
|
DFJustin has joined #archiveteam-bs |
08:40
🔗
|
|
swebb sets mode: +o DFJustin |
08:40
🔗
|
godane |
plus side we will get an interview with Armstrong on jan 3 2005 hour 3 |
08:54
🔗
|
|
superkuh has joined #archiveteam-bs |
09:06
🔗
|
|
lytv has quit IRC (Ping timeout: 244 seconds) |
09:07
🔗
|
|
JetBalsa has quit IRC (Read error: Operation timed out) |
09:07
🔗
|
|
lytv has joined #archiveteam-bs |
09:07
🔗
|
|
JetBalsa has joined #archiveteam-bs |
09:11
🔗
|
|
schbirid has joined #archiveteam-bs |
09:32
🔗
|
|
RichardG has joined #archiveteam-bs |
09:56
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
10:18
🔗
|
|
brayden has quit IRC (Quit: Leaving) |
10:19
🔗
|
|
brayden has joined #archiveteam-bs |
10:19
🔗
|
|
swebb sets mode: +o brayden |
11:37
🔗
|
|
dan- has quit IRC (Quit: Nyan nyan) |
12:30
🔗
|
|
HCross2 has quit IRC () |
12:34
🔗
|
|
dan- has joined #archiveteam-bs |
12:53
🔗
|
|
metalcamp has joined #archiveteam-bs |
13:15
🔗
|
|
HCross2 has joined #archiveteam-bs |
13:20
🔗
|
HCross |
bsmith093, ping |
14:02
🔗
|
|
pgoetz has quit IRC (Remote host closed the connection) |
14:19
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
15:34
🔗
|
|
Start has joined #archiveteam-bs |
15:55
🔗
|
|
RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) |
15:59
🔗
|
|
RichardG has joined #archiveteam-bs |
16:07
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
16:13
🔗
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
16:28
🔗
|
|
JesseW has joined #archiveteam-bs |
16:40
🔗
|
|
metalcamp has joined #archiveteam-bs |
16:49
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
17:39
🔗
|
bsmith093 |
HCross pong |
17:41
🔗
|
HCross |
Anyone good at dealing with really strange errors? http://paste.nerds.io/ucimumopul.avrasm |
17:43
🔗
|
|
Start has joined #archiveteam-bs |
17:44
🔗
|
phuzion |
HCross: What's the context of the error? rsyncing to a newsbuddy worker? |
17:44
🔗
|
HCross |
yeah |
17:45
🔗
|
|
metalcamp has quit IRC (Ping timeout: 250 seconds) |
17:47
🔗
|
phuzion |
Is it reproducible? |
17:47
🔗
|
HCross |
Yes, its happening every time it rsyncs to a worker |
17:47
🔗
|
phuzion |
A specific worker? |
17:47
🔗
|
HCross |
all of them |
17:48
🔗
|
phuzion |
Can you reproduce the error with a manual command? |
17:50
🔗
|
phuzion |
Also, perhaps try reinstalling the rsync package through your package manager. |
17:52
🔗
|
|
signius_ has quit IRC (Read error: Operation timed out) |
18:02
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
18:06
🔗
|
|
signius_ has joined #archiveteam-bs |
18:07
🔗
|
HCross |
phuzion, works with other rsyncs manually |
18:10
🔗
|
|
metalcamp has joined #archiveteam-bs |
18:10
🔗
|
|
JW_work has joined #archiveteam-bs |
18:13
🔗
|
phuzion |
HCross: Can you figure out the EXACT command that is being issued by the newsbuddy script and replicate that on the command line? |
18:15
🔗
|
arkiver |
rsync -avz --no-o --no-g --progress --remove-source-files list-videos_temp" + str(i) + " " + rsync_targets[i] |
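The line arkiver pasted is a Python string concatenation, so the exact command depends on `i` and `rsync_targets`. A hedged sketch of how one might rebuild it as an argument list and print the precise command to paste into a shell; this is not newsbuddy's actual code, and the function name and target value are hypothetical.

```python
import shlex


def build_rsync_command(i, rsync_targets):
    """Rebuild the rsync invocation from the pasted line as an argument list."""
    return [
        "rsync", "-avz", "--no-o", "--no-g", "--progress",
        "--remove-source-files",
        "list-videos_temp" + str(i),  # source path, as in the pasted line
        rsync_targets[i],             # destination for worker i
    ]


# Hypothetical target; the real values are not shown in the log.
targets = ["example.invalid::module/"]
cmd = build_rsync_command(0, targets)

# Print the command with shell quoting so it can be replayed by hand.
print(shlex.join(cmd))
```

Passing the same list to `subprocess.run(cmd, check=True)` would run it without going through a shell, which keeps the manual and scripted invocations comparable.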
18:15
🔗
|
HCross |
when sourceforge responds |
18:15
🔗
|
HCross |
will test, ty |
18:16
🔗
|
HCross |
worked fine on the command line |
18:17
🔗
|
phuzion |
Strange. |
18:17
🔗
|
arkiver |
well, try reinstalling |
18:17
🔗
|
arkiver |
rsync ^ |
18:17
🔗
|
HCross |
ok, can you pause the livejournal stuff please arkiver |
18:18
🔗
|
|
Start has joined #archiveteam-bs |
18:18
🔗
|
arkiver |
paused. what's the problem with it? |
18:18
🔗
|
HCross |
its going to the same server, thats having the rsync issues |
18:21
🔗
|
HCross |
reinstalled, now we wait |
18:26
🔗
|
HCross |
arkiver, ready for an unpause |
18:27
🔗
|
arkiver |
right |
18:27
🔗
|
arkiver |
do !start |
18:28
🔗
|
HCross |
on livejournal? |
18:28
🔗
|
arkiver |
oh sorry |
18:29
🔗
|
arkiver |
to unpause newsbuddy do !start |
18:29
🔗
|
arkiver |
I'll restart livejournal |
18:29
🔗
|
HCross |
I didnt pause it |
18:29
🔗
|
arkiver |
restarted! |
18:29
🔗
|
HCross |
I timed it so it wasnt rsyncing out |
18:29
🔗
|
arkiver |
Right |
18:30
🔗
|
bsmith093 |
HCross so am I still needed? |
18:34
🔗
|
HCross |
I should ask - any rsync experts around to give bsmith093 a hand? Trying to get an rsync target setup and his network is being strange |
18:35
🔗
|
JW_work |
Could someone add https://github.com/matteobrusa/TumblrToStaticExporter to the wiki page for Tumblr? |
18:38
🔗
|
bsmith093 |
i know port forwarding works, i have another one open just fine. |
18:43
🔗
|
bsmith093 |
arkiver: help with my rsync settings please, i cant seem to get the port to open on my linksys ea6500 router |
18:44
🔗
|
HCross |
phuzion, its still doing it http://paste.nerds.io/owomiyunub.erl |
18:48
🔗
|
phuzion |
HCross: rsync --version |
18:49
🔗
|
HCross |
rsync version 3.1.1 protocol version 31 |
18:49
🔗
|
bsmith093 |
phuzion: rsync version 3.1.1 protocol version 31 |
18:49
🔗
|
bsmith093 |
ditto for me |
18:49
🔗
|
phuzion |
bsmith093: Are you experiencing the same error that HCross is? |
18:50
🔗
|
bsmith093 |
phuzion: he's the other end of the connection i'm trying to make |
18:50
🔗
|
HCross |
phuzion, hes having NAT issues with getting files in, he isnt having the issue |
18:50
🔗
|
HCross |
bsmith093, im talking about something different |
18:50
🔗
|
phuzion |
Oh. Let's troubleshoot one thing at a time. |
18:50
🔗
|
HCross |
yeah, good plan |
18:50
🔗
|
bsmith093 |
HCross sorry |
18:50
🔗
|
phuzion |
HCross: what distro? |
18:50
🔗
|
HCross |
Debian 8 |
18:51
🔗
|
HCross |
phuzion, works fine on command line, when newsbuddy runs it, it falls over |
18:53
🔗
|
bsmith093 |
HCross anyway, now port 873 tests as open, so go nuts |
18:59
🔗
|
phuzion |
HCross: are all of your python packages installed with apt or do you have some that were installed via easy_install or pip? |
18:59
🔗
|
HCross |
phuzion, pip |
19:00
🔗
|
HCross |
its worked fine for ages, and then suddenly brokwe |
19:00
🔗
|
HCross |
broke |
19:00
🔗
|
phuzion |
pip freeze? |
19:00
🔗
|
HCross |
http://paste.nerds.io/cukamufuwo.mel |
19:02
🔗
|
phuzion |
Honestly, my next suggestion is pip --force-reinstall all those packages |
19:02
🔗
|
phuzion |
( -r requirements.txt obviously) |
19:02
🔗
|
HCross |
all those work fine, its just rsync being a pest |
19:02
🔗
|
phuzion |
well, you reinstalled rsync |
19:03
🔗
|
HCross |
ok, ill try that |
19:03
🔗
|
phuzion |
so, you theoretically know that the rsync binaries aren't corrupted or anything. |
19:03
🔗
|
phuzion |
and the problem doesn't happen when you run the command from a shell (I assume bash?) |
19:03
🔗
|
HCross |
yeah |
19:04
🔗
|
phuzion |
So, it leads me to believe that it's either a problem with your python environment, or a bug in the newsbuddy code. |
19:05
🔗
|
HCross |
There was an update recently |
19:05
🔗
|
phuzion |
Which packages updated? |
19:05
🔗
|
HCross |
I meant a code update |
19:05
🔗
|
phuzion |
Oh. |
19:05
🔗
|
HCross |
Thats probably killed it |
19:06
🔗
|
HCross |
phuzion, thanks for dealing with my noobishness |
19:08
🔗
|
phuzion |
No problem. If you suspect you know which commit caused the problem, find the one before it and do 'git checkout abcd1234' or whatever the first 6-8 chars of the commit ID are, and try re-running the code. |
19:16
🔗
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
19:23
🔗
|
|
Honno has joined #archiveteam-bs |
19:40
🔗
|
|
bwn has quit IRC (Ping timeout: 246 seconds) |
19:43
🔗
|
|
tomwsmf-a has joined #archiveteam-bs |
19:44
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
20:14
🔗
|
|
bwn has joined #archiveteam-bs |
20:42
🔗
|
Honno |
lo all, I wanna do a games torrent in excess of 600GB. how the hell do I get the ball rolling? Need a server to start seeding it at decent rates right? What's a good, cost effective service for that? |
20:43
🔗
|
Honno |
Games torrent as in, a collection of games |
20:44
🔗
|
Smiley |
o_O??? |
20:44
🔗
|
Honno |
yeah uh |
20:44
🔗
|
JW_work |
IA? |
20:44
🔗
|
Smiley |
they all freeware? |
20:44
🔗
|
JW_work |
Internet Archive, I mean? |
20:44
🔗
|
Honno |
yeah all freeware games |
20:44
🔗
|
Smiley |
oooo |
20:44
🔗
|
Smiley |
nice |
20:44
🔗
|
Honno |
From http://www.archiveteam.org/index.php?title=GameMaker_Sandbox |
20:44
🔗
|
JW_work |
I think it doesn't handle torrents that large, though. |
20:45
🔗
|
Honno |
IA does torrent hosting heh? |
20:45
🔗
|
JW_work |
All items on IA are also available as torrents, yes. |
20:45
🔗
|
JW_work |
And pretty much anywhere can act as a webseed |
20:45
🔗
|
JW_work |
That's the point of webseeds |
20:45
🔗
|
Honno |
Anyway, I'm just looking into the warcs and stuff, what I want to do is compile all the game downloads into their own separate folders, with a text file that contains metadata I strip from the archive |
20:45
🔗
|
xmc |
JW_work: your presence would be good in #urlteam if you have a moment |
20:46
🔗
|
Honno |
Interested in how that works JM_work |
20:46
🔗
|
Honno |
JW even, too tired, been downloading games all day lol |
20:48
🔗
|
Honno |
Er there is a concern I haven't really thought about |
20:48
🔗
|
Honno |
We love archives and all right, but we archive in their original form |
20:48
🔗
|
Honno |
What I'd be doing is taking thousands of developers' work, even if free, and compiling them to a bundle |
20:49
🔗
|
Honno |
They're all freeware btw |
20:49
🔗
|
Honno |
The problem is, using the Internet Archive to browse the site is slow as hell, and using these big warcs is pretty hard for newbies |
20:50
🔗
|
Honno |
I also think just some massive torrent of games will gain some traction in communities, which is important because I want people to really get some enjoyment out of this stuff |
20:51
🔗
|
Honno |
First off, ethically how do you guys think about that, and secondly am I breaking some legal stuff |
20:52
🔗
|
Honno |
It's unreasonable to contact all the devs for their consent, I would imagine most if not all wouldn't mind this project however |
20:52
🔗
|
Honno |
I could use an opt-out system, where devs can contact me to remove their game from this listing, but then I can't really use torrents because they would be variable to be redundant |
20:52
🔗
|
Frogging |
oh man the geocities torrent died? Where is the data now then? |
20:53
🔗
|
JW_work |
Frogging: in lots of places — neocities (AFAIK), on IA (I am pretty certain), likely many other places too. |
20:53
🔗
|
JW_work |
Just none of them happen to be seeding the original format of the torrent |
20:54
🔗
|
Frogging |
ah |
20:54
🔗
|
xmc |
https://archive.org/details/2009-archiveteam-geocities-part1 |
20:57
🔗
|
Frogging |
thanks |
21:08
🔗
|
bsmith093 |
HCross it's working now |
21:10
🔗
|
|
RichardG has quit IRC (Ping timeout: 260 seconds) |
21:16
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
21:26
🔗
|
|
RichardG has joined #archiveteam-bs |
21:36
🔗
|
|
xXx_ndidd has joined #archiveteam-bs |
21:43
🔗
|
|
ndiddy has quit IRC (Read error: Operation timed out) |
21:48
🔗
|
|
xXx_ndidd is now known as ndiddy |
22:04
🔗
|
godane |
SketchCow: kpfa is up to 2012-06-30 |
22:07
🔗
|
SketchCow |
Thank you! |
22:07
🔗
|
SketchCow |
There's a redone one |
22:08
🔗
|
SketchCow |
(Geocities torrent) |
22:08
🔗
|
SketchCow |
From Dragan. |
22:09
🔗
|
|
RichardG has quit IRC (Read error: Connection reset by peer) |
22:12
🔗
|
godane |
i'm starting to upload The Jim Rome Show: https://archive.org/details/The_Jim_Rome_Show_Podcast-2005-01-03 |
22:12
🔗
|
|
RichardG has joined #archiveteam-bs |
22:25
🔗
|
|
RichardG has quit IRC (Ping timeout: 244 seconds) |
22:31
🔗
|
|
Honno has quit IRC (Ping timeout: 492 seconds) |
22:33
🔗
|
dashcloud |
Microguru: there are captcha solving plugins & services- check out the list from plowshare: https://github.com/mcrapet/plowshare/blob/master/docs/plowdown.1 |
22:33
🔗
|
|
RichardG has joined #archiveteam-bs |
23:38
🔗
|
|
Start has joined #archiveteam-bs |
23:54
🔗
|
HCross |
phuzion, thanks. We've jumped back to an older version and it seems to be working |
23:56
🔗
|
JW_work |
note to self (or SketchCow) — the last comment on http://ascii.textfiles.com/archives/875 is spam. |
23:58
🔗
|
|
RedType_ has left |