| Time |
Nickname |
Message |
|
00:07
🔗
|
JesseW |
X done: 3.5-> 1.3 |
|
00:12
🔗
|
|
JetBalsa has quit IRC (Read error: Operation timed out) |
|
00:13
🔗
|
|
JetBalsa has joined #archiveteam-bs |
|
00:22
🔗
|
HCross |
Anyone got an rsync target we could use for the livejournal discovery? Only 5GB or so at max I think, mainly because my target is dying |
|
00:35
🔗
|
JesseW |
Y done: 6 -> 2.2 |
|
00:35
🔗
|
bsmith093 |
HCross yo |
|
00:36
🔗
|
bsmith093 |
HCross: i have plenty of space, i'm in |
|
00:37
🔗
|
HCross |
bsmith093, pm your target to arkiver please |
|
00:38
🔗
|
JesseW |
Z done: 0.333 -> 0.121 |
|
00:38
🔗
|
JesseW |
Total (excluding H, N, T and misc) is only 75G -- zip compression works well, apparently. |
|
00:42
🔗
|
JesseW |
Now pushing them up to FOS |
|
00:43
🔗
|
bsmith093 |
JesseW: damn, that much better |
|
00:43
🔗
|
bsmith093 |
JesseW: how do i set myself up as an rsync target? |
|
00:44
🔗
|
JesseW |
well, don't forget that there's about 100GB remaining in those last 3 letters |
|
00:44
🔗
|
JesseW |
bsmith093: IDK -- I haven't done this. Probably ask arkiver or HCross |
|
00:45
🔗
|
HCross |
http://www.archiveteam.org/index.php?title=Dev/Staging everything until Megawarc factory |
|
00:47
🔗
|
bsmith093 |
HCross thanks |
|
00:53
🔗
|
bsmith093 |
HCross arkiver ready |
|
00:54
🔗
|
HCross |
I think hes asleep now, but will get it tomorrow |
|
00:54
🔗
|
bsmith093 |
HCross can you test if it works? |
|
00:55
🔗
|
HCross |
Sure |
|
00:56
🔗
|
HCross |
PM me the info |
|
01:26
🔗
|
|
balrog has quit IRC (Bye) |
|
01:37
🔗
|
|
balrog has joined #archiveteam-bs |
|
01:37
🔗
|
|
swebb sets mode: +o balrog |
|
02:25
🔗
|
JesseW |
OK, now extracting the metadata from the whole grab |
|
02:29
🔗
|
yipdw |
is there anyone here using debian sid, and if so, how unstable is it |
|
02:29
🔗
|
yipdw |
Kubuntu's inability to reliably reboot or restore touchpad settings has finally pissed me off |
|
02:40
🔗
|
|
Microguru has joined #archiveteam-bs |
|
02:47
🔗
|
bsmith093 |
HCross i could just pull the data using rsync, you dont have to push |
|
03:08
🔗
|
|
bwn has quit IRC (Ping timeout: 492 seconds) |
|
03:17
🔗
|
|
ppsym has joined #archiveteam-bs |
|
03:18
🔗
|
|
altlabel has quit IRC (Ping timeout: 258 seconds) |
|
03:21
🔗
|
|
PurpleSym has quit IRC (Ping timeout: 506 seconds) |
|
03:23
🔗
|
|
ppsym is now known as PurpleSym |
|
03:36
🔗
|
JesseW |
Microguru: yeah, #archiveteam uses IRC slightly unusually, I think. |
|
03:37
🔗
|
Microguru |
We kinda need to, considering that this is where most of the coordination happens. |
|
03:46
🔗
|
|
PotcFdk has quit IRC (Remote host closed the connection) |
|
03:49
🔗
|
Microguru |
there's a post on the AT wiki about how to use statistics to estimate the number of pages on a site using repeated clicks of the random button or something like that. I have a site I'm starting to gather data on for archival, and I want to verify my previous estimate. I can't find that page. anyone know what it was again |
|
03:49
🔗
|
|
PotcFdk has joined #archiveteam-bs |
|
03:50
🔗
|
xmc |
huh interesting |
|
03:50
🔗
|
xmc |
sounds like a straightforward application of the math used to solve the German Tank Problem |
|
03:50
🔗
|
Microguru |
thank you. that's what I was thinking of. |
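The estimate xmc mentions can be sketched roughly as follows. This is a minimal sketch, assuming the site numbers its pages sequentially from 1 and that each click of the random button returns a uniformly random page ID; the function name and the sample IDs are hypothetical. Repeated clicks can land on the same page, so duplicates are dropped before applying the usual German Tank estimator m + m/k - 1, where m is the largest ID seen and k is the number of distinct IDs.

```python
def estimate_page_count(sampled_ids):
    """Estimate the highest page ID on a site from a sample of observed IDs.

    Assumes IDs run 1..N with no gaps and were sampled uniformly at random.
    """
    distinct = set(sampled_ids)  # the random button may return repeats
    if not distinct:
        raise ValueError("need at least one sampled page ID")
    k = len(distinct)            # number of distinct pages seen
    m = max(distinct)            # largest page ID seen
    return m + m / k - 1         # frequentist German Tank estimate


# Hypothetical sample: five clicks, one of them a repeat.
sample = [19, 40, 42, 60, 40]
print(round(estimate_page_count(sample)))  # prints 74
```

If the site's IDs have gaps (deleted pages, for instance), the result is better read as an estimate of the highest ID than of the page count.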
|
03:55
🔗
|
JesseW |
well, I'll be uploading stuff to FOS for a while -- I have about 75GB to upload, and I'm getting ~ 0.3MB/s. :-/ |
|
04:01
🔗
|
|
PurpleSym has quit IRC (*) |
|
04:01
🔗
|
|
ppsym has joined #archiveteam-bs |
|
04:01
🔗
|
|
ppsym is now known as PurpleSym |
|
04:11
🔗
|
|
bwn has joined #archiveteam-bs |
|
04:41
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
|
04:44
🔗
|
godane |
SketchCow: i'm up to 2012-05-15 with kpfa |
|
04:44
🔗
|
godane |
i think i went through 2 months in one day |
|
04:45
🔗
|
godane |
and you may get a third before i go to bed |
|
05:07
🔗
|
bsmith093 |
JesseW 75GB/300KBps = 250 000 seconds or about 2.894 days |
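The arithmetic above can be checked with a short sketch. It assumes decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes) and a constant upload rate, neither of which the log states; the function name is hypothetical.

```python
def upload_time_seconds(size_bytes, rate_bytes_per_second):
    """Time to move size_bytes at a constant rate, in seconds."""
    return size_bytes / rate_bytes_per_second


seconds = upload_time_seconds(75e9, 300e3)  # 75 GB at 300 KB/s
days = seconds / 86400                      # 86,400 seconds per day
print(f"{seconds:,.0f} seconds, about {days:.3f} days")
# prints: 250,000 seconds, about 2.894 days
```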
|
05:10
🔗
|
yipdw |
maybe I've just been watching a lot of Star Trek lately, but a lot of the recent chatter sounds Vulcan |
|
05:10
🔗
|
yipdw |
and I don't mean that in a positive way |
|
05:13
🔗
|
Frogging |
wat |
|
05:15
🔗
|
yipdw |
it's probably just the Star Trek, carry on |
|
05:32
🔗
|
|
JesseW has joined #archiveteam-bs |
|
05:54
🔗
|
bsmith093 |
JesseW: how goes the csv script? |
|
05:56
🔗
|
|
Sk1d has quit IRC (Ping timeout: 194 seconds) |
|
05:57
🔗
|
JesseW |
bsmith093: currently crunching through the 187,361 files in Naruto |
|
05:57
🔗
|
JesseW |
seems to be working well |
|
05:57
🔗
|
JesseW |
Fanfiction_B.zip has an ETA of about 3 more hours. |
|
05:58
🔗
|
bsmith093 |
jesus, that's almost athrid as poopular as harry potter with 600K |
|
05:58
🔗
|
bsmith093 |
wow, typos :P |
|
05:59
🔗
|
bsmith093 |
JesseW: so all the zips together , how big? |
|
06:01
🔗
|
JesseW |
75GB |
|
06:01
🔗
|
JesseW |
(and remember, I still don't have the last big 3, which are about ~100GB (uncompressed)) |
|
06:02
🔗
|
|
Sk1d has joined #archiveteam-bs |
|
06:04
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
|
06:04
🔗
|
bsmith093 |
oh, right |
|
06:16
🔗
|
Frogging |
JesseW needs a bouncer :p |
|
06:16
🔗
|
godane |
you guys may be getting a sean hannity collection at some point |
|
06:32
🔗
|
|
bwn has quit IRC (Ping timeout: 492 seconds) |
|
06:51
🔗
|
|
metalcamp has joined #archiveteam-bs |
|
07:31
🔗
|
|
bwn has joined #archiveteam-bs |
|
07:48
🔗
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
|
08:04
🔗
|
|
bsmith093 has quit IRC (Ping timeout: 370 seconds) |
|
08:19
🔗
|
|
bsmith093 has joined #archiveteam-bs |
|
08:30
🔗
|
godane |
SketchCow: looks like i found more Premiere Interactive radio shows |
|
08:31
🔗
|
godane |
one called Jim Rome and it goes back to 2005 |
|
08:31
🔗
|
godane |
and is open |
|
08:40
🔗
|
godane |
so Jim Rome Show is a sports show |
|
08:40
🔗
|
|
DFJustin has quit IRC (Read error: Connection reset by peer) |
|
08:40
🔗
|
|
DFJustin has joined #archiveteam-bs |
|
08:40
🔗
|
|
swebb sets mode: +o DFJustin |
|
08:40
🔗
|
godane |
plus side, we will get an interview with Armstrong on jan 3 2005 hour 3 |
|
08:54
🔗
|
|
superkuh has joined #archiveteam-bs |
|
09:06
🔗
|
|
lytv has quit IRC (Ping timeout: 244 seconds) |
|
09:07
🔗
|
|
JetBalsa has quit IRC (Read error: Operation timed out) |
|
09:07
🔗
|
|
lytv has joined #archiveteam-bs |
|
09:07
🔗
|
|
JetBalsa has joined #archiveteam-bs |
|
09:11
🔗
|
|
schbirid has joined #archiveteam-bs |
|
09:32
🔗
|
|
RichardG has joined #archiveteam-bs |
|
09:56
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
10:18
🔗
|
|
brayden has quit IRC (Quit: Leaving) |
|
10:19
🔗
|
|
brayden has joined #archiveteam-bs |
|
10:19
🔗
|
|
swebb sets mode: +o brayden |
|
11:37
🔗
|
|
dan- has quit IRC (Quit: Nyan nyan) |
|
12:30
🔗
|
|
HCross2 has quit IRC () |
|
12:34
🔗
|
|
dan- has joined #archiveteam-bs |
|
12:53
🔗
|
|
metalcamp has joined #archiveteam-bs |
|
13:15
🔗
|
|
HCross2 has joined #archiveteam-bs |
|
13:20
🔗
|
HCross |
bsmith093, ping |
|
14:02
🔗
|
|
pgoetz has quit IRC (Remote host closed the connection) |
|
14:19
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
|
15:34
🔗
|
|
Start has joined #archiveteam-bs |
|
15:55
🔗
|
|
RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) |
|
15:59
🔗
|
|
RichardG has joined #archiveteam-bs |
|
16:07
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
|
16:13
🔗
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
|
16:28
🔗
|
|
JesseW has joined #archiveteam-bs |
|
16:40
🔗
|
|
metalcamp has joined #archiveteam-bs |
|
16:49
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
|
17:39
🔗
|
bsmith093 |
HCross pong |
|
17:41
🔗
|
HCross |
Anyone good at dealing with really strange errors? http://paste.nerds.io/ucimumopul.avrasm |
|
17:43
🔗
|
|
Start has joined #archiveteam-bs |
|
17:44
🔗
|
phuzion |
HCross: What's the context of the error? rsyncing to a newsbuddy worker? |
|
17:44
🔗
|
HCross |
yeah |
|
17:45
🔗
|
|
metalcamp has quit IRC (Ping timeout: 250 seconds) |
|
17:47
🔗
|
phuzion |
Is it reproducible? |
|
17:47
🔗
|
HCross |
Yes, its happening every time it rsyncs to a worker |
|
17:47
🔗
|
phuzion |
A specific worker? |
|
17:47
🔗
|
HCross |
all of them |
|
17:48
🔗
|
phuzion |
Can you reproduce the error with a manual command? |
|
17:50
🔗
|
phuzion |
Also, perhaps try reinstalling the rsync package through your package manager. |
|
17:52
🔗
|
|
signius_ has quit IRC (Read error: Operation timed out) |
|
18:02
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
|
18:06
🔗
|
|
signius_ has joined #archiveteam-bs |
|
18:07
🔗
|
HCross |
phuzion, works with other rsyncs manually |
|
18:10
🔗
|
|
metalcamp has joined #archiveteam-bs |
|
18:10
🔗
|
|
JW_work has joined #archiveteam-bs |
|
18:13
🔗
|
phuzion |
HCross: Can you figure out the EXACT command that is being issued by the newsbuddy script and replicate that on the command line? |
|
18:15
🔗
|
arkiver |
"rsync -avz --no-o --no-g --progress --remove-source-files list-videos_temp" + str(i) + " " + rsync_targets[i] |
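One way to get the exact command for manual testing is to build it as an argument list and print it before running it. This is a sketch only: the log shows just the concatenated string, not the surrounding newsbuddy code, so the function name and the target URL below are hypothetical. The rsync flags are the ones arkiver pasted.

```python
import shlex


def build_rsync_command(i, rsync_targets):
    """Return the rsync invocation for worker i as an argument list."""
    return [
        "rsync", "-avz", "--no-o", "--no-g", "--progress",
        "--remove-source-files",
        "list-videos_temp" + str(i),   # local source, as in the pasted line
        rsync_targets[i],              # destination for this worker
    ]


# Hypothetical target; the real list lives in the newsbuddy configuration.
targets = ["rsync://worker.example/newsbuddy/"]
cmd = build_rsync_command(0, targets)

# Print a copy-pasteable form so the same command can be retried in a shell.
print(shlex.join(cmd))
# To execute it from Python, subprocess.run(cmd, check=True) would take the
# same list without going through a shell.
```

Passing a list instead of a single string avoids quoting differences between the script and an interactive shell, which is one possible reason a command behaves differently in the two places.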
|
18:15
🔗
|
HCross |
when sourceforge responds |
|
18:15
🔗
|
HCross |
will test, ty |
|
18:16
🔗
|
HCross |
worked fine on the command line |
|
18:17
🔗
|
phuzion |
Strange. |
|
18:17
🔗
|
arkiver |
well, try reinstalling |
|
18:17
🔗
|
arkiver |
rsync ^ |
|
18:17
🔗
|
HCross |
ok, can you pause the livejournal stuff please arkiver |
|
18:18
🔗
|
|
Start has joined #archiveteam-bs |
|
18:18
🔗
|
arkiver |
paused. what's the problem with it? |
|
18:18
🔗
|
HCross |
its going to the same server, thats having the rsync issues |
|
18:21
🔗
|
HCross |
reinstalled, now we wait |
|
18:26
🔗
|
HCross |
arkiver, ready for an unpause |
|
18:27
🔗
|
arkiver |
right |
|
18:27
🔗
|
arkiver |
do !start |
|
18:28
🔗
|
HCross |
on livejournal? |
|
18:28
🔗
|
arkiver |
oh sorry |
|
18:29
🔗
|
arkiver |
to unpause newsbuddy do !start |
|
18:29
🔗
|
arkiver |
I'll restart livejournal |
|
18:29
🔗
|
HCross |
I didnt pause it |
|
18:29
🔗
|
arkiver |
restarted! |
|
18:29
🔗
|
HCross |
I timed it so it wasnt rsyncing out |
|
18:29
🔗
|
arkiver |
Right |
|
18:30
🔗
|
bsmith093 |
HCross so am I still needed? |
|
18:34
🔗
|
HCross |
I should ask - any rsync experts around to give bsmith093 a hand? Trying to get an rsync target setup and his network is being strange |
|
18:35
🔗
|
JW_work |
Could someone add https://github.com/matteobrusa/TumblrToStaticExporter to the wiki page for Tumblr? |
|
18:38
🔗
|
bsmith093 |
i know port forwarding works, i have another one open just fine. |
|
18:43
🔗
|
bsmith093 |
arkiver: help with my rsync settings please, i cant seem to get the port to open on my linksys ea6500 router |
|
18:44
🔗
|
HCross |
phuzion, its still doing it http://paste.nerds.io/owomiyunub.erl |
|
18:48
🔗
|
phuzion |
HCross: rsync --version |
|
18:49
🔗
|
HCross |
rsync version 3.1.1 protocol version 31 |
|
18:49
🔗
|
bsmith093 |
phuzion: rsync version 3.1.1 protocol version 31 |
|
18:49
🔗
|
bsmith093 |
ditto for me |
|
18:49
🔗
|
phuzion |
bsmith093: Are you experiencing the same error that HCross is? |
|
18:50
🔗
|
bsmith093 |
phuzion: he's the other end of the connection i'm trying to make |
|
18:50
🔗
|
HCross |
phuzion, hes having NAT issues with getting files in, he isnt having the issue |
|
18:50
🔗
|
HCross |
bsmith093, im talking about something different |
|
18:50
🔗
|
phuzion |
Oh. Let's troubleshoot one thing at a time. |
|
18:50
🔗
|
HCross |
yeah, good plan |
|
18:50
🔗
|
bsmith093 |
HCross sorry |
|
18:50
🔗
|
phuzion |
HCross: what distro? |
|
18:50
🔗
|
HCross |
Debian 8 |
|
18:51
🔗
|
HCross |
phuzion, works fine on command line, when newsbuddy runs it, it falls over |
|
18:53
🔗
|
bsmith093 |
HCross anyway, now port 873 tests as open, so go nuts |
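A port test like the one mentioned above can be approximated from the sending side with a plain TCP connect. This is a minimal sketch, not the tool bsmith093 used; the function name is hypothetical, and 873 is the default rsync daemon port. A successful connect only shows that something is listening, not that the rsync module is configured correctly.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or DNS failure
        return False


# Self-contained demonstration against a throwaway local listener.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
listener.listen(1)
local_port = listener.getsockname()[1]
print(port_is_open("127.0.0.1", local_port))  # prints True
listener.close()
```

Against a real target the call would look like `port_is_open("target.example", 873)`, with the hostname replaced by the actual rsync target.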
|
18:59
🔗
|
phuzion |
HCross: are all of your python packages installed with apt or do you have some that were installed via easy_install or pip? |
|
18:59
🔗
|
HCross |
phuzion, pip |
|
19:00
🔗
|
HCross |
its worked fine for ages, and then suddenly brokwe |
|
19:00
🔗
|
HCross |
broke |
|
19:00
🔗
|
phuzion |
pip freeze? |
|
19:00
🔗
|
HCross |
http://paste.nerds.io/cukamufuwo.mel |
|
19:02
🔗
|
phuzion |
Honestly, my next suggestion is pip install --force-reinstall for all those packages |
|
19:02
🔗
|
phuzion |
( -r requirements.txt obviously) |
|
19:02
🔗
|
HCross |
all those work fine, its just rsync being a pest |
|
19:02
🔗
|
phuzion |
well, you reinstalled rsync |
|
19:03
🔗
|
HCross |
ok, ill try that |
|
19:03
🔗
|
phuzion |
so, you theoretically know that the rsync binaries aren't corrupted or anything. |
|
19:03
🔗
|
phuzion |
and the problem doesn't happen when you run the command from a shell (I assume bash?) |
|
19:03
🔗
|
HCross |
yeah |
|
19:04
🔗
|
phuzion |
So, it leads me to believe that it's either a problem with your python environment, or a bug in the newsbuddy code. |
|
19:05
🔗
|
HCross |
There was an update recently |
|
19:05
🔗
|
phuzion |
Which packages updated? |
|
19:05
🔗
|
HCross |
I meant a code update |
|
19:05
🔗
|
phuzion |
Oh. |
|
19:05
🔗
|
HCross |
Thats probably killed it |
|
19:06
🔗
|
HCross |
phuzion, thanks for dealing with my noobishness |
|
19:08
🔗
|
phuzion |
No problem. If you suspect you know which commit caused the problem, find the one before it and do 'git checkout abcd1234' or whatever the first 6-8 chars of the commit ID are, and try re-running the code. |
|
19:16
🔗
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
|
19:23
🔗
|
|
Honno has joined #archiveteam-bs |
|
19:40
🔗
|
|
bwn has quit IRC (Ping timeout: 246 seconds) |
|
19:43
🔗
|
|
tomwsmf-a has joined #archiveteam-bs |
|
19:44
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
|
20:14
🔗
|
|
bwn has joined #archiveteam-bs |
|
20:42
🔗
|
Honno |
lo all, I wanna do a games torrent in the excess of 600GB. how the hell do I get the ball rolling? Need a server to start seeding it at decent rates right? What's a good, cost effective service for that? |
|
20:43
🔗
|
Honno |
Games torrent as in, a collection of games |
|
20:44
🔗
|
Smiley |
o_O??? |
|
20:44
🔗
|
Honno |
yeah uh |
|
20:44
🔗
|
JW_work |
IA? |
|
20:44
🔗
|
Smiley |
they all freeware? |
|
20:44
🔗
|
JW_work |
Internet Archive, I mean? |
|
20:44
🔗
|
Honno |
yeah all freeware games |
|
20:44
🔗
|
Smiley |
oooo |
|
20:44
🔗
|
Smiley |
nice |
|
20:44
🔗
|
Honno |
From http://www.archiveteam.org/index.php?title=GameMaker_Sandbox |
|
20:44
🔗
|
JW_work |
I think it doesn't handle torrents that large, though. |
|
20:45
🔗
|
Honno |
IA does torrent hosting heh? |
|
20:45
🔗
|
JW_work |
All items on IA are also available as torrents, yes. |
|
20:45
🔗
|
JW_work |
And pretty much anywhere can act as a webseed |
|
20:45
🔗
|
JW_work |
That's the point of webseeds |
|
20:45
🔗
|
Honno |
Anyway, I'm just looking into the warcs and stuff, what I want to do is compile all the game downloads into their own separate folders, with a text file that contains metadata I strip from the archive |
|
20:45
🔗
|
xmc |
JW_work: your presence would be good in #urlteam if you have a moment |
|
20:46
🔗
|
Honno |
Interested in how that works JM_work |
|
20:46
🔗
|
Honno |
JW even, too tired, been downloading games all day lol |
|
20:48
🔗
|
Honno |
Er there is a concern I haven't really thought about |
|
20:48
🔗
|
Honno |
We love archives and all right, but we archive in their original form |
|
20:48
🔗
|
Honno |
What I'd be doing is taking thousands of developers work, even if free, and compiling them to a bundle |
|
20:49
🔗
|
Honno |
They're all freeware btw |
|
20:49
🔗
|
Honno |
The problem is, using the Internet Archive to browse the site is slow as hell, and using these big warcs is pretty hard for newbies |
|
20:50
🔗
|
Honno |
I also think just some massive torrent of games will gain some traction in communities, which is important because I want people to really get some enjoyment out of this stuff |
|
20:51
🔗
|
Honno |
First off, ethically how do you guys think about that, and secondly am I breaking some legal stuff |
|
20:52
🔗
|
Honno |
It's unreasonable to contact all the devs for their consent, I would imagine most if not all wouldn't mind this project however |
|
20:52
🔗
|
Honno |
I could use an opt-out system, where devs can contact me to remove their game from this listing, but then I can't really use torrents because they would be variable to be redundant |
|
20:52
🔗
|
Frogging |
oh man the geocities torrent died? Where is the data now then? |
|
20:53
🔗
|
JW_work |
Frogging: in lots of places — neocities (AFAIK), on IA (I am pretty certain), likely many other places too. |
|
20:53
🔗
|
JW_work |
Just none of them happen to be seeding the original format of the torrent |
|
20:54
🔗
|
Frogging |
ah |
|
20:54
🔗
|
xmc |
https://archive.org/details/2009-archiveteam-geocities-part1 |
|
20:57
🔗
|
Frogging |
thanks |
|
21:08
🔗
|
bsmith093 |
HCross it's working now |
|
21:10
🔗
|
|
RichardG has quit IRC (Ping timeout: 260 seconds) |
|
21:16
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
|
21:26
🔗
|
|
RichardG has joined #archiveteam-bs |
|
21:36
🔗
|
|
xXx_ndidd has joined #archiveteam-bs |
|
21:43
🔗
|
|
ndiddy has quit IRC (Read error: Operation timed out) |
|
21:48
🔗
|
|
xXx_ndidd is now known as ndiddy |
|
22:04
🔗
|
godane |
SketchCow: kpfa is up to 2012-06-30 |
|
22:07
🔗
|
SketchCow |
Thank you! |
|
22:07
🔗
|
SketchCow |
There's a redone one |
|
22:08
🔗
|
SketchCow |
(Geocities torrent) |
|
22:08
🔗
|
SketchCow |
From Dragan. |
|
22:09
🔗
|
|
RichardG has quit IRC (Read error: Connection reset by peer) |
|
22:12
🔗
|
godane |
i'm starting to upload The Jim Rome Show: https://archive.org/details/The_Jim_Rome_Show_Podcast-2005-01-03 |
|
22:12
🔗
|
|
RichardG has joined #archiveteam-bs |
|
22:25
🔗
|
|
RichardG has quit IRC (Ping timeout: 244 seconds) |
|
22:31
🔗
|
|
Honno has quit IRC (Ping timeout: 492 seconds) |
|
22:33
🔗
|
dashcloud |
Microguru: there are captcha solving plugins & services- check out the list from plowshare: https://github.com/mcrapet/plowshare/blob/master/docs/plowdown.1 |
|
22:33
🔗
|
|
RichardG has joined #archiveteam-bs |
|
23:38
🔗
|
|
Start has joined #archiveteam-bs |
|
23:54
🔗
|
HCross |
phuzion, thanks. We've jumped back to an older version and it seems to be working |
|
23:56
🔗
|
JW_work |
note to self (or SketchCow) — the last comment on http://ascii.textfiles.com/archives/875 is spam. |
|
23:58
🔗
|
|
RedType_ has left |