Time |
Nickname |
Message |
01:47
🔗
|
PatC |
Hey i'm not sure if you guys heard |
01:47
🔗
|
PatC |
"As of January 31, 2012, all waves will be read-only" |
01:47
🔗
|
PatC |
"and the Wave service will be turned off on April 30, 2012" |
01:47
🔗
|
PatC |
via google wave email |
02:04
🔗
|
NotGLaDOS |
We have. |
02:04
🔗
|
NotGLaDOS |
Coderjoe: whoops. |
02:18
🔗
|
bsmith095 |
archiving google wave now?, can i get in on that? |
02:18
🔗
|
PatC |
same |
03:29
🔗
|
Wyatt|Wor |
Looks like splinder is going to be cutting it close. How long do we got? |
03:49
🔗
|
bsmith095 |
id like to know as well if were going to make it |
03:50
🔗
|
yipdw^ |
for Splinder? |
03:50
🔗
|
PatC |
What is |
03:50
🔗
|
PatC |
Splinder? |
03:50
🔗
|
yipdw^ |
http://www.archiveteam.org/index.php?title=Splinder |
03:51
🔗
|
PatC |
How big are the downloads? |
03:52
🔗
|
yipdw^ |
highly variable |
03:52
🔗
|
yipdw^ |
we've gotten most of the 1-10 MB ones already |
03:52
🔗
|
PatC |
ok, i'll give it a shot |
03:52
🔗
|
yipdw^ |
the accounts coming in now are much larger |
03:52
🔗
|
yipdw^ |
we've got until January 31, 2012, though, so don't go crazy |
03:52
🔗
|
SketchCow |
Poor waves |
03:53
🔗
|
yipdw^ |
the last time we had a huge flood of people hitting Splinder, their infrastructure couldn't handle it |
03:53
🔗
|
PatC |
SketchCow, agreed |
03:53
🔗
|
yipdw^ |
and so we ended up with a lot of incompletes |
03:53
🔗
|
SketchCow |
I think Splinder is well at hand. |
03:53
🔗
|
yipdw^ |
yeah, it's good |
03:53
🔗
|
PatC |
yipdw^, ok i'll wait a little |
03:53
🔗
|
SketchCow |
We have plenty of people on it, enough to get it |
03:53
🔗
|
PatC |
ok |
03:55
🔗
|
yipdw^ |
was it actually possible to make a public wave? |
03:55
🔗
|
yipdw^ |
the waves I've got are all restricted to a small group |
03:56
🔗
|
Wyatt|Wor |
Oh, wait, really? |
03:56
🔗
|
PatC |
That was my concern :/ |
03:57
🔗
|
Wyatt|Wor |
I thought Splinder was supposed to go down like...today or something. |
03:57
🔗
|
yipdw^ |
the official announcement is January 31 |
03:59
🔗
|
PatC |
I didn't notice Splinder on the projects wiki page |
03:59
🔗
|
yipdw^ |
oh, speaking of which, I wonder how bad my AWS bill now is |
04:00
🔗
|
yipdw^ |
ooh. |
04:00
🔗
|
yipdw^ |
well it's not as bad as the Friendster days |
04:02
🔗
|
yipdw^ |
actually, I say that most of the small Splinder accounts have been grabbed, but according to http://splinder.heroku.com/ kennethre is still pulling a bunch of them |
04:02
🔗
|
yipdw^ |
weird |
04:02
🔗
|
yipdw^ |
maybe they're all requeues |
04:04
🔗
|
Wyatt|Wor |
Yeah, a couple more people showed up and owned all over my measly 29GB ;) He's really kicking ass. |
04:04
🔗
|
yipdw^ |
he's controlling something like 360 dynos at Heroky |
04:04
🔗
|
yipdw^ |
er, Heroku |
04:04
🔗
|
yipdw^ |
fuck LOIC, that man can nuke sites from orbit single-handedly |
04:05
🔗
|
Wyatt|Wor |
Wow. |
04:05
🔗
|
Wyatt|Wor |
I think I'm up to...150 threads or so? |
04:06
🔗
|
Wyatt|Wor |
(Had to google for what a dyno was, heh) |
04:06
🔗
|
yipdw^ |
oh heh |
04:06
🔗
|
yipdw^ |
he might be doing more, it was all in the #splinder logs at one point |
04:07
🔗
|
yipdw^ |
er, is in the logs |
04:13
🔗
|
SketchCow |
Splinder shifted the date. |
04:13
🔗
|
SketchCow |
That's what's missing. |
04:18
🔗
|
Paradoks |
PatC: I took Splinder out of the "Projects with BASH scripts that need more people running them" when it hit 0 left. Even now, I think it's pretty well covered. Not that it shouldn't be elsewhere on the projects page. |
04:21
🔗
|
PatC |
Ok |
05:08
🔗
|
Wyatt|Wor |
Is there anything out there that is likely to have old DNS info cached or otherwise accessibly saved? |
05:21
🔗
|
Coderjoe |
what info and how old? |
05:25
🔗
|
Wyatt|Wor |
Oh, I'm trying to verify the IP for a site as of...about a month ago? Some reseller on our super legacy Sphera platform went and canceled or something and left one of his clients high and dry. |
05:27
🔗
|
Coderjoe |
i'm not aware of anything that would have data that old |
05:27
🔗
|
Wyatt|Wor |
I'm trying to figure out if we've even got a backup, but none of the usernames she's mentioned exist. And compared to how things are on some servers, we have amazingly good backups for this one. I've checked back to June. |
05:28
🔗
|
Wyatt|Wor |
whois.sc claims to have some historic IP data, but that's $30/month for their stuff. |
05:36
🔗
|
bsmith095 |
one month is old for dns |
05:36
🔗
|
bsmith095 |
wow |
05:36
🔗
|
underscor |
So, you know what would be cool |
05:36
🔗
|
underscor |
Craigslist archive |
05:37
🔗
|
DFJustin |
Wyatt|Wor: have you tried netcraft |
05:37
🔗
|
Paradoks |
Isn't Craigslist actively hostile to pretty much anything that uses Craigslist? |
05:37
🔗
|
Paradoks |
Not that that should stop us, but it would make it difficult, I think. |
05:38
🔗
|
Wyatt|Wor |
DFJustin: Not yet; taking a look now |
05:38
🔗
|
underscor |
Paradoks: WE FUCKIN' ARCHIVED POETRY.COM, GET THAT BLASPHEMY OUTTA HERE |
05:38
🔗
|
underscor |
Just kidding |
05:38
🔗
|
underscor |
But still, I bet we could do it |
05:39
🔗
|
underscor |
I mean, it's not like we're posting |
05:39
🔗
|
dnova |
Paradoks: yes |
05:39
🔗
|
Wyatt|Wor |
That's an interesting moving target. Don't they delete older things? |
05:39
🔗
|
dnova |
they are hostile to anything that uses craigslist |
05:39
🔗
|
dnova |
they go after people who make abstraction layers that make craigslist more useful and I just don't get that |
05:39
🔗
|
Wyatt|Wor |
I tried to put my desk on craigslist. It was surprisingly difficult. |
05:39
🔗
|
bsmith095 |
can i get a copy of poetry.com? is that available yet? |
05:39
🔗
|
dnova |
they don't insert ads or anything |
05:39
🔗
|
Paradoks |
We archived a small portion of poetry.com, which was totally worth it. And it was totally cool how irritated we made the bastards by attempting to save the information they wanted to destroy. |
05:40
🔗
|
dnova |
they wanted to /sell/ |
05:40
🔗
|
dnova |
the only worth that company had was their user content |
05:40
🔗
|
bsmith095 |
is that archive public? |
05:40
🔗
|
DFJustin |
http://www.archive.org/details/archiveteam-poetrydotcom |
05:41
🔗
|
DFJustin |
this isn't everything we got though |
05:41
🔗
|
bsmith095 |
which was already publicly available anyway, so how is it worth anything to sell it? |
05:41
🔗
|
dnova |
without the content, what are they? |
05:42
🔗
|
bsmith095 |
DFJustin: so where's the rest.. just curious |
05:42
🔗
|
dnova |
a poetry website with no users and nothing to read |
05:42
🔗
|
dnova |
i.e. worthless |
05:42
🔗
|
underscor |
Paradoks: Go ahead and rejoin #Magicallydelicious |
05:42
🔗
|
Paradoks |
bsmith: Theoretically they controlled the publicly available content. |
05:42
🔗
|
DFJustin |
pending sketchcow organization |
05:42
🔗
|
underscor |
(if you want) |
05:42
🔗
|
db48x |
I'm nearly done organizing that |
05:42
🔗
|
DFJustin |
there's some value in an english word domain name I'm sure |
05:42
🔗
|
Paradoks |
Heheh. I'm mostly just checking to see if you're still there. I do wonder if delicious will ever get suitably archived, though. |
05:43
🔗
|
underscor |
Coderjoe: Spread some ops around? we're getting low |
05:43
🔗
|
bsmith095 |
ok but theoretically aol owns happy birthday, doesnt mean they can meaningfully enforce that |
05:43
🔗
|
dnova |
they CAN |
05:43
🔗
|
dnova |
nobody in tv or movies can sing it without mpaying, for example |
05:43
🔗
|
dnova |
-m |
05:45
🔗
|
Paradoks |
I like to sing, "Good morning to you" and let people infer what's needed. |
05:46
🔗
|
Paradoks |
bsmith: re: delicious -- It seemed quite likely, once upon a time, that Yahoo was going to shut down delicious and the various user-created lists would be lost. Where user-created content is threatened, there Archive Team will go. |
05:49
🔗
|
dnova |
who here is responsible for the heroku deal |
05:50
🔗
|
dnova |
I don't really understand what heroku is but it looks expensive! |
05:51
🔗
|
underscor |
kennethre is |
05:51
🔗
|
underscor |
But it's free because he's using the "one free dyno" thing |
05:51
🔗
|
dnova |
awesome |
05:51
🔗
|
Paradoks |
dnova: And Alard set up the splinder/anyhub/mobileme.heroku.com tracking things. |
05:52
🔗
|
dnova |
alard is beyond amazing |
05:53
🔗
|
Paradoks |
Agreed. |
05:53
🔗
|
bsmith095 |
im running dld-streamer 40, am i making any kind of a dent at all? |
05:53
🔗
|
Coderjoe |
underscor: spread ops where? |
05:54
🔗
|
dnova |
bsmith095: you can run 200-500 threads on even pretty modest hardware if you are ok with using lots of cpu |
05:54
🔗
|
Wyatt|Wor |
Oh yeah, anyhub. :( looks like we only got about half of it? |
05:55
🔗
|
Paradoks |
bsmith095: You're still getting noticeable quantities of data, though. It's still useful, even if you don't make the top 10 downloaders. |
05:56
🔗
|
dnova |
I wish the heroku thing had stats for all participants |
06:00
🔗
|
dnova |
but yeah, bsmith, another thing is the deadline was extended bigtime so it's not a serious crunch for us now so don't worry too much about going nuts with the # of threads |
06:01
🔗
|
bsmith095 |
just restarted streamer up to 200 and wow this laptop is running great! |
06:01
🔗
|
dnova |
ahhh excellent :D |
06:01
🔗
|
dnova |
I wonder how many kenneth is running!! |
06:01
🔗
|
Wyatt|Wor |
Huh, was there a heroku for mobileme? |
06:02
🔗
|
Wyatt|Wor |
mobileme.h.c isn't working. |
06:02
🔗
|
dnova |
sounds like the person doing it can only have 1 at a time |
06:03
🔗
|
db48x |
Wyatt|Wor: memac.heroku.com |
06:03
🔗
|
Wyatt|Wor |
Ah |
06:04
🔗
|
Wyatt|Wor |
How's their bandwidth? I'll have to get in on that once I consolidate all my Splinder. |
06:05
🔗
|
db48x |
Wyatt|Wor: how is whose bandwidth? |
06:06
🔗
|
dnova |
alard is going to have to dump more claimed but unreturned users into the project again, right? |
06:06
🔗
|
dnova |
probably a few times? |
06:06
🔗
|
Wyatt|Wor |
I guess mobileme? |
06:06
🔗
|
db48x |
I don't think we have seen any indication that mobileme was failing under the load |
06:06
🔗
|
db48x |
but then we hadn't really put out the word yet when splinder came down the line |
06:07
🔗
|
dnova |
yeah this is the first I'm hearing anything about mobileme |
06:07
🔗
|
db48x |
I've got 130 megs of compressed poetry left to sort, out of 1.3 gigs |
06:08
🔗
|
Wyatt|Wor |
Wiki says 200TB by June something...that's a hell of a pull. |
06:08
🔗
|
dnova |
fuck. |
06:08
🔗
|
Wyatt|Wor |
I'm gonna need more space on my VPS. |
06:09
🔗
|
db48x |
heh, yea |
06:09
🔗
|
db48x |
I'm running very low myself |
06:09
🔗
|
db48x |
only 225 gigs left |
06:10
🔗
|
bsmith095 |
even with today's obscenely huge storage capacity, thats still maybe 5 or 6 cubic feet of *really* huge drives |
06:10
🔗
|
dnova |
that's only 50 hard drives |
06:10
🔗
|
dnova |
I don't know if 50 hard drives is 5 cubic feet |
06:10
🔗
|
bsmith095 |
wtf?!?! 50 where do you shop? |
06:11
🔗
|
dnova |
4tb hard drives are around |
06:11
🔗
|
db48x |
hmm |
06:11
🔗
|
db48x |
I've seen 4tb externals, but those were two drives in an enclosure |
06:11
🔗
|
dnova |
there are real actual 4tb hard drives |
06:12
🔗
|
Wyatt|Wor |
Filesystem overhead. Probably closer to 60 if you use exclusively 4TB disks. |
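The drive-count estimate in this exchange (200 TB on 4 TB disks, then "closer to 60" once overhead is counted) can be sketched as below. This is only an illustration: the `usable_fraction` parameter and the 0.85 value are assumptions made for the example, not figures from the channel.

```python
import math


def drives_needed(total_tb: float, drive_tb: float, usable_fraction: float = 1.0) -> int:
    """Return how many drives of size drive_tb are needed to hold total_tb.

    usable_fraction is a hypothetical knob for the share of each drive
    left after filesystem overhead (1.0 means no overhead at all).
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return math.ceil(total_tb / (drive_tb * usable_fraction))


if __name__ == "__main__":
    print(drives_needed(200, 4))        # raw capacity only
    print(drives_needed(200, 4, 0.85))  # assumed 15% lost to overhead
```

With no overhead the count matches the "only 50 hard drives" figure; any usable fraction below 1.0 pushes it upward, which is the point being made about filesystem overhead.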
06:12
🔗
|
dnova |
pff ok :P |
06:12
🔗
|
bsmith095 |
*really*, ive heard of 2tb but never actually seen one, but i bought a 1500gb one for $65 on ebay, then it fell off a table, boom, click of death, bought a dollar per mb, to get it back?! |
06:12
🔗
|
Wyatt|Wor |
That and things usually don't fit neatly. |
06:12
🔗
|
bsmith095 |
pardon me, dollar per gig, but still, damn clean rooms |
06:13
🔗
|
dnova |
3tb also exists and has for a while |
06:13
🔗
|
DFJustin |
just got my 2tb luckily before the floods hit |
06:13
🔗
|
Wyatt|Wor |
I have a 2GB on my desk at home. It's at least half for archiveteam stuff |
06:13
🔗
|
bsmith095 |
how do we keep doing this? i mean the tech specs of how they cram that many little magnetic flux thingies on a spinning platter? |
06:13
🔗
|
DFJustin |
now I need to fill it with mobileme |
06:14
🔗
|
bsmith095 |
Wyatt|Wor: u mean tb right? |
06:14
🔗
|
Wyatt|Wor |
Err, I have a 2TB. I haven't had a 2GB in forever. |
06:15
🔗
|
db48x |
hrm, there's a rar file in here |
06:15
🔗
|
bsmith095 |
where and what collection |
06:15
🔗
|
db48x |
in the poetry collection |
06:15
🔗
|
db48x |
I'm unifying all of the archives produced by individual downloaders into a single collection |
06:15
🔗
|
bsmith095 |
really i just downloaded that |
06:15
🔗
|
db48x |
the next one is a rar file |
06:16
🔗
|
bsmith095 |
db48x: what do u mean unifying, they're only 4 chunks |
06:16
🔗
|
bsmith095 |
3 chunks |
06:16
🔗
|
DFJustin |
that's probably my piece |
06:17
🔗
|
db48x |
bsmith095: there are lots of others that haven't yet been uploaded to archive.org |
06:17
🔗
|
bsmith095 |
u mean the poetry.xom archive |
06:17
🔗
|
bsmith095 |
-x +c |
06:18
🔗
|
db48x |
yes |
06:20
🔗
|
underscor |
http://washingtondc.craigslist.org/nva/sad/2714010065.html |
06:20
🔗
|
underscor |
I really want to apply for that :V |
06:22
🔗
|
dnova |
apply |
06:23
🔗
|
underscor |
I have school haha |
06:24
🔗
|
underscor |
Also, I don't have 7 years of "professional" linux experience |
06:24
🔗
|
underscor |
I mean, I've used it for that long, but |
06:24
🔗
|
dnova |
fuck that |
06:24
🔗
|
underscor |
haha |
06:24
🔗
|
dnova |
you'd be the best person they interviewed if you were going to do it |
06:34
🔗
|
Wyatt|Wor |
Wow, They're in VA? |
06:34
🔗
|
Wyatt|Wor |
Interesting. |
06:54
🔗
|
db48x |
doh |
06:56
🔗
|
db48x |
my computer locked up :( |
06:56
🔗
|
db48x |
I can't even ssh in |
06:56
🔗
|
db48x |
on the other hand, it is still routing traffic |
07:03
🔗
|
Wyatt|Wor |
An interesting issue. |
07:16
🔗
|
db48x |
I guess I'm going to have to reboot it |
07:17
🔗
|
db48x |
highly annoying |
07:27
🔗
|
db48x |
although it does still have disk activity |
07:28
🔗
|
Wyatt|Wor |
Oh my, now THIS is quite cool http://www.unseen64.net/ |
07:33
🔗
|
db48x |
Wyatt|Wor: cool |
07:51
🔗
|
ersi |
underscor: the "experience" part is always bullshit |
08:33
🔗
|
Nemo_bis |
a genius switched off power in my house, I have over 30 GiB of incomplete users which I won't be able to complete |
08:33
🔗
|
Nemo_bis |
what should I do? |
08:34
🔗
|
chronomex |
hibernate your computer |
08:36
🔗
|
Nemo_bis |
uh? |
08:36
🔗
|
Nemo_bis |
It was switched off and now rebooted |
08:36
🔗
|
chronomex |
oh |
08:36
🔗
|
chronomex |
hmmm. |
08:37
🔗
|
ersi |
Yikes |
08:37
🔗
|
chronomex |
I thought you were on UPS or something |
08:38
🔗
|
* |
Nemo_bis hates power company that switches off your power when you go over 3 kW without any warning :-X |
08:38
🔗
|
chronomex |
that's fucked, man |
08:38
🔗
|
Nemo_bis |
no, that would be supposed to be unneeded |
08:38
🔗
|
chronomex |
fucked. |
08:38
🔗
|
ersi |
lol, what?! |
08:38
🔗
|
chronomex |
3000W it kicks a breaker? |
08:38
🔗
|
ersi |
So they.. kill your power.. because you're using it? Or have unpaid bills? |
08:38
🔗
|
chronomex |
that's two heaters! |
08:41
🔗
|
Nemo_bis |
no, just because you go over your power quota |
08:41
🔗
|
Nemo_bis |
4.5 kW costs more |
08:41
🔗
|
chronomex |
where is this? I assume you live on an island or something |
08:42
🔗
|
chronomex |
or in australia where utilities seem to suck universally |
08:42
🔗
|
Nemo_bis |
Italy |
08:43
🔗
|
Nemo_bis |
it's a long story, starting in 1960s with the nationalization of energy companies |
08:43
🔗
|
chronomex |
aha. |
08:44
🔗
|
chronomex |
where I live they charge by the kwh on an increasing scale and go after you eventually if you become unreasonable |
08:44
🔗
|
Nemo_bis |
the same here |
08:44
🔗
|
Nemo_bis |
but this is another issue |
08:44
🔗
|
chronomex |
three fucking kilowatts |
08:45
🔗
|
Nemo_bis |
it's a good thing, in general, but if only they gave you one or two beeps to warn you |
08:45
🔗
|
Nemo_bis |
and it was worse before, the old mechanical counters were stricter |
08:45
🔗
|
Cameron_D |
Only the internet sucks here in Australia |
08:47
🔗
|
yipdw^ |
Nemo_bis: give alard the output of check-dld.sh |
08:47
🔗
|
yipdw^ |
and just finish up as much as you can |
08:47
🔗
|
Nemo_bis |
yipdw^, ok |
08:47
🔗
|
Nemo_bis |
I'd tar the incomplete users and upload them to batcave, makes sense? |
08:47
🔗
|
yipdw^ |
safest thing to do is to just requeue the incompletes |
08:47
🔗
|
yipdw^ |
nah |
08:48
🔗
|
Nemo_bis |
I don't want to just delete them |
08:48
🔗
|
yipdw^ |
well, I don't think it makes sense -- we have until January 31 to get them |
08:48
🔗
|
yipdw^ |
it's just that everything else on batcave is stuff that, until we get a closer look at it, is presumably complete |
08:49
🔗
|
yipdw^ |
uploading known incomplete data throws a wrench into that |
08:49
🔗
|
Nemo_bis |
I'd put them in a different directory and in a different format |
08:50
🔗
|
yipdw^ |
I guess it's fine if it's labeled |
08:50
🔗
|
yipdw^ |
just make sure SketchCow knows |
08:50
🔗
|
ersi |
I think I get the same price per kwH.. unlimited.. |
08:51
🔗
|
ersi |
I mean, we have a contract which specifies a price per kilowatt hour.. |
08:51
🔗
|
chronomex |
that's reasonable |
08:52
🔗
|
chronomex |
here there's a really low tier for like 100kwh, then it jumps by like 2x for the next few thousand kwh, then it kind of plateaus slightly lower |
08:52
🔗
|
chronomex |
the jump in the middle is to pay for the cost of residential service |
08:52
🔗
|
ersi |
I bet if I consume lots.. I'd get a discounted rate :) |
08:53
🔗
|
chronomex |
yea but you'd have to pay for all that power |
08:53
🔗
|
ersi |
I remember talking to some dude in Berlin, Germany - who had an industrial power contract to his flat.. Hehe |
08:53
🔗
|
ersi |
Then again.. he had a cluster of machines in his flat :p |
09:58
🔗
|
RedType |
ersi: germans take their porn viewing seriously |
10:13
🔗
|
ndurner_w |
is batcave down? |
10:15
🔗
|
Wyatt|Wor |
How many threads does that kenneth guy have going again? I'm close to 300 and nowhere NEAR his rate. |
10:18
🔗
|
ndurner_w |
ah, works again |
10:57
🔗
|
Nemo_bis |
Wyatt|Wor, +1 |
10:58
🔗
|
Wyatt|Wor |
Nemo_bis: ? |
10:58
🔗
|
Nemo_bis |
Wyatt|Wor, about kenneth |
10:58
🔗
|
Wyatt|Wor |
Ah |
10:58
🔗
|
Nemo_bis |
the good thing is, I don't need to do this job any longer if he goes on like this |
10:59
🔗
|
Nemo_bis |
alard, this is the situation of incomplete users (+ about 5000 us users whose errors were not detected due to the locale problem when the site went down): http://p.defau.lt/?yd3MFeDi91WK6IU007B11A |
10:59
🔗
|
db48x |
Nemo_bis: you can use dld-streamer.sh to start fixing those |
11:00
🔗
|
Nemo_bis |
also, alard (and others), FYI, this is how heavy they are: http://p.defau.lt/?fCQq9XenY5_0s3yBtHse_Q -> 58207 MiB total |
11:00
🔗
|
db48x |
find -name '.incomplete' | cut -d '/' -f 3,7 | tr '/' ':' >incompletes |
11:00
🔗
|
Nemo_bis |
no, I won't |
11:00
🔗
|
db48x |
./dld-streamer.sh <nick> <threads> incompletes |
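The `find | cut | tr` pipeline above turns paths to `.incomplete` marker files into `country:user` items. A rough Python equivalent is sketched below, assuming the directory layout `data/<country>/<c1>/<c2>/<c3>/<user>/.incomplete` (inferred from the field numbers 3 and 7 in the `cut` call) and assuming `.incomplete` is a regular file. The function name `incomplete_items` is made up for this example.

```python
import os


def incomplete_items(root="."):
    """Collect 'country:user' items for every user directory holding a
    '.incomplete' marker, mirroring `cut -d '/' -f 3,7 | tr '/' ':'`."""
    items = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if ".incomplete" not in filenames:
            continue
        parts = os.path.relpath(dirpath, root).split(os.sep)
        # Assumed layout: data/<country>/<c1>/<c2>/<c3>/<user>
        if len(parts) == 6 and parts[0] == "data":
            items.append(f"{parts[1]}:{parts[5]}")
    return sorted(items)


if __name__ == "__main__":
    # Print one item per line; redirect to a file named 'incompletes'.
    for item in incomplete_items("."):
        print(item)
```

The resulting list would then be passed as the third argument to `dld-streamer.sh`, as in the command quoted above.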
11:01
🔗
|
Nemo_bis |
I don't have disk space enough, they were killing my memory and someone else with better tools should do such monster users apparently |
11:01
🔗
|
db48x |
ah |
11:01
🔗
|
db48x |
in that case |
11:01
🔗
|
Nemo_bis |
they've been running for ten days in some cases |
11:01
🔗
|
db48x |
build your list of usernames as above, then go to splinder.heroku.com/rescue-me :) |
11:02
🔗
|
Nemo_bis |
db48x, what does it do? |
11:02
🔗
|
Nemo_bis |
re-add to tracker? |
11:02
🔗
|
alard |
Hi all. |
11:02
🔗
|
Nemo_bis |
hello |
11:02
🔗
|
db48x |
yea, lets you add them back into the tracker |
11:02
🔗
|
Nemo_bis |
ok |
11:02
🔗
|
db48x |
they probably already were, but that way you'll know for sure |
11:02
🔗
|
alard |
Actually, rescue-me doesn't re-add things. |
11:03
🔗
|
db48x |
oh? |
11:03
🔗
|
Nemo_bis |
so, alard, can you put those users back in the queue? |
11:03
🔗
|
alard |
It's for adding unknown usernames to the tracker. |
11:03
🔗
|
db48x |
ah |
11:03
🔗
|
Nemo_bis |
(if they are not already, because you mentioned putting back 2+ days old users, and mine were 10 days old) |
11:03
🔗
|
alard |
(rescue-me is useful for mobileme, where people say 'could you save user X'?) |
11:04
🔗
|
alard |
Nemo_bis: As long as your client hasn't marked them done they will be added back to the queue. |
11:04
🔗
|
Nemo_bis |
alard, those 5000 us users have been marked done |
11:05
🔗
|
alard |
Ah, I see. There may be more of those, so we'll have to do some checking later on. (Still time until January, right?) |
11:05
🔗
|
alard |
I'll add your list back to the todo list. |
11:08
🔗
|
Nemo_bis |
does rsync want input list of files with null or newlines? |
11:09
🔗
|
alard |
Newlines, surely? |
11:10
🔗
|
alard |
Ah, you can configure it: -0, --from0 |
11:10
🔗
|
alard |
This tells rsync that the rules/filenames it reads from a file |
11:10
🔗
|
alard |
are terminated by a null (’\0’) character, not a NL, CR, or |
11:10
🔗
|
alard |
CR+LF |
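For the null-terminated case that `--from0` describes, a file list can be produced along these lines. This is a minimal sketch; the helper name `nul_terminated` and the file name in the usage note are assumptions for illustration.

```python
import os


def nul_terminated(paths):
    """Serialize paths as a NUL-terminated byte string, the format rsync
    expects from a file list when --from0 is given."""
    out = bytearray()
    for path in paths:
        encoded = os.fsencode(path)
        if b"\0" in encoded:
            raise ValueError(f"path contains a NUL byte: {path!r}")
        out += encoded + b"\0"
    return bytes(out)


if __name__ == "__main__":
    # Newlines inside a name survive, which is the reason to prefer NULs.
    print(nul_terminated(["data/it/a b", "data/it/odd\nname"]))
```

Written to a file (say `filelist.bin`), the output could be used as `rsync -a --from0 --files-from=filelist.bin SRC DEST`; without `--from0`, rsync reads the same kind of list with one name per line.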
11:10
🔗
|
Nemo_bis |
ok |
11:12
🔗
|
alard |
Also: I've been playing with a new wget project yesterday, see http://www.archiveteam.org/index.php?title=Wget_with_Lua_hooks . If you have any suggestions or comments, please add them to the wiki. |
11:12
🔗
|
Nemo_bis |
SketchCow, I'm uploading those 58 GiB of incomplete users to my splinder-broken directory in case someone can use them in some way (10 days of downloads, argh) |
11:13
🔗
|
Nemo_bis |
only 250 KiB/s now :-/ |
11:13
🔗
|
Nemo_bis |
see you all |
11:17
🔗
|
Wyatt|Wor |
All right, shift over. I'll let these run and catch everyone on the other side! |
12:18
🔗
|
Soojin |
thearchiveteam by proxy http://englishrussia.com/2011/11/25/things-that-must-not-be-forgotten/ |
12:25
🔗
|
db48x |
Soojin: cool |
12:31
🔗
|
NotGLaDOS |
wait, are there any unassigned users? |
12:34
🔗
|
db48x |
NotGLaDOS: yes, but not many |
12:34
🔗
|
NotGLaDOS |
drat. |
12:34
🔗
|
db48x |
heh |
12:34
🔗
|
db48x |
there are 13000 left |
12:35
🔗
|
db48x |
http://splinder.heroku.com/ |
12:35
🔗
|
db48x |
they might last another hour or two |
12:37
🔗
|
NotGLaDOS |
Well, I can always help! |
12:37
🔗
|
db48x |
the more the merrier :) |
12:40
🔗
|
NotGLaDOS |
Plus, the server's in Romania, so it should help with latency or somethi- oh, wait, only Australians have a cable running along the seafloor |
12:40
🔗
|
NotGLaDOS |
Nevermind! |
12:43
🔗
|
NotGLaDOS |
I shall do... 10 |
12:43
🔗
|
NotGLaDOS |
10 concurrent sessions |
12:46
🔗
|
NotGLaDOS |
downloads* |
12:47
🔗
|
Cameron_D |
only 10? |
12:48
🔗
|
NotGLaDOS |
fine |
12:48
🔗
|
NotGLaDOS |
I'll do 1000 |
12:49
🔗
|
* |
NotGLaDOS winds down script |
12:49
🔗
|
NotGLaDOS |
"NotGLaDOS it:maurizio71" |
12:49
🔗
|
NotGLaDOS |
\o/ |
12:50
🔗
|
NotGLaDOS |
...I got a 1GB user, didn't I |
12:50
🔗
|
db48x |
if it said 1000MB, then yes |
12:51
🔗
|
db48x |
most of them aren't that large though |
12:51
🔗
|
NotGLaDOS |
nope, it choked on a 0MB user. |
12:51
🔗
|
* |
NotGLaDOS is not amused |
12:51
🔗
|
db48x |
what do you mean by choked? |
12:51
🔗
|
NotGLaDOS |
as in, decided to take as long as it wanted to |
12:51
🔗
|
db48x |
oh, yea |
12:52
🔗
|
db48x |
the server isn't very fast |
12:52
🔗
|
NotGLaDOS |
It shouldn't do that, as I'm not tunnelling through that server |
12:52
🔗
|
NotGLaDOS |
That's what I have Cameron_D for. |
12:52
🔗
|
db48x |
and even the users with the least amount of data require several http connections |
12:52
🔗
|
NotGLaDOS |
ah. |
12:53
🔗
|
NotGLaDOS |
well, once this winds down, screen -dmS splinder ./dld-streamer.sh NotGLaDOS 1000 |
12:53
🔗
|
NotGLaDOS |
While I wait, canabalt time! |
12:54
🔗
|
db48x |
heh |
12:54
🔗
|
db48x |
1000 is pretty optimistic |
12:54
🔗
|
NotGLaDOS |
I'll probably starve my ZNC users. |
12:54
🔗
|
NotGLaDOS |
Oh well! |
12:55
🔗
|
db48x |
indeed :) |
12:55
🔗
|
NotGLaDOS |
IT'S IN THE NAME OF ARCHIVING! |
12:55
🔗
|
db48x |
and that's the best excuse there is |
12:56
🔗
|
NotGLaDOS |
Indeed! |
12:56
🔗
|
db48x |
mmm |
12:56
🔗
|
db48x |
this disk is 97% |
12:56
🔗
|
NotGLaDOS |
4 to go until it's finished winding down. |
12:58
🔗
|
db48x |
probably time for me to wind down as well |
13:00
🔗
|
NotGLaDOS |
wait, is it doing them simultaneous- of course it is. |
13:01
🔗
|
db48x |
yea |
13:02
🔗
|
db48x |
hrm |
13:02
🔗
|
db48x |
oh dear |
13:02
🔗
|
NotGLaDOS |
I just get the feeling that the tracker has dumped a large one in as the last one for fun. |
13:02
🔗
|
NotGLaDOS |
hm? |
13:02
🔗
|
db48x |
I have 56GB free, and although I just stopped my downloaders, I have 100 threads left winding down |
13:03
🔗
|
db48x |
I'm doing the mobileme project, and those users are larger |
13:03
🔗
|
NotGLaDOS |
crap |
13:03
🔗
|
db48x |
estimated size is 63GB |
13:03
🔗
|
NotGLaDOS |
yer screwed. |
13:03
🔗
|
db48x |
yep |
13:03
🔗
|
NotGLaDOS |
"it:habbo" |
13:03
🔗
|
NotGLaDOS |
I feel sorry for that guy |
13:04
🔗
|
db48x |
heh |
13:04
🔗
|
db48x |
there have been a lot of weird usernames |
13:18
🔗
|
NotGLaDOS |
Mmm, popcorn |
13:19
🔗
|
NotGLaDOS |
Crap, this is going to finish spinning down by the time they're all gone, aren't they? |
13:19
🔗
|
* |
NotGLaDOS shakes fist at kenneth |
13:19
🔗
|
db48x |
heh |
13:20
🔗
|
db48x |
just run another one |
13:20
🔗
|
db48x |
in another terminal |
13:20
🔗
|
NotGLaDOS |
..that'll work? |
13:20
🔗
|
NotGLaDOS |
ooh! |
13:20
🔗
|
db48x |
they won't get in each other's way |
13:20
🔗
|
* |
NotGLaDOS uses screen anyway |
13:20
🔗
|
db48x |
indeed :) |
13:21
🔗
|
db48x |
hmm |
13:21
🔗
|
NotGLaDOS |
time to do my 1000 connection dream, and knock myself off of anything that doesn't go through this HTTP proxy! |
13:21
🔗
|
NotGLaDOS |
bye, Cameron_D! |
13:22
🔗
|
db48x |
I have 1000GB of friendster data |
13:22
🔗
|
NotGLaDOS |
"downloading it:chinachina" |
13:22
🔗
|
NotGLaDOS |
chinachinachinachinachinachinachinachina |
13:22
🔗
|
NotGLaDOS |
db48x: nice |
13:23
🔗
|
db48x |
hrm |
13:23
🔗
|
NotGLaDOS |
Welp, there goes my dreams |
13:23
🔗
|
db48x |
995GB of it is already compressed though |
13:23
🔗
|
NotGLaDOS |
"Cannot allocate memory" |
13:23
🔗
|
NotGLaDOS |
"TERMINATE ALL THE SELFS" |
13:23
🔗
|
db48x |
that is a lot of wgets |
13:24
🔗
|
NotGLaDOS |
...right, forgot about that |
13:24
🔗
|
NotGLaDOS |
maybe 100? |
13:24
🔗
|
NotGLaDOS |
it got up to 170 before derping |
13:24
🔗
|
db48x |
100 is a good start |
13:24
🔗
|
db48x |
check memory usage, iowait and bandwidth and then adjust |
13:25
🔗
|
NotGLaDOS |
...wait, did that just allocate 170 users to me that will never complete? |
13:25
🔗
|
db48x |
yes and no |
13:25
🔗
|
db48x |
at some point we'll add them back to the queue |
13:25
🔗
|
NotGLaDOS |
Oh, phew. |
13:25
🔗
|
db48x |
or you can collect the list (find -name '.incomplete' | cut -d '/' -f 3,7 | tr '/' ':' >incompletes) and run it through dld-streamer |
13:25
🔗
|
NotGLaDOS |
"180GB traffic" |
13:26
🔗
|
NotGLaDOS |
It was at 90 this morning! |
13:26
🔗
|
NotGLaDOS |
\o/ |
13:26
🔗
|
db48x |
:) |
13:26
🔗
|
NotGLaDOS |
Wait I haven't archived that much. |
13:28
🔗
|
NotGLaDOS |
Time to check my TCP buffers! |
13:28
🔗
|
NotGLaDOS |
Actually not that bad. |
13:28
🔗
|
NotGLaDOS |
bandwidth, however, just gets a 2Gigabit spike randomly \o/ |
13:29
🔗
|
NotGLaDOS |
hehe, kernel memory just jumps to 70M |
13:34
🔗
|
Nemo_bis |
2 Gb? are splinder servers so robust? |
13:35
🔗
|
NotGLaDOS |
No idea |
13:35
🔗
|
NotGLaDOS |
They had over 1.2 billion users, they would've had to handle that traffic. |
13:36
🔗
|
NotGLaDOS |
Oh, I can just hear my server screaming for mercy. |
13:36
🔗
|
db48x |
no, 1.3 million |
13:36
🔗
|
db48x |
only off by three orders of magnitude |
13:36
🔗
|
NotGLaDOS |
drat. |
13:37
🔗
|
NotGLaDOS |
>MFW it's only number 99 and 100 doing the work |
13:37
🔗
|
NotGLaDOS |
Oh well, back to canabalt |
13:39
🔗
|
db48x |
awesome, I got all the poems sorted |
13:40
🔗
|
NotGLaDOS |
Nice |
13:41
🔗
|
db48x |
there are two tarballs left |
13:41
🔗
|
db48x |
one contains their blog |
13:42
🔗
|
db48x |
one contains a categorization of the poems |
13:48
🔗
|
db48x |
oh, and a third that probably is a mixture of files from the site and poems, hrm |
13:52
🔗
|
NotGLaDOS |
"PID 5766 finished 'it:Barbabietole_Azzurre': Error - exited with status 6." |
13:52
🔗
|
NotGLaDOS |
My first error! |
13:53
🔗
|
* |
NotGLaDOS feels special |
13:54
🔗
|
db48x |
:) |
13:54
🔗
|
NotGLaDOS |
And another one! |
14:03
🔗
|
NotGLaDOS |
Crap. |
14:03
🔗
|
NotGLaDOS |
I'm starting to get a lot of error 6 |
14:11
🔗
|
Wyatt |
Down to 1500 (for now) |
14:12
🔗
|
NotGLaDOS |
It'll go up. |
14:12
🔗
|
Wyatt |
Oh I know |
14:12
🔗
|
NotGLaDOS |
Script crashed when I set limit to 1000 |
14:12
🔗
|
NotGLaDOS |
There's about 180 users not done |
14:12
🔗
|
NotGLaDOS |
fact: I am a moron |
14:14
🔗
|
db48x |
25 GB free |
14:14
🔗
|
Wyatt |
So here's a question: are the mobileme downloads larger in terms of filesize? I have a strong feeling that saving large files is going to be where a beefy datacentre connection will really shine. |
14:15
🔗
|
db48x |
Wyatt: much bigger |
14:15
🔗
|
db48x |
Wyatt: the current average is 650 MB/user |
14:16
🔗
|
db48x |
http://memac.heroku.com/ |
14:16
🔗
|
db48x |
down from 652 MB/user an hour or two ago |
14:16
🔗
|
Wyatt |
No, I mean, are individual files going to be larger? |
14:17
🔗
|
db48x |
yea, it was a file syncing service, not a weblog host |
14:17
🔗
|
Wyatt |
Or is it going to be a large number of small files (a lot of http requests; less benefit from fat pipes) |
14:17
🔗
|
Wyatt |
Ahh |
14:17
🔗
|
Wyatt |
Okay, neat |
14:17
🔗
|
NotGLaDOS |
There should be passwords in there.. |
14:17
🔗
|
NotGLaDOS |
NO, DONT THINK LIKE THAT. |
14:18
🔗
|
NotGLaDOS |
I'm going to wind my script down. |
14:18
🔗
|
* |
db48x yawns |
14:18
🔗
|
db48x |
yea, we're starting to overload it |
14:19
🔗
|
NotGLaDOS |
two status 6s in a row! |
14:20
🔗
|
db48x |
I need to cough up for a new zfs array |
14:22
🔗
|
NotGLaDOS |
200 to go! |
14:22
🔗
|
db48x |
then I can have one for the archives and one for my own stuff |
14:22
🔗
|
NotGLaDOS |
Then we can re-add! |
14:23
🔗
|
NotGLaDOS |
Soon, kenneth will run out of users, so I'll be all over it |
14:23
🔗
|
NotGLaDOS |
MUAHAHAHA |
14:23
🔗
|
db48x |
:) |
14:23
🔗
|
NotGLaDOS |
wait, status 5? |
14:23
🔗
|
NotGLaDOS |
[2011-11-25 14:24:09+00:00] 84/100 PID 15466 finished 'it:Spazio_ai_Giovani': Error - exited with status 5. |
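To see how often each exit status shows up, lines shaped like the one pasted above can be tallied with a short script. This sketch assumes only the format visible in that paste (`PID <n> finished '<item>': ... exited with status <n>.`); it does not interpret what the status numbers mean, and it would misparse an item name containing an apostrophe.

```python
import re
from collections import Counter

# Matches the part of a streamer log line seen in the channel paste.
LINE_RE = re.compile(r"PID (?P<pid>\d+) finished '(?P<item>[^']+)': (?P<result>.*)$")
STATUS_RE = re.compile(r"exited with status (\d+)")


def parse_line(line):
    """Return pid, item and exit status for a 'finished' line, else None."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    status = STATUS_RE.search(match.group("result"))
    return {
        "pid": int(match.group("pid")),
        "item": match.group("item"),
        "status": int(status.group(1)) if status else None,
    }


def tally_statuses(lines):
    """Count how many finished items reported each exit status."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and parsed["status"] is not None:
            counts[parsed["status"]] += 1
    return counts
```

Feeding it the streamer output during a bad stretch gives a quick count of status 5 versus status 6 failures, and the `item` field is already in the `country:user` form used for requeueing.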
14:24
🔗
|
NotGLaDOS |
Now it's just spitting status 6s and 5s at me |
14:24
🔗
|
db48x |
the proxies are dying |
14:24
🔗
|
NotGLaDOS |
ah |
14:25
🔗
|
db48x |
happened a couple of days ago too |
14:28
🔗
|
NotGLaDOS |
...so we're doing negative users now.. |
15:09
🔗
|
* |
db48x yawns |
15:09
🔗
|
db48x |
well, I must sleep |
15:09
🔗
|
db48x |
happy archiving |
15:12
🔗
|
NotGLaDOS |
o/ |
15:12
🔗
|
NotGLaDOS |
10 processes to wind down, then sleeeeeep |
15:12
🔗
|
* |
PatC looks at the clock |
15:12
🔗
|
PatC |
*1013* |
15:13
🔗
|
NotGLaDOS |
2312 here. |
15:13
🔗
|
PatC |
ahh |
15:13
🔗
|
PatC |
Aussie? |
15:15
🔗
|
NotGLaDOS |
West Australian. |
15:15
🔗
|
PatC |
Cool! |
15:37
🔗
|
Schbirid |
meh, it:vanillaaa finish already! |
16:20
🔗
|
dnova |
mornin |
16:26
🔗
|
PatC |
Mornin' |
17:01
🔗
|
underscor |
alard: That looks freakin' awesome |
17:03
🔗
|
underscor |
(the lua hooks) |
17:25
🔗
|
DFJustin |
here's the site for that russian guy with the phonographs, I think http://staroeradio.ru/collection |
21:54
🔗
|
underscor |
http://metaception.com/pepper |
22:26
🔗
|
Wyatt |
dld-streamer automatically retries incompletes? Is that what I was told yesterday? |
22:28
🔗
|
Coderjoe |
automatically? not unless someone added that since wed. |
22:29
🔗
|
Coderjoe |
dld-single retries 5 times |
22:29
🔗
|
Wyatt |
Ah, okay |
22:30
🔗
|
Wyatt |
So do that find thing |
22:31
🔗
|
Coderjoe |
dld-streamer as of wednesday has an optional parameter to provide a list of items to fetch (rather than ask the tracker) |
22:35
🔗
|
godane |
i found something called js-wikireader |
22:36
🔗
|
godane |
https://github.com/antimatter15/js-wikireader |
22:39
🔗
|
underscor |
godane: I went to governor's school with him! |
22:39
🔗
|
underscor |
haha |
22:43
🔗
|
godane |
http://www.youtube.com/watch?v=e3KIyXuZJGY |
22:51
🔗
|
winr4r |
SketchCow: in which i learn that you are awesome, teresa has been a friend of mine for some years, we were talking about some things, i pointed her to a speech of yours in which you mentioned geocities, and she was like "hey i had some stuff on geocities" and i was like "yeah, talk to jason, he might have it" |
22:52
🔗
|
winr4r |
YOU HAD IT |
22:54
🔗
|
winr4r |
SketchCow: i also might be off work for a couple of weeks again, so if there's stuff that needs describing, i might be your man |
23:19
🔗
|
closure |
back from t-giving.. have splinder users still downloading, incredible |
23:20
🔗
|
yipdw^ |
a lot of them were injected back into the todo queue |
23:21
🔗
|
yipdw^ |
i have downloads still going on too, though, which is nuts
23:22
🔗
|
yipdw^ |
[ec2-user@ip-10-80-146-172 Rei-chan]$ pwd
23:22
🔗
|
yipdw^ |
/home/ec2-user/splinder-grab/data/it/R/Re/Rei/Rei-chan
23:22
🔗
|
yipdw^ |
[ec2-user@ip-10-80-146-172 Rei-chan]$ ls -l *warc*
23:22
🔗
|
yipdw^ |
-rw-rw-r-- 1 ec2-user ec2-user 110807580 Nov 25 23:22 splinder.com-Rei-chan-blog-touchingthestars.splinder.com.warc.gz
23:24
🔗
|
Nemo_bis |
heh, mine lasted up to ten days (and counting) |
23:24
🔗
|
Coderjoe |
man... I started the ec2 instance thinking "oh, this will only be 5 days at most..." |
23:24
🔗
|
yipdw^ |
er, wait |
23:24
🔗
|
yipdw^ |
Rei-chan is actually a busted account |
23:24
🔗
|
yipdw^ |
check this out: http://www.splinder.com/myblog/comment/list/4212591/48159251?from=400 |
23:25
🔗
|
Coderjoe |
before splinder extended their closure date |
23:25
🔗
|
yipdw^ |
try to click any of the navigation links |
23:25
🔗
|
yipdw^ |
you will be sent to the same page |
23:25
🔗
|
yipdw^ |
wtf |
23:25
🔗
|
Coderjoe |
yay. spidertraps. |
23:25
🔗
|
yipdw^ |
is it? |
23:25
🔗
|
yipdw^ |
the account does have some legitimate content in it |
23:25
🔗
|
Coderjoe |
I haven't looked yet |
23:25
🔗
|
yipdw^ |
or something that looks human-generated |
23:26
🔗
|
Coderjoe |
i didn't mean an intentional trap |
23:26
🔗
|
yipdw^ |
oh |
23:26
🔗
|
Coderjoe |
friendster had a lot of accidental shit that created spider traps |
23:26
🔗
|
yipdw^ |
alard: I think we need a way of flagging accounts as "cannot archive fully" or some such; see http://www.splinder.com/myblog/comment/list/4212591/48159251 and click the navigation links for an example |
23:27
🔗
|
Coderjoe |
server on fire? |
23:27
🔗
|
yipdw^ |
well |
23:27
🔗
|
yipdw^ |
alard: come to think of it, maybe we don't, because iirc wget doesn't try to retrieve URLs it's already seen |
23:27
🔗
|
yipdw^ |
or does it |
23:28
🔗
|
Coderjoe |
it shouldn't |
23:28
🔗
|
yipdw^ |
I mean, it shouldn't, assuming that it assumes GET is idempotent |
23:28
🔗
|
Coderjoe |
and it shouldn't go offsite, either |
23:28
🔗
|
yipdw^ |
so this should complete at some point, it'll just be a fucking long grab |
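A minimal sketch of the reasoning above, assuming the crawler keeps a set of URLs it has already fetched (this is a generic illustration, not wget's actual code; `crawl`, `links_of` and the toy link graph are hypothetical). Navigation links that loop back to pages already seen add nothing to the queue, so the grab ends once every distinct URL has been visited.

```python
from collections import deque


def crawl(start, links_of):
    """Breadth-first crawl that never fetches the same URL twice.

    links_of is a hypothetical callable: URL -> list of linked URLs.
    """
    seen = {start}
    queue = deque([start])
    fetched = []
    while queue:
        url = queue.popleft()
        fetched.append(url)
        for nxt in links_of(url):
            if nxt not in seen:  # already-seen URLs are skipped
                seen.add(nxt)
                queue.append(nxt)
    return fetched


# Toy link graph (hypothetical): every navigation link leads back to a
# page that has already been seen, as with the comment list above.
graph = {
    "/list": ["/list?from=400"],
    "/list?from=400": ["/list?from=400", "/list"],
}
pages = crawl("/list", lambda u: graph.get(u, []))
```

This only guarantees termination when the number of distinct URLs is finite; a trap that keeps generating new URLs would still run forever.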
23:28
🔗
|
Coderjoe |
which of those links goes to another splinder site? |
23:28
🔗
|
yipdw^ |
the spam links? |
23:28
🔗
|
yipdw^ |
I don't know |
23:28
🔗
|
yipdw^ |
none, as far as I can tell |
23:29
🔗
|
Coderjoe |
haha |
23:29
🔗
|
Coderjoe |
"penis van lesbian" |
23:29
🔗
|
Coderjoe |
is that like an ice cream truck? |
23:29
🔗
|
yipdw^ |
I was thinking Dick van Patten |
23:30
🔗
|
yipdw^ |
oh fuck |
23:30
🔗
|
yipdw^ |
2011-11-25 23:28:42 URL:http://www.splinder.com/splinder_noconn.html [1402/1402] -> "./tmpfs/it/Rei-chan/www.splinder.com/splinder_noconn.html" [1] |
23:30
🔗
|
yipdw^ |
I hope that doesn't mean I missed something |
23:30
🔗
|
Coderjoe |
... |
23:30
🔗
|
Coderjoe |
noconn? great |
23:31
🔗
|
yipdw^ |
yeah |
23:31
🔗
|
Coderjoe |
is that an overload error from a reverse proxy gateway? |
23:31
🔗
|
yipdw^ |
it's a maintenance page |
23:31
🔗
|
Coderjoe |
fuck on a stick |
23:31
🔗
|
yipdw^ |
fuck HTTP status codes, we're doing this web style |
23:32
🔗
|
yipdw^ |
I don't know what it is |
23:32
🔗
|
yipdw^ |
but I just saw it in the Rei-chan wget log |
23:32
🔗
|
Coderjoe |
that page looks similar to the US page
23:32
🔗
|
yipdw^ |
checking others |
23:33
🔗
|
Coderjoe |
doing a massive rgrep |
23:34
🔗
|
Coderjoe |
er wait |
23:34
🔗
|
Coderjoe |
I just want the logs |
23:34
🔗
|
Coderjoe |
i r smrt |
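A rough sketch of the kind of log scan being described, assuming the wget logs use the one-line-per-download format of the line pasted above. The function name `find_noconn_hits` and the second sample line are hypothetical; only the first sample line comes from the log.

```python
import re

# Matches lines like: 2011-11-25 23:28:42 URL:http://... [1402/1402] -> "..." [1]
LOG_LINE = re.compile(r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) URL:(\S+) \[')
MAINTENANCE_PAGE = "splinder_noconn.html"


def find_noconn_hits(lines):
    """Return (timestamp, url) for every fetch of the maintenance page."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group(2).endswith("/" + MAINTENANCE_PAGE):
            hits.append((m.group(1), m.group(2)))
    return hits


sample = [
    # Line quoted in the channel:
    '2011-11-25 23:28:42 URL:http://www.splinder.com/splinder_noconn.html '
    '[1402/1402] -> "./tmpfs/it/Rei-chan/www.splinder.com/splinder_noconn.html" [1]',
    # Hypothetical ordinary download, for contrast:
    '2011-11-25 23:28:40 URL:http://www.splinder.com/profile/Rei-chan '
    '[5120] -> "./tmpfs/it/Rei-chan/profile" [1]',
]
hits = find_noconn_hits(sample)
```

Any account whose log produces a hit was redirected to the maintenance page at least once, so its grab may be missing content and is a candidate for a re-fetch.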
23:34
🔗
|
yipdw^ |
well done |
23:34
🔗
|
yipdw^ |
ugh, this isn't looking good on my end |
23:35
🔗
|
yipdw^ |
https://gist.github.com/a15c7707ee666502a825 |
23:38
🔗
|
Coderjoe |
looking quite bad here, too |
23:39
🔗
|
Coderjoe |
hmm |
23:39
🔗
|
Coderjoe |
not so bad for me it seems |
23:40
🔗
|
Coderjoe |
https://gist.github.com/0427b4ed12ae48f2fb5f |
23:40
🔗
|
Coderjoe |
at home. let's check the ec2 |
23:41
🔗
|
yipdw^ |
Coderjoe: when did they start happening |
23:41
🔗
|
* |
Nemo_bis has some as well :-( |
23:42
🔗
|
closure |
WyattL yo, around? |
23:42
🔗
|
closure |
Wyatt: yo, around? |
23:45
🔗
|
Wyatt|Wor |
closure: Yeah? |
23:57
🔗
|
Coderjoe |
any idea why the check/fix scripts only check the us profiles for 502/504? |
23:58
🔗
|
Coderjoe |
well, 500 errors, not just 502/504 |
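A sketch of a check that is not limited to 502/504, assuming the logs record failures as `ERROR <status>: <reason>.` lines (the usual wget form; the sample lines below are made up, and this is not the actual check/fix script). It counts every 5xx status it finds, whichever profile set the log belongs to.

```python
import re
from collections import Counter

# Any 5xx status reported as "ERROR 5xx: reason."
SERVER_ERROR = re.compile(r'\bERROR (5\d\d)\b')


def count_server_errors(lines):
    """Count occurrences of each 5xx status code in wget log lines."""
    counts = Counter()
    for line in lines:
        m = SERVER_ERROR.search(line)
        if m:
            counts[int(m.group(1))] += 1
    return counts


# Made-up sample lines for illustration only.
sample = [
    "2011-11-25 23:28:42 ERROR 502: Bad Gateway.",
    "2011-11-25 23:28:50 ERROR 500: Internal Server Error.",
    "2011-11-25 23:29:01 ERROR 404: Not Found.",
    "2011-11-25 23:29:05 ERROR 504: Gateway Time-out.",
]
counts = count_server_errors(sample)
```

A non-empty result marks the account as incomplete and worth re-queuing; 4xx responses are deliberately ignored here.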