Time |
Nickname |
Message |
00:04
🔗
|
underscor |
wget-warc is compiling |
00:10
🔗
|
underscor |
- Downloading profile HTML pages... |
00:10
🔗
|
underscor |
0 12:10AM:abuie@abuie-dev:/2/splinder-grab 11082 Ï ./dld-profile.sh lowvoice |
00:10
🔗
|
underscor |
Downloading lowvoice profile |
00:10
🔗
|
underscor |
:) |
00:10
🔗
|
underscor |
alard: Looks like it's running |
00:10
🔗
|
underscor |
Now it's Parsing |
00:10
🔗
|
underscor |
and downloading media |
00:10
🔗
|
underscor |
now it's doing the blog |
00:17
🔗
|
* |
db48x sighs |
00:17
🔗
|
db48x |
I wasn't running mobileme-grab in screen |
00:18
🔗
|
yipdw |
hmm |
00:18
🔗
|
yipdw |
I guess splinder is higher-priority |
00:18
🔗
|
yipdw |
than mobileme |
00:19
🔗
|
db48x |
dying sooner? |
00:19
🔗
|
yipdw |
24th |
00:19
🔗
|
db48x |
14 days? sheesh |
00:20
🔗
|
db48x |
hrm |
00:20
🔗
|
db48x |
how much space do you estimate it will require? |
00:22
🔗
|
Paradoks |
I tried the Splinder script, and it didn't give me any errors in doing so. |
00:23
🔗
|
Paradoks |
I'm not quite sure how to usefully check any further than that, though. |
00:27
🔗
|
underscor |
alard: - Result: 1.1M |
00:27
🔗
|
underscor |
alard: Works fine |
00:30
🔗
|
underscor |
Oh, neat, I broke a TB! |
00:34
🔗
|
yipdw |
ha |
00:34
🔗
|
yipdw |
it just occurred to me that "man which" is a funny command |
00:35
🔗
|
underscor |
hahah |
00:39
🔗
|
yipdw |
alard: I made some du abstractions for OS X in the "osx" branch |
00:49
🔗
|
yipdw |
alard: though, the script currently doesn't work with profiles like http://www.us.splinder.com/users/list/...Alf...@tipic.com |
00:51
🔗
|
yipdw |
actually that's a really annoying one, especially insofar as hierarchical organization is concerned, because . has special meaning |
00:51
🔗
|
underscor |
Yeah, that's a bummer |
00:51
🔗
|
yipdw |
maybe URL-encode the profile names |
00:52
🔗
|
underscor |
We'll have to figure out something for it |
00:52
🔗
|
underscor |
Oh no, yipdw is almost beating me |
00:52
🔗
|
yipdw |
wait, no, URL-encoding allows '.' |
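A note on the point above: standard percent-encoding leaves '.' alone because it is an unreserved character, so a profile name made mostly of dots would still produce awkward path components. The sketch below is one possible workaround, not what splinder-grab actually does: it percent-encodes the name and then escapes dots by hand. The helper names `profile_to_dirname` and `dirname_to_profile` are hypothetical.

```python
from urllib.parse import quote, unquote


def profile_to_dirname(profile: str) -> str:
    """Percent-encode a profile name, escaping '.' as well (hypothetical helper)."""
    # quote() with safe="" escapes '/', '@', spaces and so on, but never '.'.
    encoded = quote(profile, safe="")
    # Escape dots by hand so names like '..' cannot act as path components.
    return encoded.replace(".", "%2E")


def dirname_to_profile(dirname: str) -> str:
    """Reverse profile_to_dirname(); unquote() decodes %2E back to '.'."""
    return unquote(dirname)
```

Because a literal '%' in a name is itself escaped to '%25' first, the mapping stays reversible.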
00:52
🔗
|
yipdw |
underscor: at what, mobileme |
00:52
🔗
|
yipdw |
I don't think so |
00:52
🔗
|
yipdw |
:P |
00:54
🔗
|
underscor |
:D |
01:02
🔗
|
underscor |
fuck yes |
01:02
🔗
|
underscor |
passed government this quarter |
01:02
🔗
|
underscor |
\\\\\\o/////// |
01:02
🔗
|
yipdw |
awesome |
01:02
🔗
|
yipdw |
Rick Perry didn't |
01:02
🔗
|
underscor |
lol |
01:04
🔗
|
underscor |
Comp Gov: D, AP Psych: C+, AP Stat: B, Database Design and Engineering: A, Network Design and Engineering: A, Gifted Education Multidisciplinary Seminar: A, Physics I: C+ |
01:04
🔗
|
underscor |
Not too shabby |
01:06
🔗
|
Zebranky |
Shame, not AP US Gov? :D |
01:10
🔗
|
underscor |
It's AP Comparative |
01:10
🔗
|
underscor |
Everything except the US ;D |
01:11
🔗
|
Zebranky |
Oh, good enough! |
01:11
🔗
|
Zebranky |
Going to ace the exam, of course? |
01:13
🔗
|
underscor |
Hah, hopefully |
01:13
🔗
|
underscor |
I love the class |
01:13
🔗
|
underscor |
I love all the classes |
01:13
🔗
|
underscor |
I hate the scutwork |
01:19
🔗
|
underscor |
broke 1.3TB! |
01:21
🔗
|
Zebranky |
I'd like to take this moment to say |
01:21
🔗
|
Zebranky |
DCC SEND startkeylogger 0 0 0 |
01:21
🔗
|
Zebranky |
Aww, no one dropped :( |
01:21
🔗
|
Zebranky |
I actually found someone last night who's still vulnerable to that |
01:22
🔗
|
underscor |
What did it do previously? |
01:24
🔗
|
Zebranky |
A number of old routers and firewalls would immediately drop all connections on detecting that string |
01:25
🔗
|
Zebranky |
Because, as you might guess, it was used to start a keylogger at some point |
01:25
🔗
|
underscor |
ha |
01:26
🔗
|
Zebranky |
I said it as a joke, because someone complained about someone else messing up his client with an excessively long nick |
01:26
🔗
|
Zebranky |
And a third party peered out immediately |
01:26
🔗
|
Zebranky |
And I shat a brick |
01:26
🔗
|
underscor |
haha |
01:37
🔗
|
underscor |
alard: Are we gonna do something similar for splinder? |
01:37
🔗
|
underscor |
(you could probably reuse the tracker from mobileme :D) |
01:38
🔗
|
db48x |
yea, it seems like it should be pretty generic |
01:42
🔗
|
yipdw |
it looks like we can download all the users for splinder and chunk it up |
01:46
🔗
|
SketchCow |
Hey |
01:46
🔗
|
SketchCow |
So, listen. |
01:46
🔗
|
SketchCow |
Archive.org's going to have a bit of a storage space shortage (by their standards) for a while. |
01:46
🔗
|
SketchCow |
Because of the Thailand situation |
01:46
🔗
|
SketchCow |
So we should plan and refine MobileMe, but adding all 200TB is going to be slightly delayed. |
01:47
🔗
|
SketchCow |
Slightly. |
01:47
🔗
|
SketchCow |
So it would be good for us to discuss and refine the download process, and more importantly the packing-up process. |
01:48
🔗
|
underscor |
Slightly, x3 |
01:48
🔗
|
underscor |
Well, I mean, the download process is top notch in its current state, I think. |
01:48
🔗
|
underscor |
We get everything possible, with no duping |
01:49
🔗
|
underscor |
As far as packing, alard's written a script that rsyncs the completed users to batcave, but that does no compression |
01:49
🔗
|
underscor |
(in this case, however, I don't know how useful compression will be; mostly pictures and video things that don't compress well, from what I've seen) |
03:26
🔗
|
underscor |
SketchCow: Thoughts? |
03:48
🔗
|
closure |
mobileme is really gonna be 200tb? |
04:11
🔗
|
underscor |
That's the current prediction |
04:57
🔗
|
Coderjoe_ |
shiiit... my aws usage is at $150 already. 50 being bandwidth |
04:57
🔗
|
Coderjoe_ |
i just terminated my m2.xlarge spot instance |
04:58
🔗
|
Coderjoe_ |
had all the data moved out at 1 PM Central but didn't have time to kill it before my connecting flight |
05:57
🔗
|
SketchCow |
Sounds good |
07:00
🔗
|
yipdw |
Splinder's alphabetical listing of names is pretty broken |
07:00
🔗
|
yipdw |
you can start at page 1 of "A", press "last page", and end up on page 2473 of "P" |
07:16
🔗
|
yipdw |
fyi, if anyone's interested -> https://github.com/ArchiveTeam/splinder-grab/blob/master/user_grabber/grabber.rb |
07:56
🔗
|
db48x |
yipdw: heh |
07:56
🔗
|
yipdw |
db48x: I've been poking around for a better way to get the username list |
07:56
🔗
|
yipdw |
doesn't look like one exists, unfortunately |
08:00
🔗
|
Coderjoe_ |
er, 50 being bandwidth? make that nearly 100 for bandwidth |
08:00
🔗
|
db48x |
Coderjoe_: how much to exfiltrate the data? |
08:00
🔗
|
yipdw |
ugh, damnit, Ruby's net/http is useless |
08:00
🔗
|
yipdw |
keeps timing out |
08:00
🔗
|
Coderjoe_ |
well, incoming data is free |
08:01
🔗
|
yipdw |
i'll just rewrite this with curl or something |
08:01
🔗
|
tef_ |
yipdw: I feel your pain with built in libraries for http |
08:01
🔗
|
yipdw |
tef_: I'd love to find a language that has a built-in HTTP library that doesn't suck |
08:01
🔗
|
yipdw |
maybe Go? |
08:01
🔗
|
* |
tef_ just wrote a http parser for python |
08:01
🔗
|
Coderjoe_ |
python's urllib2 is pretty decent |
08:01
🔗
|
tef_ |
Coderjoe_: only if you're on crack |
08:01
🔗
|
Coderjoe_ |
you wrote an http parser |
08:01
🔗
|
Coderjoe_ |
er |
08:01
🔗
|
Coderjoe_ |
html |
08:01
🔗
|
tef_ |
doesn't support proxies over https, doesn't handle methods |
08:01
🔗
|
db48x |
I've had lots of success with Perl's WWW::Mechanize |
08:01
🔗
|
tef_ |
yes, http |
08:02
🔗
|
tef_ |
db48x: actually I recall that being ok :3 |
08:02
🔗
|
tef_ |
but yeah urllib2 is clunky and awkward and only handles parsing responses. it also tends to mutilate headers somewhat |
08:02
🔗
|
yipdw |
db48x: Ruby's Mechanize is pretty neat, but internally it uses Net::HTTP |
08:03
🔗
|
yipdw |
technically Net::HTTP::Persistent |
08:03
🔗
|
tef_ |
Coderjoe_: I have to write a http proxy for archiving stuff |
08:03
🔗
|
yipdw |
and I guess under continued use it just kinda...stops working |
08:03
🔗
|
tef_ |
none of the libraries support this use-case :/ |
08:03
🔗
|
yipdw |
at least, that's what I'm seeing with my work-queue setup |
08:03
🔗
|
Coderjoe_ |
ec2: 52.70, s3: 7.19, bandwidth: in=free, out=90.84 |
08:04
🔗
|
Coderjoe_ |
can't you tie into the httpserver libraries? |
08:04
🔗
|
yipdw |
it's certainly a possibility that my queuing strategy sucks -- I just scribbled out that code |
08:05
🔗
|
Coderjoe_ |
you are doing something that is out of the ordinary, though |
08:05
🔗
|
tef_ |
Coderjoe_: they're spectacularly tied around the sockets |
08:05
🔗
|
tef_ |
I also need to do parsing of http messages /within/ archives |
08:05
🔗
|
tef_ |
most http libraries tie the parsing in with the socket handling :/ |
08:06
🔗
|
db48x |
indeed |
08:06
🔗
|
Coderjoe_ |
again, you are doing something far outside the normal |
08:06
🔗
|
tef_ |
yup |
08:06
🔗
|
Coderjoe_ |
ugh |
08:06
🔗
|
tef_ |
I tried to avoid this |
08:06
🔗
|
Coderjoe_ |
i should probably get to bed |
08:06
🔗
|
tef_ |
Coderjoe_: yeah I am happy I have done it but not happy I had to do it |
08:07
🔗
|
Coderjoe_ |
1000202043392 Jul 27 15:41 WD_elements_1TB_factory.img |
08:07
🔗
|
Coderjoe_ |
755408 Nov 10 16:45 WD_elements_1TB_factory.img.bz2 |
08:09
🔗
|
Coderjoe |
amazing how well nulls compress |
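The two file sizes above work out to a ratio of more than a million to one. The effect is easy to reproduce on a small scale; this sketch compresses one mebibyte of zero bytes with Python's `bz2` module (the buffer size is arbitrary and chosen only for illustration).

```python
import bz2

# Sizes quoted in the log, in bytes.
image_size = 1000202043392
compressed_size = 755408
print(f"ratio from the log: about {image_size / compressed_size:,.0f} to 1")

# Reproduce the effect on a small buffer of null bytes.
zeros = bytes(1024 * 1024)
packed = bz2.compress(zeros)
print(f"{len(zeros)} zero bytes -> {len(packed)} bytes")
assert bz2.decompress(packed) == zeros
```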
08:11
🔗
|
db48x |
:) |
08:32
🔗
|
yipdw |
oh, wait, it wasn't net/http |
08:32
🔗
|
yipdw |
[ERROR] Error working with http://www.splinder.com/users/list/c/3048: GC overhead limit exceeded |
08:33
🔗
|
alard |
yipdw: Morning, I already have a list of usernames. |
08:33
🔗
|
db48x |
yipdw: interesting |
08:33
🔗
|
yipdw |
oh, good |
08:33
🔗
|
yipdw |
that means I can stop running this, then |
08:33
🔗
|
alard |
Not sure how complete that is, though, since I didn't know about your problem. I'll check in a moment. |
08:33
🔗
|
alard |
I didn't follow 'Next page' links, just downloaded the max number for each letter, then generated urls myself. |
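The approach alard describes (find the highest page number per letter, then build every page URL directly) can be sketched as follows. The URL pattern matches the list URLs quoted in this log; the page counts passed in and the assumption that pages are numbered from 1 are illustrative, not taken from the site.

```python
from typing import Dict, Iterator

BASE_URL = "http://www.splinder.com/users/list"


def list_page_urls(max_pages: Dict[str, int]) -> Iterator[str]:
    """Yield every user-list page URL for each letter, without following links."""
    for letter in sorted(max_pages):
        # Assumes pages run from 1 to the maximum page number, inclusive.
        for page in range(1, max_pages[letter] + 1):
            yield f"{BASE_URL}/{letter}/{page}"


if __name__ == "__main__":
    # Hypothetical page counts, for illustration only.
    for url in list_page_urls({"a": 2, "b": 1}):
        print(url)
```

Generating the URLs up front sidesteps the broken "last page" navigation, at the cost of needing a reliable maximum per letter.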
08:34
🔗
|
yipdw |
alard: ok, I'll keep this going, then |
08:34
🔗
|
alard |
It also doesn't include the us version, which has small differences. |
08:35
🔗
|
alard |
As for a tracker, yes, I'll copy the mobileme one. |
08:39
🔗
|
yipdw |
oh wait, haha |
08:39
🔗
|
yipdw |
I forgot, Mechanize keeps history |
08:39
🔗
|
db48x |
heh |
08:39
🔗
|
yipdw |
that might cause a memory leak |
08:47
🔗
|
yipdw |
hah |
08:47
🔗
|
yipdw |
"[ERROR] Error working with http://www.splinder.com/users/list/a/3096: OpenSSL::SSL requires the jruby-openssl gem" |
08:47
🔗
|
yipdw |
oh, Ruby |
08:47
🔗
|
yipdw |
you so crazy |
08:47
🔗
|
db48x |
lol |
08:52
🔗
|
yipdw |
oh, that OpenSSL::SSL thing was due to an autoload |
09:15
🔗
|
yipdw |
alard: out of curiosity, how many users did you find in the A..D range? |
09:15
🔗
|
yipdw |
I'm at 60,218 and counting |
09:16
🔗
|
alard |
a: 634044 |
09:16
🔗
|
alard |
b: 158687 |
09:16
🔗
|
alard |
c: 265381 |
09:16
🔗
|
alard |
d: 222466 |
09:16
🔗
|
yipdw |
yeesh |
09:17
🔗
|
alard |
Oh, sorry. |
09:18
🔗
|
alard |
a: 72023 |
09:18
🔗
|
alard |
b: 47084 |
09:18
🔗
|
alard |
c: 61775 |
09:18
🔗
|
alard |
d: 48070 |
09:18
🔗
|
yipdw |
hmm |
09:18
🔗
|
alard |
(Forgot to add ^ to the grep.) |
09:18
🔗
|
yipdw |
ah ok |
09:18
🔗
|
alard |
928663 usernames.txt |
09:18
🔗
|
alard |
It didn't match with my wc -l usernames.txt |
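The two sets of counts differ because an unanchored pattern matches the letter anywhere in a username, while `^` restricts the match to the first character. A small sketch of the difference, using made-up usernames and a helper that counts matches in the spirit of a grep line count:

```python
import re

usernames = ["alice", "bab", "carla", "dave"]  # made-up sample data


def count_matching(pattern: str, names: list) -> int:
    """Count the names in which the regular expression matches somewhere."""
    return sum(1 for name in names if re.search(pattern, name))


unanchored = count_matching("a", usernames)   # 'a' anywhere in the name
anchored = count_matching("^a", usernames)    # 'a' as the first character only
print(unanchored, anchored)
```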
09:21
🔗
|
alard |
ls -1 user-list-pages/ | wc -l |
09:21
🔗
|
alard |
38712 |
09:21
🔗
|
alard |
http://dl.dropbox.com/u/365100/splinder-usernames.txt.bz2 |
09:22
🔗
|
yipdw |
ok, neat |
09:22
🔗
|
yipdw |
I'll go ahead and kill this scraper |
09:22
🔗
|
alard |
Maybe check if your scraper has found anything that my scraper hasn't. |
09:22
🔗
|
yipdw |
yeah, doing that now |
09:28
🔗
|
yipdw |
alard: https://gist.github.com/3dc9b41ed9a04984d8cf |
09:28
🔗
|
yipdw |
alard: some of them are definitely false negatives, like the ones with leading spaces |
09:28
🔗
|
yipdw |
but others, dunno |
09:32
🔗
|
yipdw |
alard: here's that list after stripping spaces and re-comparing -> https://gist.github.com/fa243dbc5371130a098b |
09:32
🔗
|
yipdw |
it's an open question whether or not those 220 usernames actually have Splinder data |
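The comparison described here (strip stray whitespace, then see which names one scraper found that the other did not) could look roughly like this. The two short lists are stand-ins for the real username files.

```python
def normalize(names):
    """Strip surrounding whitespace and drop empty entries."""
    return {name.strip() for name in names if name.strip()}


def only_in_first(first, second):
    """Return names present in the first list but missing from the second."""
    return sorted(normalize(first) - normalize(second))


mine = [" alice", "bob", "carol "]   # stand-in for one scraper's output
theirs = ["alice", "bob"]            # stand-in for the other list
print(only_in_first(mine, theirs))
```

Normalizing both sides before the set difference is what removes the leading-space false negatives mentioned above.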
09:34
🔗
|
yipdw |
be back in the morning; gonna crash |
09:35
🔗
|
alard |
Okay, bye. |
13:13
🔗
|
bbot_ |
sent an email to archiveofourown |
13:13
🔗
|
bbot_ |
http://bbot.org/blog/archives/2011/11/11/guerilla_archiving_i/ |
13:13
🔗
|
bbot_ |
let's see if they respond |
13:32
🔗
|
Schbirid |
one hour until i delete the emuwiki torrent material unless someone shouts he wants it |
13:32
🔗
|
Schbirid |
i dont remember who did so weeks ago |
13:32
🔗
|
Cowering |
is that newer than the one on piratebay? |
13:34
🔗
|
Schbirid |
link? |
13:34
🔗
|
Cowering |
lemme find it.. got it at least a year ago |
13:35
🔗
|
Schbirid |
~13GB |
13:36
🔗
|
Cowering |
loads of folders, each with different versions of same emu? |
13:37
🔗
|
Cowering |
i had gotten his torrent like a week after he mailed me saying he was shutting down, since he knew i archived emu stuff.. dunno if anything got added after that |
13:37
🔗
|
Cowering |
there is a wiki in there with a detailed list of addons needed to recreate it |
13:38
🔗
|
Schbirid |
http://www.quaddicted.com:26000/temp.txt |
13:38
🔗
|
Cowering |
yup, looks very familiar :) |
13:40
🔗
|
Cowering |
is the wiki dump itself in that list? |
13:40
🔗
|
Schbirid |
no idea |
13:42
🔗
|
Cowering |
there was a folder with (I think) mediawiki addons in it |
13:43
🔗
|
Schbirid |
no such thing in this |
13:44
🔗
|
Cowering |
ah, you have his files/ folder.. i have this too : |
13:44
🔗
|
Cowering |
06/02/2010 11:43 PM 289,634,843 EmuWiki MediaWiki Database Final.sql |
13:44
🔗
|
Cowering |
1 File(s) 289,634,843 bytes |
13:44
🔗
|
Cowering |
Directory of e:\_emulation\CollectionFinale\Database |
13:45
🔗
|
Cowering |
on TPB it was called 'EMUWiki Collection Torrent' but seems gone now |
13:46
🔗
|
Schbirid |
this one was called v0.2 of that i think |
13:47
🔗
|
Cowering |
i've got : |
13:48
🔗
|
Cowering |
05/07/2010 08:16 AM 467,938 Collection_2.3.rar |
13:48
🔗
|
Cowering |
05/07/2010 08:16 AM 479,473 Collection_2.4.rar |
13:48
🔗
|
Cowering |
but no clue if that is part of wiki (though it is not in files/ folder) |
13:49
🔗
|
Cowering |
i'll have em if anyone cares, but 18 months old is way back in the stored HD pile |
13:54
🔗
|
Cowering |
whoa.. did i say 'way back' lol |
13:54
🔗
|
Cowering |
or is it 'wayback' around here |
13:55
🔗
|
Coderjoe |
i don't know, Mr. Peabody |
14:53
🔗
|
underscor |
Schbirid: Do you know if the archive has it? |
14:54
🔗
|
Schbirid |
no idea, but I rm -r'ed it |
14:54
🔗
|
Schbirid |
torrent is at undergroundgamers |
15:05
🔗
|
alard |
Hey all: Some updates to the Splinder scripts, now with a distributed client. If you can help to do a practice run, please see https://github.com/ArchiveTeam/splinder-grab |
15:06
🔗
|
alard |
(Important: If you have already used the previous versions of the script, please rm -rf data/ first.) |
15:15
🔗
|
underscor |
alard: Running |
15:16
🔗
|
alard |
underscor: Please git pull once more. |
15:16
🔗
|
alard |
Just found a few more errors. |
15:16
🔗
|
alard |
Your downloads should appear here, if everything works ok: http://splinder.heroku.com/ |
15:17
🔗
|
underscor |
Sweet! |
15:21
🔗
|
underscor |
alard: It works! |
15:22
🔗
|
underscor |
- Running wget --mirror (at least 28118 files) |
15:22
🔗
|
underscor |
:( |
15:22
🔗
|
underscor |
They're big files too |
16:18
🔗
|
alard |
underscor: wget --mirror (at least 28118 files), that's mobileme, I hope? |
16:42
🔗
|
underscor |
alard: Yeah |
16:42
🔗
|
underscor |
haha |
17:23
🔗
|
underscor |
alard: Are the splinder scripts in a final enough state that I can run multiple clients without worrying about needing to delete anything? |
17:50
🔗
|
yipdw |
ha, speaking of mobileme |
17:50
🔗
|
yipdw |
- Running wget --mirror (at least 40425 files)... |
17:50
🔗
|
yipdw |
arg |
18:00
🔗
|
yipdw |
hmm |
18:01
🔗
|
yipdw |
alard: the way splinder-grab HEAD uses cut(1) seems to be incompatible with BSD cut; I'm looking into it |
18:01
🔗
|
yipdw |
alard: tracking and downloading seem ok, though |
18:05
🔗
|
yipdw |
oh, ha, and stat(1) is also different |
18:05
🔗
|
yipdw |
ugh |
18:12
🔗
|
yipdw |
actually, you know what -- instead of writing wrappers around stat, cut, du, etc., I think it'll be easier to recommend that OS X/FreeBSD/whatever users just set up symlinks to the GNU utilities and adjust the PATH for the scripts |
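One way to apply that suggestion without touching the scripts is to launch them with a modified PATH. The sketch below only builds such an environment; the directory `/opt/gnu/bin` is a placeholder for wherever the GNU tools (or symlinks to them) live on a given system, and the helper name is hypothetical.

```python
import os
from typing import Dict, Mapping


def env_with_gnu_tools(gnu_dir: str, base: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the environment with gnu_dir first on PATH."""
    env = dict(base)
    old_path = env.get("PATH", "")
    env["PATH"] = gnu_dir + (os.pathsep + old_path if old_path else "")
    return env


if __name__ == "__main__":
    # Placeholder directory; pass the result as env= to subprocess.run().
    print(env_with_gnu_tools("/opt/gnu/bin", os.environ)["PATH"])
```

Putting the GNU directory first means the scripts pick up GNU `cut`, `stat` and `du` ahead of the BSD versions without any wrapper code.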
18:35
🔗
|
Paradoks |
Looking at the graph, I think my splinder instance has been working on girodiboa for 2.5 hours. It seems somewhat stuck on soluzioni.splinder.com . It does seem somewhat large, though. |
18:37
🔗
|
alard |
Paradoks: soluzioni.splinder.com is the splinder help section. |
18:38
🔗
|
yipdw |
ha, oops |
18:38
🔗
|
alard |
Well, soluzioni.splinder.com is listed as a 'group blog' on the girodiboa list: http://www.splinder.com/profile/girodiboa/blogs |
18:39
🔗
|
alard |
Maybe we should separate the blog download from the profile downloads? |
18:39
🔗
|
alard |
Blogs sometimes belong to one user, but sometimes they are shared. |
18:39
🔗
|
Paradoks |
Okay, it seems reasonable that it's taking a while. |
18:40
🔗
|
Paradoks |
So, theoretically, the current way of doing things could make this be downloaded hundreds of times? |
18:42
🔗
|
yipdw |
Paradoks: yeah, I think so |
18:42
🔗
|
alard |
Yes. |
18:42
🔗
|
yipdw |
I'm not sure if that's necessarily a bad thing, though |
18:43
🔗
|
yipdw |
I mean, if you're looking to get archives of each user, it definitely makes it easier to make the user the root, duplication be damned |
18:43
🔗
|
Paradoks |
I suppose. Still, with only 13 days to get stuff, we'd need to be averaging about a profile a second, I think. |
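As a rough check of that estimate, using the 928663 usernames counted earlier in the log and assuming the full 13 days are available around the clock:

```python
usernames = 928_663            # from `wc -l usernames.txt` earlier in the log
seconds = 13 * 24 * 60 * 60    # 13 days, assuming uninterrupted downloading

rate = usernames / seconds
print(f"{rate:.2f} profiles per second")
```

That comes to a little under one profile per second, before counting the us version of the site or any retries.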
18:43
🔗
|
alard |
https://encrypted.google.com/search?ie=UTF-8&q=site%3Awww.splinder.com%2Fprofile%2F*%2Fblogs+soluzioni |
18:44
🔗
|
yipdw |
hm |
18:44
🔗
|
yipdw |
I don't suppose there's a way to figure out (1) what group blog has the largest number of members and (2) which group blog is the largest |
18:44
🔗
|
yipdw |
also, http://soluzioni.splinder.com/post/25747335#comment |
18:47
🔗
|
alard |
http://soluzioni.splinder.com/post/25737683/avviso-per-gli-utenti-ce-da-preoccuparsi |
18:48
🔗
|
alard |
My Italian is not good enough to make sense of that, other than guessing at the topic. |
18:49
🔗
|
yipdw |
neither is Google Translate |
18:50
🔗
|
alard |
No that's funny, you get nonsense when you do that. |
18:50
🔗
|
alard |
Here's the list of people sharing a blog called 'irishmist': https://encrypted.google.com/search?ie=UTF-8&q=site%3Awww.splinder.com%2Fprofile+irishmist |
18:52
🔗
|
alard |
I get the impression that group blogs are owned by someone, and that the others are 'invitati'. |
18:53
🔗
|
yipdw |
I'm asking an Italian native to see if he can translate some of these links |
18:53
🔗
|
yipdw |
one second |
18:53
🔗
|
alard |
There's only one person with the irishmist blog listed among his own blogs: https://encrypted.google.com/search?ie=UTF-8&q=site%3Awww.splinder.com%2Fprofile+irishmist+*+*+invitati |
18:54
🔗
|
alard |
Similarly, the soluzioni blog seems to be owned by the 'Redazione': https://encrypted.google.com/search?ie=UTF-8&q=site%3Awww.splinder.com%2Fprofile+soluzioni+*+invitati |
18:57
🔗
|
yipdw |
alard: also, do the splinder-grab scripts grab things like http://www.splinder.com/mediablog/compagnidiviaggio ? |
18:57
🔗
|
alard |
Yes. |
18:57
🔗
|
yipdw |
ok |
18:57
🔗
|
yipdw |
I think it does make sense to split profile and blog download, then |
18:57
🔗
|
yipdw |
because that cdv mediablog is (1) pretty big and (2) shared |
18:58
🔗
|
alard |
Or limit the blog download to the non-groupblogs. |
18:58
🔗
|
yipdw |
yeah, that too |
18:58
🔗
|
yipdw |
and run group blogs separately |
18:58
🔗
|
alard |
No, maybe we can keep a list of them, but I think it's not even necessary to download those separately. |
18:58
🔗
|
alard |
So far, every group blog is owned by someone. |
18:59
🔗
|
yipdw |
oh, ok |
18:59
🔗
|
alard |
It's listed under 'my blogs' for one user and under 'group blogs' for everyone else. |
19:00
🔗
|
yipdw |
ok, yeah, that sounds fine -- let's just log them for now, and then check if any inconsistencies show up when we near the end |
19:00
🔗
|
yipdw |
er, not so much "inconsistencies" as "dangling references" |
19:01
🔗
|
alard |
Yes, so just download the owned blogs for now? Or is it still useful to split profiles and blogs? |
19:01
🔗
|
alard |
(I think it's easier not to split.) |
19:01
🔗
|
yipdw |
seems like just downloading the owned blogs and logging group blogs is an easier modification |
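The plan settled on here (download only the blogs a user owns, and log group blogs so dangling references can be checked near the end) might be structured like the sketch below. The `Profile` record, its field names and `plan_downloads` are all hypothetical and only illustrate the bookkeeping; the real scripts are shell scripts.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Profile:
    # Hypothetical record of what a profile's blog list page shows.
    username: str
    owned_blogs: List[str] = field(default_factory=list)
    group_blogs: List[str] = field(default_factory=list)


def plan_downloads(profiles: List[Profile]) -> Tuple[List[str], Set[str]]:
    """Return (blogs to download, group blogs never seen as owned)."""
    to_download: List[str] = []
    owned: Set[str] = set()
    referenced: Set[str] = set()
    for profile in profiles:
        for blog in profile.owned_blogs:
            if blog not in owned:          # each owned blog is fetched once
                owned.add(blog)
                to_download.append(blog)
        referenced.update(profile.group_blogs)  # only logged, not fetched
    # Group blogs with no owner among the profiles are dangling references.
    return to_download, referenced - owned
```

If every group blog really is owned by someone, the second return value ends up empty once all profiles have been processed.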
19:01
🔗
|
alard |
Okay. I'll have a look at that. I'll be back later. |
19:01
🔗
|
yipdw |
np |
19:02
🔗
|
yipdw |
if any straggler blogs are left, we can throw underscor and his massive pipe at it, or something |
19:11
🔗
|
yipdw |
well, preliminary gist of http://soluzioni.splinder.com/post/25737683/avviso-per-gli-utenti-ce-da-preoccuparsi |
19:11
🔗
|
yipdw |
from an Italian guy who goes by the handle mirkosp in a different channel |
19:11
🔗
|
yipdw |
13:10:07 <mirkosp> splinderpro isn't available anymore since june |
19:11
🔗
|
yipdw |
13:10:08 <mirkosp> and |
19:11
🔗
|
yipdw |
13:10:25 <mirkosp> splinder was sold by dada to populis |
19:11
🔗
|
yipdw |
13:10:40 <mirkosp> splinder isn't closing if that was your concern |
19:16
🔗
|
underscor |
Yay, first place again on splinder |
19:17
🔗
|
Paradoks |
...so what exactly _is_ being shut down? |
19:19
🔗
|
yipdw |
I don't know |
19:53
🔗
|
alard |
http://perennementesloggata.wordpress.com/2011/11/11/splinder-chiude-redazione-assente-blogger-in-panico/ |
19:55
🔗
|
SketchCow |
Thanks for the SciAm, coderjoe |
19:55
🔗
|
SketchCow |
And thanks, alard, I just put up the AOL LISTSERV site |
20:00
🔗
|
underscor |
http://imgur.com/gallery/F0ld5 |
20:01
🔗
|
alard |
Your own work? |
20:02
🔗
|
underscor |
Not that particular version |
20:03
🔗
|
underscor |
But there might be one at my school |
20:03
🔗
|
alard |
Heh. |
20:03
🔗
|
underscor |
:> |
20:03
🔗
|
underscor |
Dammit, fun dip is like fruity cocaine |
20:03
🔗
|
underscor |
I CAN'T STOP |
20:05
🔗
|
yipdw |
a good way to stop is to imagine the diabetes-induced pain that will follow |
20:06
🔗
|
underscor |
haha |
20:06
🔗
|
underscor |
Hey, at least SketchCow got me to stop drinking faygo |
20:17
🔗
|
Paradoks |
underscor: By the way, you're the idling king, as is evidenced by being the only remaining op in #archiveteam-idle. |
20:17
🔗
|
underscor |
Good catch |
20:17
🔗
|
underscor |
;P |
20:18
🔗
|
alard |
underscor, Paradoks, yipdw: Could you stop your splinder scripts? Time for the real work in a moment. |
20:18
🔗
|
underscor |
alard: Rough stop okay? |
20:18
🔗
|
yipdw |
ok |
20:18
🔗
|
alard |
Yes. |
20:18
🔗
|
Paradoks |
alard: I already touched STOP, but it's still going on soluzioni. CTRL-C time? |
20:18
🔗
|
alard |
Yes, and rm -rf, if everything seems ok. |
20:18
🔗
|
Paradoks |
Okay. Will do. |
20:18
🔗
|
underscor |
alard: Done |
20:19
🔗
|
alard |
I'll post a new version of the script in a moment, which also includes www.us.splinder.com, works better with blogs etc. |
20:19
🔗
|
underscor |
\o/ |
20:19
🔗
|
yipdw |
rm -rf'd |
20:19
🔗
|
alard |
Thanks. |
20:23
🔗
|
SketchCow |
STOP DRINKING FAYGO |
20:23
🔗
|
SketchCow |
JESUS CHRIST |
20:23
🔗
|
SketchCow |
NEVER EVEN REMEMBER YOU ONCE DID |
20:24
🔗
|
Paradoks |
Does this mean we can blank out ever having listened to the Insane Clown Posse song? |
20:24
🔗
|
SketchCow |
YES |
20:24
🔗
|
SketchCow |
ICP: DELETE THAT SHIT - ArchiveTeam |
20:24
🔗
|
Paradoks |
Heheh. |
20:29
🔗
|
chronomex |
why hello there |
20:33
🔗
|
alard |
underscor, Paradoks, yipdw: Splinder is ready to go. |
20:33
🔗
|
yipdw |
cool |
20:33
🔗
|
underscor |
SketchCow: IT'S SO FUCKING DELICIOUS |
20:34
🔗
|
underscor |
SketchCow: NOT AS TASTY AS GINGER BEER AND NESQUIK |
20:34
🔗
|
underscor |
:F |
20:34
🔗
|
underscor |
:D* |
20:34
🔗
|
yipdw |
off we go! |
20:34
🔗
|
underscor |
Although, :F was more like the face I made with the ginger beer and that bigass burrity |
20:34
🔗
|
underscor |
burrito* |
20:34
🔗
|
underscor |
God damn it |
20:35
🔗
|
chronomex |
burrity |
20:35
🔗
|
underscor |
EATIN' BURRITIES WITH SketchCow |
20:35
🔗
|
yipdw |
ON THE SCOREBOARD |
20:36
🔗
|
underscor |
ON THE SCOREBOARD |
20:36
🔗
|
underscor |
alard: Seems faster now |
20:37
🔗
|
underscor |
http://imgur.com/gallery/1uQG6 |
20:38
🔗
|
chronomex |
lol |
20:38
🔗
|
underscor |
MUST |
20:38
🔗
|
underscor |
BEAT |
20:38
🔗
|
underscor |
alard |
20:39
🔗
|
chronomex |
shit I fell off the bottom of the graph |
20:39
🔗
|
underscor |
lol |
20:39
🔗
|
chronomex |
alard: I demand you add me back onto the graph, by upping the number of lines drawn. |
20:40
🔗
|
chronomex |
if you're gonna show 10 people on the page, show us on the graph damnit |
20:40
🔗
|
chronomex |
hm. maybe my shit stopped overnight. |
20:41
🔗
|
underscor |
lol |
20:41
🔗
|
underscor |
Yes! Back in the lead |
20:42
🔗
|
yipdw |
I want to see what percentage of Splinder is grabbed by underscor |
20:42
🔗
|
underscor |
:D |
20:42
🔗
|
underscor |
alard: Need fractional MB estimation, hehee |
20:43
🔗
|
underscor |
alard: What do you use for the backend? Redis? |
20:43
🔗
|
underscor |
yipdw's quickly on the rise for mobileme |
20:43
🔗
|
yipdw |
whoa |
20:43
🔗
|
yipdw |
when did that happen |
20:43
🔗
|
yipdw |
I'm not actually watching those processes |
20:44
🔗
|
underscor |
Looks like everyone stopped |
20:44
🔗
|
underscor |
hahah |
20:44
🔗
|
underscor |
We have a bit more time for mobileme, right? |
20:44
🔗
|
underscor |
December something? |
20:44
🔗
|
yipdw |
June 2012 or something |
20:45
🔗
|
yipdw |
splinder's really slow for me for some reason |
20:45
🔗
|
yipdw |
oh, because I'm going through Cogent |
20:46
🔗
|
underscor |
Oh yeah, june 30th 2012 |
20:46
🔗
|
underscor |
Plenty of time! |
20:46
🔗
|
yipdw |
and everything that passes through Cogent gets the suck bit set |
20:46
🔗
|
underscor |
Although 200TB will still take a bit |
20:46
🔗
|
underscor |
of time |
20:46
🔗
|
underscor |
suck bit? |
20:46
🔗
|
underscor |
haha |
20:46
🔗
|
yipdw |
I'm not sure how to interpret traceroute output like this |
20:46
🔗
|
yipdw |
6 te0-3-0-1.ccr22.bos01.atlas.cogentco.com (154.54.24.57) 91.882 ms |
20:46
🔗
|
yipdw |
te0-0-0-2.ccr22.bos01.atlas.cogentco.com (154.54.43.202) 92.123 ms |
20:46
🔗
|
yipdw |
te0-5-0-6.ccr22.bos01.atlas.cogentco.com (154.54.45.242) 91.795 ms |
20:47
🔗
|
underscor |
haha |
20:47
🔗
|
yipdw |
what the hell does that mean, each packet came back from a different router? |
20:47
🔗
|
underscor |
Each time it goes through a different router |
20:47
🔗
|
underscor |
Yep |
20:47
🔗
|
yipdw |
jeez |
20:47
🔗
|
underscor |
mtr>traceroute |
20:47
🔗
|
chronomex |
http://www.internet2.edu/lsr/ |
20:47
🔗
|
Paradoks |
I'm trying to limit my bandwidth usage for a week or two. Thankfully, splinder is not bandwidth intensive unless you run it like underscor. |
20:47
🔗
|
chronomex |
we need to get on this bandwagon |
20:48
🔗
|
underscor |
chronomex: haha |
20:48
🔗
|
underscor |
Paradoks: 32 clients atm |
20:48
🔗
|
underscor |
:D:D:D:D:D:D:D:D |
20:48
🔗
|
Paradoks |
Heheh. |
20:49
🔗
|
yipdw |
huh |
20:50
🔗
|
underscor |
alard: how hard would it be to add a user graph too |
20:50
🔗
|
underscor |
user count* |
20:50
🔗
|
yipdw |
if Cogent labels their router by airport codes, I'm going through Boston -> Liverpool -> Amsterdam -> Frankfurt -> Paris -> Italy |
20:50
🔗
|
underscor |
Dang, that's a crappy route |
20:50
🔗
|
yipdw |
my packets paid for the European tour |
20:51
🔗
|
yipdw |
and since Italy is not a city |
20:51
🔗
|
underscor |
I'm going ISC->New York->Paris->Milan->Italy |
20:51
🔗
|
yipdw |
replace Italy with Milan |
20:52
🔗
|
underscor |
Well, I guess the last 2 are redundant |
20:52
🔗
|
underscor |
It goes Milan->dadah, whatever that is |
20:52
🔗
|
underscor |
Funny, splinder.com doesn't reply to pings |
20:53
🔗
|
yipdw |
maybe they're of the mindset that pings are evil |
20:53
🔗
|
yipdw |
for some reason |
20:53
🔗
|
underscor |
Pulling 6Mbps from splinder |
20:53
🔗
|
underscor |
Aha, dada.it (dadah) is the server company |
20:53
🔗
|
yipdw |
former owner of Splinder, too |
20:54
🔗
|
underscor |
Funny it's still hosted on their servers |
20:54
🔗
|
underscor |
(or at least they haven't updated the rdns) |
20:54
🔗
|
underscor |
http://tracker.archive.org/tracker.png ha, guess where I stopped mobileme? |
20:55
🔗
|
chronomex |
bout 90 minutes ago? |
20:55
🔗
|
chronomex |
no, 0400 |
20:55
🔗
|
underscor |
Yep |
20:55
🔗
|
underscor |
hahah |
20:55
🔗
|
underscor |
90 minutes ago was when a large download finished |
20:55
🔗
|
underscor |
and then it was ingested into the archive |
20:56
🔗
|
yipdw |
oh |
20:56
🔗
|
yipdw |
so I just tried from this VPS |
20:56
🔗
|
yipdw |
which is in some Sunnyvale, CA datacenter |
20:56
🔗
|
yipdw |
maybe HE |
20:56
🔗
|
underscor |
http://tracker.archive.org/batcave.png |
20:56
🔗
|
yipdw |
the route is way better |
20:56
🔗
|
underscor |
SketchCow is losing the data throughput war |
20:56
🔗
|
yipdw |
kinda figures, I guess |
20:56
🔗
|
underscor |
Yeah |
20:57
🔗
|
underscor |
Archive is back to 116TB free, \o/ |
20:57
🔗
|
yipdw |
I have a hard time comprehending the magnitude of that number |
20:57
🔗
|
yipdw |
so instead, I'm going to get lunch |
20:57
🔗
|
yipdw |
brb |
20:57
🔗
|
underscor |
lol |
20:58
🔗
|
underscor |
http://blogs.msdn.com/b/oldnewthing/archive/2011/11/11/10235970.aspx |
21:07
🔗
|
underscor |
Nearly 1000 done |
21:07
🔗
|
underscor |
Then there's only 927k more to go! |
21:08
🔗
|
dnova |
116TB free? |
21:08
🔗
|
dnova |
I'll take it! |
21:08
🔗
|
underscor |
haha |
21:09
🔗
|
underscor |
Everyone's a little worried at the disk price hike |
21:09
🔗
|
dnova |
yes it's a big problem :| |
21:10
🔗
|
dnova |
500+ people have died also |
21:12
🔗
|
dnova |
glad I am not in need of any more storage right now, but I sell/install surveillance systems on the side and I have to absorb some of the extra storage cost |
21:12
🔗
|
underscor |
:( |
21:13
🔗
|
dnova |
best buy has a 4tb usb drive for $200 right now |
21:14
🔗
|
dnova |
probably won't last long at that price |
21:15
🔗
|
bsmith093 |
i cant wait until i can just hand out 8tb usb sticks like floppies used to be spammed everywhere, here have every song from the last 120 years, on this thing the size of a piece of gum |
21:15
🔗
|
SketchCow |
I WILL WIN THE WAR |
21:15
🔗
|
SketchCow |
enjoy your battle win, tyke |
21:16
🔗
|
underscor |
SketchCow: YOU SHALL LOSE FOREVER |
21:16
🔗
|
underscor |
REMEMBER, I HAVE MORE LIFE LEFT THAN YOU |
21:16
🔗
|
chronomex |
not for long! |
21:16
🔗
|
yipdw |
not if you keep drinking Faygo and eating Fun Dip |
21:16
🔗
|
* |
chronomex brandishes a pike |
21:17
🔗
|
SketchCow |
Seriously |
21:17
🔗
|
SketchCow |
Right into the ground |
21:17
🔗
|
underscor |
<Nocturnophil3> _habnabit: I encode all my music in lossless, but it gets me a lot of flac D: |
21:17
🔗
|
dnova |
heh |
21:17
🔗
|
underscor |
I stopped the faygo! |
21:18
🔗
|
underscor |
Except on the first friday of the month |
21:18
🔗
|
underscor |
I have a diet faygo root beer |
21:18
🔗
|
chronomex |
you drink faygo at 2600? |
21:18
🔗
|
underscor |
?? |
21:18
🔗
|
underscor |
Oh, no |
21:18
🔗
|
underscor |
I don't attend 2600 meetings |
21:18
🔗
|
chronomex |
ok |
21:19
🔗
|
underscor |
8 hours left on this rsync! |
21:19
🔗
|
underscor |
God, reading from a nilfs2 partition SUCKS |
21:19
🔗
|
underscor |
Write performance is great though! |
21:21
🔗
|
chronomex |
that's what it's for, man |
21:21
🔗
|
chronomex |
wait what |
21:22
🔗
|
chronomex |
http://www.nilfs.org/en/ |
21:22
🔗
|
chronomex |
nilfs comes from NTT?!? |
21:22
🔗
|
underscor |
Is that bad? |
21:22
🔗
|
yipdw |
all the best stuff is made in Japan |
21:22
🔗
|
dnova |
glorious nippon |
21:23
🔗
|
chronomex |
all the best stuff comes from telcos |
21:23
🔗
|
chronomex |
NILFS, Erlang, X.25, ??? |
21:23
🔗
|
yipdw |
UNIX |
21:23
🔗
|
yipdw |
well, sort of |
21:23
🔗
|
chronomex |
^ |
21:23
🔗
|
chronomex |
transistors |
21:24
🔗
|
yipdw |
telcos did not, however, invent kosher salt |
21:24
🔗
|
chronomex |
bacon salt, however .. |
21:25
🔗
|
underscor |
... |
21:25
🔗
|
underscor |
hahaha |
21:58
🔗
|
underscor |
2000 users! |
21:58
🔗
|
underscor |
At this rate we'll be done in no time |
22:36
🔗
|
alard |
Hello, back for a moment. The splinder count changes much quicker than the mobileme count, I see. |
23:45
🔗
|
underscor |
alard: Well, they're a lot smaller |
23:45
🔗
|
underscor |
:) |
23:46
🔗
|
alard |
underscor: Yes, maybe that has something to do with it. |
23:46
🔗
|
alard |
A 'users done' graph for those who reload, by the way. |
23:47
🔗
|
DFJustin |
so any plans to archive uudisc.com |
23:47
🔗
|
underscor |
alard: Yay! |
23:49
🔗
|
underscor |
it:smackmybitch |
23:49
🔗
|
underscor |
hahahaha |
23:50
🔗
|
underscor |
DFJustin: Ah chinese |
23:50
🔗
|
underscor |
DFJustin: Can you read it? |
23:53
🔗
|
underscor |
http://www.uudisc.com/user/Qiki0937/file/search?q= |
23:53
🔗
|
underscor |
Should be easy to do |
23:53
🔗
|
underscor |
Scrape usernames from google, then get all the files |
23:54
🔗
|
underscor |
Office parties, trendy owners, blog authors, professionals. |
23:54
🔗
|
underscor |
uushare.com user base: |
23:54
🔗
|
underscor |
hahaha |
23:57
🔗
|
underscor |
We can find more users using the "neighbors" feature too |
23:57
🔗
|
underscor |
http://www.uudisc.com/user/qiki0937/friend/namelist |