Time |
Nickname |
Message |
00:16
🔗
|
wp494 |
ooooooh shiiiit |
00:16
🔗
|
wp494 |
http://www.theverge.com/2013/5/17/4342012/yahoo-reportedly-nearing-1-1-billion-deal-to-acquire-tumblr |
00:17
🔗
|
wp494 |
is it already on fire drill? |
00:26
🔗
|
DFJustin |
we've already done proof-of-concept grabs in the past https://archive.org/details/archiveteam-tumblr-test-warc |
00:26
🔗
|
DFJustin |
we straight up don't have enough storage to hold tumblr though |
02:21
🔗
|
cmx |
so much porn |
02:27
🔗
|
godane |
i'm finally finding more call for help episodes |
02:28
🔗
|
godane |
in dialup format of course |
05:39
🔗
|
SketchCow |
Hey. |
05:46
🔗
|
SketchCow |
Everyone wants us to back up tumblr. |
05:46
🔗
|
SketchCow |
I may make some noise for attention, but we have lots of time. |
05:46
🔗
|
SketchCow |
I wonder how big it is. |
06:09
🔗
|
BlueMax |
Can we even get a rough estimate? |
06:10
🔗
|
omf_ |
Of course. We just sample the whole |
06:10
🔗
|
omf_ |
Pick a few hundred random blogs and download them |
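A minimal sketch of the sampling estimate omf_ describes, assuming the per-blog download sizes have already been measured. The function name `estimate_total_tb`, the sample values, and the blog count of 100 million are illustrative assumptions, not anything that was run in the channel.

```python
import math
import statistics

MB_PER_TB = 1_000_000  # decimal units: 1 TB = 10^6 MB


def estimate_total_tb(sample_sizes_mb, total_blogs):
    """Extrapolate a site-wide size from a random sample of per-blog sizes.

    Returns (estimate_tb, standard_error_tb).
    """
    mean_mb = statistics.mean(sample_sizes_mb)
    # Standard error of the mean; needs at least two samples.
    sem_mb = statistics.stdev(sample_sizes_mb) / math.sqrt(len(sample_sizes_mb))
    scale = total_blogs / MB_PER_TB
    return mean_mb * scale, sem_mb * scale


# Hypothetical sample: sizes in MB of five randomly chosen blogs.
sample = [1.0, 3.0, 2.0, 400.0, 4.0]
estimate, error = estimate_total_tb(sample, total_blogs=100_000_000)
print(f"~{estimate:,.0f} TB, give or take {error:,.0f} TB")
```

One large blog dominates this made-up sample, which is why the standard error is reported next to the estimate: with sizes this skewed, a few hundred blogs may still leave a wide margin.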
09:51
🔗
|
tryphon |
Is it a good idea to run multiple VMs of the Warrior? |
09:51
🔗
|
tryphon |
(on the same computer/IP I mean) |
09:56
🔗
|
Smiley |
you can, |
09:56
🔗
|
Smiley |
not a problem at all, just needs lots of resources. |
09:56
🔗
|
Smiley |
tryphon: you on windows or linux? |
09:56
🔗
|
tryphon |
os x |
09:56
🔗
|
Smiley |
ah :D |
09:56
🔗
|
tryphon |
10.7.4 |
09:57
🔗
|
Smiley |
then yeah multiple warriors is your easiest way if you want to do that. |
09:57
🔗
|
tryphon |
But I also have a synology nas (powerpc though) |
09:57
🔗
|
Smiley |
well you can try and run the scripts directly |
09:57
🔗
|
Smiley |
instructions are on our wiki. |
09:58
🔗
|
tryphon |
will have a look ;) thx |
10:02
🔗
|
tryphon |
hmm, are two VMs that target the same port 8001 ok? |
10:06
🔗
|
Tomcat_ |
tryphon: Don't think so. |
10:09
🔗
|
Tomcat_ |
Either you won't be able to access one warrior's web interface, or one might not work at all. |
10:09
🔗
|
Tomcat_ |
Not sure how it behaves. |
10:10
🔗
|
tryphon |
It seems that the second VM doesn't have any network activity at all. |
10:10
🔗
|
tryphon |
And we cannot (easily) change the port, right. |
10:10
🔗
|
tryphon |
-. +? |
10:14
🔗
|
GLaDOS |
I believe virtualbox has an option to route the port to a different port somewhere |
10:28
🔗
|
HeaD |
or you change the port in /home/warrior/warrior-code2/warrior-runner.sh |
10:28
🔗
|
tryphon |
@GLaDOS found it :) 1/ Go to VM prefs > network > adapter 1 - http://imgur.com/NPsa0He |
10:29
🔗
|
tryphon |
then go to "port forwarding" and change the "host port" - http://imgur.com/NPsa0He |
10:29
🔗
|
tryphon |
oops, first picture would have been http://imgur.com/r8jrs7d sorry |
11:07
🔗
|
Smiley |
yeah you change it in port forwarding |
11:07
🔗
|
Smiley |
don't change the warrior code, i think it'll revert at next update. |
14:45
🔗
|
lart |
:( uploading a 4GB item at 50kB/s |
17:03
🔗
|
Smiley |
D: |
18:06
🔗
|
SketchCow |
I successfully torrented the TOSEC-PIX |
18:09
🔗
|
godane |
cool |
18:14
🔗
|
SketchCow |
Yeah |
18:15
🔗
|
SketchCow |
So, I'm going to make the decision to unpack it and install it. |
19:01
🔗
|
antomatic |
Tumblr is interesting but enormous. If you assume a similar average size of material per user as Posterous (not necessarily safe) - say 2 MB per user - then it's immediately 200 TB before you even start. |
19:02
🔗
|
antomatic |
Could easily be several times that, or worse. |
19:04
🔗
|
antomatic |
/me steps out to buy some extra USB sticks, just in case |
19:04
🔗
|
antomatic |
:) |
19:08
🔗
|
antomatic |
That grab from 2010 averages 72 MB per user - so 7200 TB |
19:08
🔗
|
godane |
i grabbed the g4tv tumblr |
19:08
🔗
|
godane |
that was over 400 MB |
19:09
🔗
|
godane |
if i remember right |
19:12
🔗
|
antomatic |
Mm, there are some enormous posterous blogs too, but the raw average is so low purely because there's so many users whose entries are tiny, or text-only, or spam. |
19:13
🔗
|
antomatic |
Assume Moore's Law and say that users' Tumblrs double in size every 18 months (bigger/better pictures, more content, video, etc.) |
19:13
🔗
|
antomatic |
That's 288 MB per user, or 28 petabytes. |
19:13
🔗
|
antomatic |
14 times the size of the entire Wayback machine. |
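The back-of-envelope figures above can be reproduced with a short sketch. The user count of 100 million is an assumption implied by "2mb per user" giving 200TB, and two doublings since 2010 is an assumption based on roughly 36 months at one doubling per 18 months; the names are illustrative only.

```python
MB_PER_TB = 1_000_000  # decimal units: 1 TB = 10^6 MB
TB_PER_PB = 1_000
USERS = 100_000_000    # assumption implied by 2 MB/user -> 200 TB


def total_tb(mb_per_user, users=USERS):
    """Total storage in TB for a given average size per user."""
    return mb_per_user * users / MB_PER_TB


posterous_like_tb = total_tb(2)    # Posterous-style average
grab_2010_tb = total_tb(72)        # average from the 2010 test grab

doublings = 2                      # assumption: ~36 months / 18 months
projected_mb = 72 * 2 ** doublings
projected_pb = total_tb(projected_mb) / TB_PER_PB

print(f"Posterous-like: {posterous_like_tb:,.0f} TB")
print(f"2010 grab rate: {grab_2010_tb:,.0f} TB")
print(f"Projected: {projected_mb} MB per user, {projected_pb:.1f} PB")
```

Every result scales linearly with the assumed user count and per-user average, so a skewed average moves the total by the same factor.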
19:14
🔗
|
antomatic |
Wait, this can't be right.. |
19:14
🔗
|
omf_ |
you just realized that |
19:14
🔗
|
SketchCow |
ha ha |
19:14
🔗
|
godane |
most video is likely links to youtube |
19:15
🔗
|
antomatic |
If you do decide to archive tumblr, I think I'll be washing my hair that day. But good luck with that. :) |
19:16
🔗
|
godane |
the most likely tumblr accounts to archive, if it's too big, are ones that are linked to a lot of wikis |
19:18
🔗
|
antomatic |
"All tumblrs are equal, but some are more celebrilicious than others." :) |
19:18
🔗
|
godane |
cause they most likely have a lot of info that is good |
19:18
🔗
|
godane |
agree, but at 28 PB it's like backing up facebook or youtube |
19:18
🔗
|
godane |
it's just not going to be a lot of it |
19:18
🔗
|
DFJustin |
antomatic: not impossible, megaupload had 28 PB when it went down |
19:18
🔗
|
DFJustin |
facebook has over 100 PB of storage now |
19:19
🔗
|
antomatic |
Coo.. |
19:19
🔗
|
antomatic |
Even keeping up with the new content (about 70 million new posts each day) would be a stretch. |
19:19
🔗
|
antomatic |
It does sound like fun, though.. :) |
19:20
🔗
|
godane |
it'll almost have to go like geocities did |
19:21
🔗
|
godane |
be mostly dead in 2 years and stay up for the next 11 years after that |
19:25
🔗
|
godane |
i uploaded some more tekzilla episodes |
19:29
🔗
|
godane |
SketchCow: I got an interview with Richard Garriott |
19:32
🔗
|
SketchCow |
Great |
20:03
🔗
|
omf_ |
Tumblr has 107.8 million blogs and 50.6 billion posts |
20:07
🔗
|
godane |
interview with Michael Limbar from Angel Studios |
20:07
🔗
|
godane |
*Limber |
20:27
🔗
|
godane |
i found a 2 part interview with Yu Suzuki |
20:28
🔗
|
godane |
and an interview with Kevin Eastman |
23:55
🔗
|
link343 |
WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD |
23:58
🔗
|
DFJustin |
'yahoosucks' |
23:58
🔗
|
link343 |
thank you fair sir |