#archiveteam 2012-05-17,Thu

Time Nickname Message
02:10 πŸ”— kim__ Not on freenode?
02:10 πŸ”— * kim__ scratches head
02:11 πŸ”— kim__ I spotted folks wanting to archive knol, call for help on wikimedia-l
02:11 πŸ”— kim__ Anyone in atm, or should I come back during .eu or .us daytime?
02:11 πŸ”— dnova there's always people around
02:14 πŸ”— kim__ hmm, knol seems to already be down
02:14 πŸ”— kim__ what can still be done to help?
02:15 πŸ”— kim__ The call for help was posted on April 30
02:16 πŸ”— kim__ Anyone I can talk with or so?
02:16 πŸ”— dnova if I understand correctly, only owners of the content can download the content
02:16 πŸ”— dnova until oct 1
02:17 πŸ”— kim__ hmm, http://web.archive.org/web/20110722190349/http://knol.google.com/k
02:20 πŸ”— kim__ apparently google let IA crawl knol aok. What's the value-add of #archiveteam. Are you with IA, or are you separate?
02:20 πŸ”— kim__ And even if I can't help with this, I can always run a torrent box or etc if that's any use?
02:20 πŸ”— kim__ (or run wget or do ftp operations, and I don't mind poking relevant site operators first ;-)
02:20 πŸ”— dnova we are separate from IA
02:21 πŸ”— * kim__ listens
02:21 πŸ”— dnova www.archiveteam.org
02:21 πŸ”— kim__ I'm already reading there, hence the question :-)
02:21 πŸ”— dnova a lot of or most of what we grab ends up at IA
02:22 πŸ”— kim__ fair enough
02:22 πŸ”— kim__ also, define "a lot" of online storage?
02:23 πŸ”— dnova in what context
02:23 πŸ”— kim__ in any context that you might find useful
02:23 πŸ”— * kim__ isn't sure. just reading http://archiveteam.org/index.php?title=Who_We_Are
02:24 πŸ”— kim__ "People with Lots of Hosted Disk Space"
02:24 πŸ”— dnova oh
02:24 πŸ”— dnova not sure how great a need there is for that right this moment
02:24 πŸ”— kim__ if you have a NAS sitting out someplace at a hosting provider, that'll work fine
02:25 πŸ”— dnova if you'd like to mirror mobile.me then you'll need around 250-300 terabytes usable space.
02:25 πŸ”— dnova (for ridiculous example)
02:27 πŸ”— kim__ That's about Eur 14400 at current HDD prices
02:27 πŸ”— * kim__ scratches head. I'm not quite that rich ;-)
02:28 πŸ”— kim__ by a bit of a margin :-P
02:28 πŸ”— dnova yeah.
02:28 πŸ”— dnova Generally what happens is a few dozen of us each grab however many gb/tb we can and eventually upload it to IA for digestion
02:29 πŸ”— kim__ Okay, but IA does their own crawls too. What's the advantage to doing it this way? (is there a webpage about that, so I can RTFM and stop asking silly questions? ;)
02:30 πŸ”— dnova when we spring into action we're making a comprehensive archive of an entire site
02:30 πŸ”— kim__ http://archiveteam.org/index.php?title=Frequently_Asked_Questions <- short
02:30 πŸ”— kim__ How long does AT exist?
02:30 πŸ”— kim__ There's 99 lurkers on IRC right now, so I figure it's been a while :-)
02:31 πŸ”— dnova 3ish years
02:31 πŸ”— kim__ IA has 2 locations, 1 mirror at the new library of alexandria
02:31 πŸ”— * kim__ says, checking http://archiveteam.org/index.php?title=Fire_Drill
02:32 πŸ”— kim__ oh wait, listed. Also stating it's b0rked. That Can't Be Good (tm)
02:33 πŸ”— kim__ Ok, one thing I could propose a joint project with AT on is the recovery of WP dumps.
02:34 πŸ”— kim__ preferably including the photos. (they should be available via commons.wikimedia.org)
02:34 πŸ”— dnova hmm
02:34 πŸ”— dnova I'm not sure if wikipedia falls under wikiteam's purview (there is a sub-group here who archives any wikimedia wiki they come across)
02:35 πŸ”— kim__ interesting. I wonder where those archives go
02:35 πŸ”— * kim__ <- wikimedia-ish vonlunteer
02:35 πŸ”— kim__ volunteer too.
02:36 πŸ”— dnova http://archiveteam.org/index.php?title=Wikiteam
02:36 πŸ”— kim__ I could have found that one ^^;;
02:36 πŸ”— kim__ But there is no image dump available, only the image descriptions
02:36 πŸ”— kim__ Okay
02:36 πŸ”— dnova yeah that's too bad
02:36 πŸ”— kim__ Is anyone from wikiteam online?
02:37 πŸ”— dnova the wikimedia foundation is a bit... ehh, not the best with these things.
02:37 πŸ”— kim__ geh, please don't tell me that
02:37 πŸ”— kim__ fortunately, there's a bit of a solution
02:37 πŸ”— kim__ I'd ask on freenode #wikimedia-tech , possibly
02:38 πŸ”— dnova I've never seen it this quiet in here when there is already some talking going on
02:38 πŸ”— kim__ WP is not the oldest wiki in the world btw. It's a young whipper-snapper
02:38 πŸ”— dnova people with their lives and what-not
02:38 πŸ”— kim__ it's 4:30 AM local
02:38 πŸ”— dnova it's evening in the US
02:38 πŸ”— kim__ on a "sunday"
02:38 πŸ”— dnova I'm pretty sure it's thursday where you are
02:39 πŸ”— dnova where are you by the way? I'm moving to your time zone soon.
02:39 πŸ”— kim__ Ascension day today
02:39 πŸ”— kim__ and lots of people are taking a long weekend besides
02:40 πŸ”— kim__ (me three!)
02:40 πŸ”— kim__ I'm in .nl
02:40 πŸ”— dnova ah
02:40 πŸ”— * dnova moving to .at
02:40 πŸ”— kim__ That'
02:40 πŸ”— kim__ s a VERY pretty country
02:41 πŸ”— kim__ where are you moving from? :-)
02:41 πŸ”— DFJustin jason scott is archiving wikimedia images as we speak http://archive.org/details/2012-04-30-wikimedia-images-snapshot
02:42 πŸ”— kim__ we have just got to be able to make that easier
02:43 πŸ”— dnova the us
02:43 πŸ”— kim__ dnova, Interesting, what part of the US? Is your part very different or very similar?
02:43 πŸ”— kim__ ;-)
02:43 πŸ”— dnova Buffalo, NY
02:43 πŸ”— kim__ The buffalo where buffalo buffalo buffalo buffalo buffalo
02:44 πŸ”— kim__ ?
02:44 πŸ”— dnova s/where// and some of those need capitalization
02:44 πŸ”— dnova if you'd like it to be grammatically correct :)
02:44 πŸ”— kim__ http://en.wikipedia.org/wiki/Buffalo_buffalo
02:44 πŸ”— * kim__ sniggers
02:45 πŸ”— chronomex kim__: the "value-add" of archiveteam is we have multiple people focusing on a single site
02:45 πŸ”— chronomex IA has like 5 people for the whole internet
02:45 πŸ”— chronomex we do deeply focused crawls
02:46 πŸ”— kim__ fair enough
02:46 πŸ”— dnova and sometimes we piss people right the hell off.
02:46 πŸ”— kim__ dnova, how so?
02:46 πŸ”— chronomex but I resent the term "value-add"
02:47 πŸ”— kim__ the famous (C) brigade? ;-)
02:47 πŸ”— dnova it turns out that people don't want you to download stuff that they put on the internet
02:47 πŸ”— * dnova shrugs
02:47 πŸ”— chronomex weird
02:47 πŸ”— kim__ chronomex, fair enough.
02:47 πŸ”— chronomex besides
02:47 πŸ”— chronomex overlap is better than gappiness
02:47 πŸ”— dnova yes
02:47 πŸ”— kim__ sounds like you folks know what you're doing.
02:47 πŸ”— chronomex you should know that wbm is gappy, if youve ever used it
02:48 πŸ”— chronomex e.g. it skips non-small files
02:48 πŸ”— kim__ wbm stands for? Oh wayback machine. And if you mean gappy in time, or gappy in ...
02:48 πŸ”— kim__ oh. That Can't Be Good.
02:48 πŸ”— chronomex gappy in all ways
02:48 πŸ”— chronomex time, breadth, depth
02:48 πŸ”— kim__ well shoot
02:48 πŸ”— DFJustin another value-add is that people write scrapers that can navigate javascripty or login-based sites like friendster or google video
02:49 πŸ”— chronomex yes
02:49 πŸ”— DFJustin IA is more set up for straightforward html sites
02:49 πŸ”— kim__ okay, well, I'd like for wmf sites to be archived properly. One can get mediawiki and database dumps
02:50 πŸ”— kim__ I'm not entirely sure how to transfer over commons though. if you want I can ask. Or do you already have contacts with wmf/volunteers/techs?
02:51 πŸ”— kim__ (in which case, I'd just be in the way)
02:51 πŸ”— dnova we can always use more warm bodies
02:51 πŸ”— dnova and in some cases warm optional
02:51 πŸ”— kim__ Grrr, Argh?
02:53 πŸ”— kim__ hmm, emijrp (might use a different nick at AT) is apparently an AT-er
02:53 πŸ”— dnova he is emijrp here
03:00 πŸ”— kim__ are they any good with helping with wikis? ;-)
03:01 πŸ”— kim__ http://lists.wikimedia.org/pipermail/wikimedia-l/2012-May/120200.html
03:02 πŸ”— kim__ this helpful?
03:03 πŸ”— kim__ and if it's any use at any time, I've got a small server at hetzner.de and know how to use it
03:03 πŸ”— dnova it is any use
03:03 πŸ”— dnova stick around
03:05 πŸ”— kim__ (ps, the Grr Argh: http://www.youtube.com/watch?v=NCLPHSVtvmU involves warm-optional bodies ;-)
03:08 πŸ”— dnova never saw the show
03:08 πŸ”— kim__ hence the no-laughing ;-P
03:09 πŸ”— dnova sorry :P
03:09 πŸ”— kim__ you're excused :-P
03:09 πŸ”— chronomex STAND BACK
03:09 πŸ”— chronomex KIM__ HAS A SERVER AND HE KNOWS HOW TO USE IT
03:09 πŸ”— kim__ actually, no wait, you haven't seen buffy? That's inexcusable! :-P
03:10 πŸ”— yipdw^ I use all my servers as clients
03:10 πŸ”— kim__ chronomex, I know regex too https://www.xkcd.com/208/
03:10 πŸ”— kim__ yipdw, too much X11? ;-)
03:11 πŸ”— chronomex o_o_o_o
03:11 πŸ”— kim__ chronomex, Of course, now I have 2 problems. http://www.codinghorror.com/blog/2008/06/regular-expressions-now-you-have-two-problems.html
03:13 πŸ”— * kim__ figures AT to be more beautiful soup fans , though http://www.crummy.com/software/BeautifulSoup/
03:14 πŸ”— yipdw^ that or a dozen other things
03:15 πŸ”— kim__ *nod*
03:15 πŸ”— yipdw^ https://github.com/ArchiveTeam has a bunch of code, so you can check that out if you'd like
03:19 πŸ”— underscor http://a4.sphotos.ak.fbcdn.net/hphotos-ak-ash4/318277_10150947081920605_42214640604_12269339_963561368_n.jpg
03:22 πŸ”— dnova heh
03:24 πŸ”— underscor kim__: SketchCow's been doing dumps and transfers, and I've been working with Ariel Glenn at WMF and Kevin Day at your.org to get a more useable archive up and running on IA
03:24 πŸ”— underscor (if you know either of them)
03:25 πŸ”— * kim__ copy-pastes from a mail I'm about to send:
03:25 πŸ”— kim__ ==Fire Drill==
03:25 πŸ”— kim__ Has anyone recently set up a full-external-duplicate of (for instance) en.wp?
03:25 πŸ”— kim__ This includes all images, all discussions, all page history (excepting the user
03:25 πŸ”— kim__ accounts and deleted pages)
03:26 πŸ”— underscor nope
03:26 πŸ”— underscor that would be an interesting project
03:26 πŸ”— underscor lol
03:26 πŸ”— underscor oh, I just saw that it was a mail
03:26 πŸ”— kim__ Ok, then I won't make a fool of myself by sending that
03:26 πŸ”— kim__ Consider yourself (almost) volunteered then ;-)
03:27 πŸ”— kim__ or volunteerized
03:27 πŸ”— underscor haha
03:27 πŸ”— kim__ we need a word
03:28 πŸ”— yipdw^ FUCK YOU, YOU ARE ALL IN ARCHIVE TEAM
03:28 πŸ”— yipdw^ I DEPUTIZE ALL OF YOU
03:29 πŸ”— underscor hahaha
03:29 πŸ”— underscor I loved that
03:29 πŸ”— underscor I don't think he said the second, but the first was at a DEFCON talk, wasn't it?
03:29 πŸ”— kim__ yipdw^, I try that all the time. The en.wp regulars cottoned on long ago
03:30 πŸ”— yipdw^ underscor: yeah
03:30 πŸ”— kim__ http://lists.wikimedia.org/pipermail/wikimedia-l/2012-May/120203.html
03:30 πŸ”— yipdw^ kim__: that's weird, I'd expect Wikipedia people to be interested in duplication
03:30 πŸ”— kim__ yipdw, the "I deputize all of you" trick ;-)
03:30 πŸ”— kim__ And WP should indeed be interested in duplication
03:31 πŸ”— yipdw^ I need to see if CouchDB replicates attachments
03:31 πŸ”— yipdw^ I think it does
03:31 πŸ”— yipdw^ if it does, then maybe I can just dump Wikipedia in a couch
03:31 πŸ”— dnova when I see "WP" I think wordpress :/
03:31 πŸ”— dnova or write protect
03:31 πŸ”— dnova or word perfect
03:31 πŸ”— yipdw^ like
03:31 πŸ”— yipdw^ a CouchDB instance that is, uh
03:32 πŸ”— yipdw^ how big is Wikipedia, text, images, discussions, and all?
03:32 πŸ”— kim__ btw, here's the mailing list page https://lists.wikimedia.org/mailman/listinfo/wikimedia-l , in case anyone wants to sign up, answer my mail, and take ownership from the AT side ;-)
03:32 πŸ”— yipdw^ just say the English one for no
03:32 πŸ”— yipdw^ w
03:32 πŸ”— kim__ oh, I was about to say there's 800 wikimedia wikis
03:32 πŸ”— kim__ I'm not actually entirely sure anymore
03:32 πŸ”— yipdw^ yeah, let's just do English WP
03:32 πŸ”— kim__ the text is a few gigs
03:32 πŸ”— kim__ but english wikipedia links out to a different wiki called commons.wikimedia.org
03:33 πŸ”— yipdw^ right
03:33 πŸ”— kim__ which contains a lot of the images and multimedia
03:33 πŸ”— yipdw^ do you know how big that is?
03:33 πŸ”— kim__ and commons is Very Very Large
03:33 πŸ”— yipdw^ hm
03:33 πŸ”— yipdw^ maybe I should do a Kickstarter
03:33 πŸ”— yipdw^ ask for $10,000 for a hundred TB
03:33 πŸ”— yipdw^ and just try to suck in all WMF sites
03:33 πŸ”— kim__ http://commons.wikimedia.org/wiki/Main_Page 12,819,893
03:34 πŸ”— underscor yipdw^: do it!
03:34 πŸ”— kim__ yipdw^, Orrrr, I convince wmf that this is essential
03:34 πŸ”— underscor money to create an "archival copy of wikipedia"
03:34 πŸ”— underscor that's synced like once a month or something
03:34 πŸ”— kim__ or you convince them
03:34 πŸ”— kim__ like "AT can do this, but we'd need storage"
03:34 πŸ”— yipdw^ or we use fuckloads of bandwidth :D
03:34 πŸ”— underscor :D
03:34 πŸ”— yipdw^ Archive Team is good at that
03:34 πŸ”— yipdw^ if nothing else
03:34 πŸ”— kim__ or All Of The Above
03:34 πŸ”— underscor 1-800-BW-SUCKR
03:35 πŸ”— dnova plz don't post my direct line
03:35 πŸ”— kim__ http://commons.wikimedia.org/wiki/Special:Statistics
03:35 πŸ”— yipdw^ huh
03:35 πŸ”— yipdw^ no size
03:35 πŸ”— yipdw^ that is a lot of pages, though
03:35 πŸ”— underscor It's 18 TB
03:36 πŸ”— yipdw^ that's it?
03:36 πŸ”— underscor for media?
03:36 πŸ”— underscor yes
03:36 πŸ”— yipdw^ huh
03:36 πŸ”— yipdw^ neat
03:36 πŸ”— underscor 204.x.x.x:/z/public/pub/wikimedia/dumps 152T 33T 118T 22% /mnt/dumps
03:36 πŸ”— underscor 204.x.x.x:/z/public/pub/wikimedia/images 136T 17T 118T 13% /mnt/images
03:36 πŸ”— underscor (I have a box with them nfs mounted)
03:36 πŸ”— kim__ underscor, heh, useful
03:37 πŸ”— yipdw^ does that also include audio and video clips?
03:37 πŸ”— kim__ underscor, buut, that's not really 100% public, is it?
03:37 πŸ”— underscor yes
03:37 πŸ”— kim__ yipdw, should do, it's the same dir
03:37 πŸ”— yipdw^ oh ok
03:37 πŸ”— underscor kim__: yeah, should be. I mean, it's what's available on ftpmirror.your.org
03:37 πŸ”— underscor yipdw^: :D
03:37 πŸ”— underscor 800-439-2978 In Disconnect 800-HEY-BWSUCKR
03:37 πŸ”— underscor 800-932-9782 In Disconnect 800-WE-BWSUCKR
03:37 πŸ”— underscor 888-225-5297 In Disconnect 888-CALL-BWSUCKR
03:37 πŸ”— underscor 888-237-8297 In Disconnect 888-BEST-BWSUCKR
03:37 πŸ”— yipdw^ awesome
03:38 πŸ”— yipdw^ Archive Team direct line
03:38 πŸ”— yipdw^ CALL THE A-TEAM
03:38 πŸ”— underscor hahaha
03:38 πŸ”— underscor I think I know what to get SketchCow for his birthday
03:38 πŸ”— kim__ underscor, so we've got all the data now we need to do the firedrill
03:38 πŸ”— yipdw^ 1-800-MAH-DICK
03:38 πŸ”— kim__ yipdw, the AT-TEAM?
03:38 πŸ”— underscor kim__: us
03:38 πŸ”— yipdw^ nah A-Team
03:38 πŸ”— underscor 800-627-2448 In Disconnect 800-6-ARCHIVETEAM
03:38 πŸ”— underscor 866-727-2448 In Disconnect 866-7-ARCHIVETEAM
03:38 πŸ”— underscor 866-825-8327 In Disconnect 866-VALUE-ARCHIVETEAM
03:38 πŸ”— underscor 877-367-2724 In Disconnect 877-FOR-ARCHIVETEAM
03:38 πŸ”— underscor 888-665-9272 In Disconnect 888-ONLY-ARCHIVETEAM
03:38 πŸ”— underscor 800-743-2724 800-743-ARCHIVETEAM
03:38 πŸ”— underscor ha
03:39 πŸ”— * kim__ plays the theme http://www.youtube.com/watch?v=_MVonyVSQoM
03:39 πŸ”— kim__ we need a new intro blurb though
03:42 πŸ”— underscor Lol
03:43 πŸ”— kim__ http://meta.wikimedia.org/wiki/Data_dump_torrents
03:43 πŸ”— kim__ Also useful
03:44 πŸ”— kim__ you probably already knew that one
03:44 πŸ”— kim__ Anyway, can I recruit some of you fine folks for fire-drill kind of things?
03:44 πŸ”— kim__ it would leave you with a fully functional archival copy of $wmf-wiki and all
03:45 πŸ”— yipdw^ sounds like fun, but I only have a couple of terabytes free on my personal machines
03:45 πŸ”— kim__ so I'd need to find some TBs?
03:45 πŸ”— yipdw^ I think underscor has TB out his as
03:45 πŸ”— yipdw^ s
03:46 πŸ”— kim__ does he loan them to you? ;-)
03:46 πŸ”— yipdw^ I don't want tuberculosis
03:46 πŸ”— kim__ :-p
03:49 πŸ”— yipdw^ though, you did give me an idea
03:49 πŸ”— yipdw^ I have been on a CouchDB kick for a while, mostly because the replication system is so damn smooth nowadays
03:50 πŸ”— yipdw^ so I think I'll just load the text of en.wp and see how that works out
03:50 πŸ”— mistym Hey, what's the status on that radio site SketchCow was posting earlier today? Is someone on that?
03:50 πŸ”— yipdw^ see if I can reconstruct a hyperlinked, textual WP from that
03:50 πŸ”— kim__ right. That *is* interesting. But I was wondering if it was possible to recreate a Fully Operational Battlest^W I mean copy of wikipedia
03:51 πŸ”— yipdw^ yeah, that was my second stage plan
03:51 πŸ”— underscor yipdw^: :D
03:51 πŸ”— yipdw^ incorporate all multimedia as CouchDB document attachments
03:51 πŸ”— yipdw^ at that point, if you want a copy of Wikipedia, you replicate the DB
03:51 πŸ”— kim__ and then write back out?
03:51 πŸ”— yipdw^ and the logic needed to render that data out goes with it
03:51 πŸ”— yipdw^ if you store it in e.g. CouchDB design documents
03:52 πŸ”— yipdw^ I mean, yes, it is a shitload of data, and you will need a corresponding shitload of throughput
03:52 πŸ”— kim__ The mediawiki engine is fully open source, right? :-)
03:52 πŸ”— yipdw^ there is no US consumer-grade ISP package that will let you do this
03:52 πŸ”— kim__ yipdw^, how is this a problem?
03:52 πŸ”— yipdw^ it isn'
03:52 πŸ”— yipdw^ t
03:52 πŸ”— yipdw^ I'm just saying
03:53 πŸ”— kim__ *nod*
03:53 πŸ”— yipdw^ and yes, mediawiki is open source
03:54 πŸ”— yipdw^ ha
03:54 πŸ”— yipdw^ http://en.wikipedia.org/wiki/Wikipedia:Database_download#English-language_Wikipedia
03:55 πŸ”— yipdw^ I like how "multiple terabytes" is a link to en.wp/Terabyte
03:55 πŸ”— kim__ Currently Wikipedia does not allow or provide facilities to download all images. As of 17 May 2007, Wikipedia disabled or neglected all viable bulk downloads of images including torrent trackers. Therefore, there is no way to download image dumps other than scraping Wikipedia pages up or using Wikix, which converts a database dump into a series of scripts to fetch the images.
03:55 πŸ”— yipdw^ LIES
03:56 πŸ”— kim__ well, ICK
03:56 πŸ”— kim__ yipdw, this is fixed?
03:56 πŸ”— yipdw^ no
03:56 πŸ”— underscor it is fixed now
03:56 πŸ”— yipdw^ I just stopped reading at 'viable'
03:56 πŸ”— underscor ftpmirror.your.org has a copy
03:56 πŸ”— yipdw^ and was like 'well, we could just scrape everything'
03:56 πŸ”— kim__ underscor, that's something
03:56 πŸ”— yipdw^ then I realized that they permitted that option
03:59 πŸ”— yipdw^ http://dumps.wikimedia.org/enwiki/20120502/
03:59 πŸ”— yipdw^ huh
03:59 πŸ”— yipdw^ why are the 7z metahistory dumps so much smaller than the bz2s?
03:59 πŸ”— yipdw^ is 7z actually just that good?
03:59 πŸ”— yipdw^ or are the 7z dumps broken
03:59 πŸ”— underscor it's that good
03:59 πŸ”— underscor well, at least for the set before that
04:00 πŸ”— underscor I haven't checked 5/2's
04:00 πŸ”— underscor but it's the same ratio-ish with the previous set
04:00 πŸ”— yipdw^ damn
04:00 πŸ”— kim__ ok, I just found me a new compressor O:-)
04:00 πŸ”— yipdw^ I've known about 7zip for a while, but never really used it regularly
04:01 πŸ”— yipdw^ I am wondering how it manages to kick the shit out of bzip2 like this
04:01 πŸ”— underscor ugh
04:01 πŸ”— underscor I have a networking proposal to finish tonight
04:02 πŸ”— underscor and I don't want to work on it >:I
04:02 πŸ”— underscor I need motivation :'(
04:02 πŸ”— yipdw^ what is the proposal?
04:02 πŸ”— underscor it's the final project of our CCNA class
04:02 πŸ”— underscor we have to propose the network installation for a company that just bought a new office space
04:03 πŸ”— yipdw^ lots of WRT54Gs
04:03 πŸ”— underscor everything, from PCs and printer models, to what ISPs we'll peer with, where in the building WAPs will go
04:03 πŸ”— underscor everything
04:03 πŸ”— underscor hahahahahahaha
04:03 πŸ”— yipdw^ you know that that is the way most companies start
04:03 πŸ”— underscor that's what they have
04:03 πŸ”— yipdw^ or at least most companies that don't have an overbearing IT staff
04:03 πŸ”— underscor two 2960 switches
04:03 πŸ”— underscor and 8 wrt54gs
04:04 πŸ”— underscor we have to fix it according to Cisco Best Practices (tm)
04:04 πŸ”— yipdw^ propose a layout that uses equipment that is not produced by Cisco
04:04 πŸ”— underscor I did, originally
04:04 πŸ”— yipdw^ heh
04:04 πŸ”— underscor I was told that was not allowed.
04:04 πŸ”— yipdw^ what
04:04 πŸ”— underscor It's a cisco networking academy course
04:04 πŸ”— underscor the judges are all from cisco
04:04 πŸ”— underscor so
04:05 πŸ”— underscor They like to pretend Juniper, force10, et al. don't exist
04:05 πŸ”— yipdw^ that seems to be doing them favors in the market
04:05 πŸ”— kim__ alright, I'm off to bed
04:05 πŸ”— kim__ the day star is rising already
04:05 πŸ”— yipdw^ 'night
04:05 πŸ”— kim__ 'day! ;-)
04:05 πŸ”— underscor ha
04:10 πŸ”— underscor Unable to Add an Account: You are only allowed to be signed into 8 accounts simultaneously
04:10 πŸ”— underscor I guess that's google telling me I have too many google profiles
04:11 πŸ”— dnova you're really wearing their resources thin
04:11 πŸ”— underscor lol
04:12 πŸ”— underscor I have like 17, between my personal, work, school, other school, third school, other work, other other work
04:12 πŸ”— underscor lol
04:13 πŸ”— dnova and they all have the same 4 digit numeric password
04:13 πŸ”— yipdw^ 5 digi
04:13 πŸ”— yipdw^ t
04:13 πŸ”— underscor actually, no, they're all random alphanumeric strings
04:13 πŸ”— yipdw^ underscor is security-conscious
04:13 πŸ”— underscor 12 characters
04:13 πŸ”— yipdw^ why do you have 17 google profiles?
04:14 πŸ”— yipdw^ that seems excesssive, even in that acse
04:14 πŸ”— underscor it's actually 14
04:14 πŸ”— underscor I hyperbolized
04:14 πŸ”— mistym Oh, *only* 14.
04:14 πŸ”— yipdw^ why do you have 14 google profiles?
04:14 πŸ”— yipdw^ that seems excesssive, even in that case
04:14 πŸ”— underscor uh
04:14 πŸ”— underscor I don't know
04:14 πŸ”— dnova it does seem excessive but it's not improbable these days
04:14 πŸ”— dnova everyone using google apps
04:15 πŸ”— underscor yeah
04:15 πŸ”— underscor that
04:15 πŸ”— underscor I don't regularly USE all of them
04:15 πŸ”— underscor these are just sessions that slowly accumulate
04:15 πŸ”— underscor since I haven't logged off on here for like 3 or 4 weeks
04:16 πŸ”— dnova if only you could combine them and gather up all the storage
04:16 πŸ”— yipdw^ I guess that makes sense
04:16 πŸ”— underscor It's not bad, I have 4x10GB, and the remaining 10 are 25GB
04:16 πŸ”— yipdw^ You are using 1% of your 290GB
04:16 πŸ”— dnova was there ever any official statement about their jump from 7.xgb to 10.xgb?
04:16 πŸ”— dnova I noticed it one day but never saw anything about it
04:16 πŸ”— underscor that's a lot of email :o
04:17 πŸ”— yipdw^ I just use my gmail account as a spam trap
04:17 πŸ”— underscor I don't think so
04:17 πŸ”— dnova they went shopping, got some hard drives
04:17 πŸ”— dnova shared the space with everyone
04:17 πŸ”— underscor my friend was laughing because he had 40 mb left, and was going to have to purchase space
04:17 πŸ”— underscor then they changed it
04:17 πŸ”— dnova my friend is PISSED because he JUST bought a year of storage upgrade
04:18 πŸ”— dnova but it's like $50 so I laugh at him
04:18 πŸ”— underscor haha
04:18 πŸ”— dnova google recently got me, too
04:18 πŸ”— underscor teacher sends me a screenshot
04:19 πŸ”— underscor 11MB bmp
04:19 πŸ”— dnova I had to make a call to austria and had the free 10 cents gvoice credit, but figured 5 minutes wasn't going to be enough, so I paid $10 to add more
04:19 πŸ”— underscor ugh
04:19 πŸ”— dnova then I only used 4 minutes
04:19 πŸ”— dnova so now I have $10.02
04:19 πŸ”— underscor aww
04:19 πŸ”— underscor lol
04:19 πŸ”— yipdw^ call someone random in Austria
04:19 πŸ”— dnova haha
04:19 πŸ”— dnova the credit will get used eventually I'm sure
04:19 πŸ”— underscor do they still have free us calls?
04:19 πŸ”— dnova yep
04:20 πŸ”— underscor I should get a headset for this machine, so I can call from it
04:20 πŸ”— dnova I use it a lot
04:20 πŸ”— underscor when I need to order chinese food or something
04:20 πŸ”— yipdw^ I use a phone, you hipster bastards
04:20 πŸ”— underscor :D
04:20 πŸ”— dnova I doo too, but it's tied to gvoice also
04:20 πŸ”— dnova s/doo/do/
04:29 πŸ”— yipdw^ oh, damn
04:29 πŸ”— yipdw^ 40091648 enwiki-20120502-pages-meta-history1.xml-p000000010p000002979 268240 enwiki-20120502-pages-meta-history1.xml-p000000010p000002979.7z
04:29 πŸ”— yipdw^ ls -1s
04:29 πŸ”— yipdw^ total 46308376
04:29 πŸ”— yipdw^ er
04:29 πŸ”— yipdw^ oops
04:29 πŸ”— yipdw^ well, anyway, yeah, that's like two orders of magnitude compression
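A quick check of the "two orders of magnitude" figure, using only the two block counts pasted from `ls -1s` above. The variable names are illustrative and not from the log; the ratio is the same whatever unit the blocks are in.

```python
import math

# Block counts as pasted in the log: the uncompressed XML and its .7z file.
uncompressed_blocks = 40091648
compressed_blocks = 268240

ratio = uncompressed_blocks / compressed_blocks
orders_of_magnitude = math.log10(ratio)

print(f"ratio: {ratio:.1f}x, log10: {orders_of_magnitude:.2f}")
# ratio: 149.5x, log10: 2.17
```

So the 7z file is roughly 1/150th the size of the XML it holds, a little over two orders of magnitude.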
05:01 πŸ”— Zebranky SketchCow: Project funded. Only three weeks left!
05:52 πŸ”— ariana hi all
05:52 πŸ”— mdupont there is a problem with the mail server
05:53 πŸ”— mdupont archiveteam@archiveteam.org is not working
05:59 πŸ”— dnova good. fucking. lord.
06:00 πŸ”— dnova also, hi
06:15 πŸ”— mdupont hi dnova
06:16 πŸ”— mdupont also the register new user page does not work
06:16 πŸ”— dnova on the wiki?
06:17 πŸ”— dnova that was a thing, I thought it got fixed
06:19 πŸ”— mdupont dnova, i ran into it today
06:19 πŸ”— mdupont and then i could not even report it
06:19 πŸ”— dnova well you've done so in here
06:29 πŸ”— mdupont :D
06:49 πŸ”— SketchCow Excellent, Zebranky
06:49 πŸ”— SketchCow And I guess I have things to fix!!
06:51 πŸ”— Coderjoe yipdw: 7zip's secret is essentially LZMA: it uses an LZ77, but has a larger window, uses markov chains, and an arithmetic encoder. and several different chains. it is rather crazy.
06:52 πŸ”— Coderjoe yipdw: whereas bzip2 operates on however much data fits in a 900000 byte RLE block per output block.
06:55 πŸ”— Coderjoe s/arithmetic/range/
06:55 πŸ”— Coderjoe (which are close to identical concepts)
06:58 πŸ”— Coderjoe plus, it adapts to the data continuously, while bzip2 is per block and deflate (gzip) is also per block (but the block length is arbitrary)
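The window-size point can be tried out with Python's standard-library compressors. This is a rough sketch on synthetic data, not the actual enwiki dump: the "page", its 20 revisions and the vocabulary are all made up for illustration. Each revision repeats the previous one almost verbatim, about 200 KB later in the stream, which is farther back than deflate's 32 KiB window can reach, while bzip2 only sees repeats inside one 900 kB block. The expectation is that LZMA, with its much larger dictionary, comes out smallest.

```python
import bz2
import lzma
import random
import zlib

rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
vocab = ["w%04d" % i for i in range(2000)]  # made-up vocabulary

# One synthetic "page" of roughly 200 KB of pseudo-random words.
text = " ".join(rng.choice(vocab) for _ in range(35000))

# A full-history dump stores every revision whole, so each small edit
# repeats the entire page text again.
revisions = []
for n in range(20):
    text += "\nedit %d: %s" % (n, rng.choice(vocab))
    revisions.append("<revision><text>%s</text></revision>" % text)
data = "\n".join(revisions).encode("ascii")

sizes = {
    "zlib": len(zlib.compress(data, 9)),  # deflate, 32 KiB window
    "bz2": len(bz2.compress(data, 9)),    # independent 900 kB blocks
    "lzma": len(lzma.compress(data)),     # default preset, multi-MiB dictionary
}
for name, size in sizes.items():
    print("%-5s %9d bytes  (%.1fx)" % (name, size, len(data) / size))
```

The exact numbers depend on the synthetic data, so none are quoted here; the ordering is the part worth looking at.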
06:58 πŸ”— yipdw oh
06:58 πŸ”— yipdw that'd explain it
06:58 πŸ”— yipdw I noticed a shitload of redundancy in the uncompressed XML
06:59 πŸ”— yipdw that and the decompression took forever
07:11 πŸ”— Coderjoe yeah, LZ77-based algorithms love redundancy
07:12 πŸ”— mdupont so i have been just reading about the firedrill and the wikipedia
07:12 πŸ”— mdupont anyone have access to the binlogs of the wp mysql?
07:13 πŸ”— yipdw probably not here, but I'm not sure why they're necessary
07:31 πŸ”— * SmileyG ponders if you want the history for the changes and if they are stored there
07:32 πŸ”— mdupont SmileyG, yipdw i would like to get the "not notable" articles
07:32 πŸ”— mdupont for some place and topics, almost everything is not notable
07:32 πŸ”— mdupont and a lot of good stuff is deleted
07:34 πŸ”— SmileyG yah sux
07:34 πŸ”— SmileyG that's why I don't contribute to wikipedia :/
07:34 πŸ”— SmileyG Something which no one knows about, never gets to be known about.
07:35 πŸ”— mdupont yes
07:36 πŸ”— mdupont so i think the binlogs would be good for catching deletions
07:36 πŸ”— mdupont and supposedly they are on the toolserver, i will have to go searching
07:46 πŸ”— SmileyG ah
08:05 πŸ”— mdupont https://wiki.toolserver.org/view/Database_access
08:05 πŸ”— mdupont it looks like the toolserver is a replica already
08:12 πŸ”— SmileyG lolk
08:29 πŸ”— alard I think the end of MobileMe is in sight: I've now searched with English, Spanish, French, German, Dutch and finally Italian dictionaries. (The 350,000 Italian words produced only 400 more items.)
08:29 πŸ”— alard Unless I've missed a really important language I'll stop searching for more. There are 21,000 items left on the to do list.
08:32 πŸ”— ersi alard: Good job :]
08:47 πŸ”— mdupont also people, i am working on putting the osm/fosm.org cc data on archive.org http://osmopenlayers.blogspot.de/2012/05/s3-buckets-for-fosm-in-progress.html
08:58 πŸ”— SmileyG hmmm how many more items to grab alard ? (I don't have the url handy to check :P )
08:59 πŸ”— ersi SmileyG: 10:30 <@alard> Unless I've missed a really important language I'll stop searching for more. There are 21,000 items left on the to do list.
08:59 πŸ”— ersi Try reading the last part
09:02 πŸ”— alard SmileyG: http://memac.heroku.com/
09:08 πŸ”— SmileyG :D
09:08 πŸ”— SmileyG I thought he grabbed 21000 more items ¬_¬
09:09 πŸ”— ersi Oh, heh
09:10 πŸ”— * SmileyG fails reading when tired :(
09:11 πŸ”— ersi ^_^
09:14 πŸ”— SmileyG and its 10:15am :(
09:16 πŸ”— schbiridi SmileyG: did you finish your fileplanet chunk? :>
09:16 πŸ”— schbiridi i would just re-grab it otherwise, no biggie
09:17 πŸ”— SmileyG i never got one....
09:17 πŸ”— SmileyG time got the better of me :(
09:17 πŸ”— schbiridi ah, ok
09:17 πŸ”— schbiridi np
09:51 πŸ”— chronomex Coderjoe, yipdw: mediawiki xml -> 7zip gets about the same size reduction as mediawiki -> RCS file, no compression
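One way to read this remark: an RCS file keeps the newest revision in full plus a delta for each older one, so the redundancy between revisions is never stored in the first place. The sketch below is a hypothetical illustration of that idea using `difflib`, not RCS's actual file format; the page history is invented.

```python
import difflib

# Hypothetical page history: every revision is the full text, as in a
# full-history dump, and each edit appends one line.
lines = ["line %d of the article" % i for i in range(500)]
revisions = []
for n in range(30):
    lines = lines + ["sentence added in edit %d" % n]
    revisions.append(lines)

# Size of storing every revision whole.
full_size = sum(len("\n".join(rev)) for rev in revisions)

# Delta storage: newest revision whole, plus one diff per step back in time.
delta_size = len("\n".join(revisions[-1]))
for i in range(len(revisions) - 1, 0, -1):
    diff = difflib.unified_diff(revisions[i], revisions[i - 1], n=0, lineterm="")
    delta_size += len("\n".join(diff))

print("full copies: %d bytes, deltas: %d bytes" % (full_size, delta_size))
```

With small edits to a large page, the delta form is a small fraction of the full copies before any compressor runs.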
10:19 πŸ”— Schbirid http://archive.org/post/419916/old-friendster-blog
11:34 πŸ”— godane getting some techtv stuff from news groups
11:35 πŸ”— godane i hope to find music wars special on newsgroups
11:56 πŸ”— visitro hi there ! I would like to contribute to project Gutenberg. I want to digitalize a book containing drawings, figures, ... what kind of file format should I use for this book to be compliant with your guidelines ? Where should I submit it for approval ?
11:59 πŸ”— ersi There's nothing like approval, or guidelines
12:00 πŸ”— ersi and this isn't #gutenberg :P But feel free to digitalize the book anyhow, as large scanning resolution as possible with open formats.. should do fine
12:01 πŸ”— visitro I got the link here http://archiveteam.org/index.php?title=Project_Gutenberg :)
12:03 πŸ”— ersi hm~
12:04 πŸ”— ersi well, we're not gutenberg anyhow :)
12:04 πŸ”— ersi we want to download and archive everything from gutenberg
12:04 πŸ”— visitro and for the kind of book I'm speaking about, using LaTeX should be the best way to do, but that will break compatibility with ePub and other kindle file formats, no ?
12:04 πŸ”— visitro humm ok
12:06 πŸ”— visitro and what's your goal ? I mean, what's the point of copying gutenberg.org ?
12:06 πŸ”— ersi we like to store things, we're digital pack rats
12:06 πŸ”— ersi Copying is good. Copying makes things possibly last longer and be around longer
12:07 πŸ”— Schbirid visitro: this is http://archiveteam.org/ .
12:08 πŸ”— ersi Trying to make sure things don't disappear into the vast nothingness
12:08 πŸ”— visitro haha ok nice :)
12:08 πŸ”— visitro then, sorry for the noise :) I fear there is nothing like an official gutenberg irc channel
12:08 πŸ”— visitro have a nice day ;)
12:09 πŸ”— alard visitro: Maybe you should check with the project gutenberg digital proofreaders project.
12:09 πŸ”— alard I think they have a wiki full of tips about scanning etc.
12:10 πŸ”— alard http://www.pgdp.net/ (digital proofreaders should be distributed proofreaders, obviously)
13:46 πŸ”— godane i'm trying to find this file:  TECHTV The Screen Savers Cable in The Classroom - December 2001.avi
13:46 πŸ”— godane there are torrents but no one is seeding
13:46 πŸ”— godane i was hoping to find it on newsbin
15:43 πŸ”— SketchCow Morning.
15:44 πŸ”— SketchCow I've got the mobileme transfers pretty sewn up right now, so that's going well, again.
15:48 πŸ”— balrog_ morning, SketchCow
16:36 πŸ”— SketchCow Trying to fix the mail issues.
16:36 πŸ”— SketchCow Updating from Mon Feb 14 18:03:51 EST 2011 to Thu May 17 12:20:32 EDT 2012.
16:36 πŸ”— SketchCow WELL GUESS IT'S BEEN A WHILE FOR THAT SERVER HUH
16:51 πŸ”— mistym Hey SketchCow, what's the haps with that radio site you posted yesterday? Is someone on that?
16:53 πŸ”— godane SketchCow: uploading a lost episode of the screen savers
16:54 πŸ”— SketchCow I don't know
16:54 πŸ”— SketchCow Archive.org is grabbing a copy as we speak, I didn't see anyone else here indicate they'd be don.
16:57 πŸ”— balrog_ one vt100 manual == 10gb of scans (before processing)
16:59 πŸ”— SketchCow Broome County Sheriff's Deputies say a tractor trailer hauling Chobani Yogurt got on the ramp to Interstate 81 too fast. When it rounded a curve, the trailer slid over the embankment and spilled 36,000 pounds of yogurt on the shoulder and down the hillside.
17:02 πŸ”— chronomex *slime*
17:04 πŸ”— SketchCow Applying patches... done.
17:04 πŸ”— SketchCow Fetching 22713 new ports or files...
17:04 πŸ”— SketchCow Oh yeah, this is going to be quite the fixup.
17:12 πŸ”— godane whats a good newsgroup search index?
17:13 πŸ”— godane having a hard time finding techtv in classroom
17:20 πŸ”— yipdw http://www.businessweek.com/articles/2012-05-16/is-google-plus-a-ghost-town-and-does-it-matter
17:20 πŸ”— yipdw google+ archive time
17:28 πŸ”— godane best to start the archive now
17:51 πŸ”— SmileyG Hmmm
17:52 πŸ”— SmileyG G+ is a CDN
17:52 πŸ”— SmileyG I read _lots_ of posts, all day long infact
17:52 πŸ”— SmileyG yet I post... ~1once a week
17:52 πŸ”— SmileyG the problem is the metric is measured by how much people post....
17:52 πŸ”— SmileyG because you know, twitter isn't anything unless every single of the X members if posting constantly like zomg?
17:53 πŸ”— aggro I know I'm on the dark side with G+ here... but I love its interface and the ability to keep up with different groups and the like.
17:53 πŸ”— SmileyG agreed
17:53 πŸ”— SmileyG To put things in perspective, he points to a recent Lady Gaga post that received 570 “+1s” on Google+. The exact same post on Facebook got 133,539 “Likes.”
17:53 πŸ”— SmileyG *I* Have got +15 on a post bashing anon, on their own thread.
17:53 πŸ”— SmileyG :D
17:53 πŸ”— * SmileyG is 1/10th as popular as lady gaga now?
17:54 πŸ”— Schbirid well, maybe the g+ audience is not interested in lady gaga
17:54 πŸ”— Schbirid SmileyGaga
17:54 πŸ”— SmileyG oh wait, that said 570, not 150, but you know what I mean ;)
17:54 πŸ”— aggro P P P Popular P P Popular
17:54 πŸ”— SmileyG Schbirid: damnit they figured me out ¬_¬
17:54 πŸ”— Schbirid ha
17:54 πŸ”— Schbirid you look like a horse!
17:54 πŸ”— SmileyG .o_O I do?
17:56 πŸ”— Schbirid :)
18:34 πŸ”— SketchCow I'm heading down to NYC to see a movie and do things later today. Anybody need anything?
18:34 πŸ”— SketchCow IUMA's going well, we have disk space for the remainder of mobileme, etc.
18:34 πŸ”— Schbirid and fileplanet breached it's first terabyte today
18:38 πŸ”— Schbirid http://i.imgur.com/Dp1AM.png
18:38 πŸ”— Schbirid ignore the title
18:38 πŸ”— Schbirid green is done
18:40 πŸ”— DFJustin damn that's more data than cdbbsarchive
18:40 πŸ”— SketchCow For now.
18:41 πŸ”— Schbirid including a fantastically useless encrypted 8GB unreal tournament 3 installer
18:41 πŸ”— SketchCow Like, I have a 303gb pack of CD-ROM images waiting to go
18:41 πŸ”— Schbirid cute!
18:45 πŸ”— Nemo_ter :D
18:46 πŸ”— Schbirid http://www.tested.com/news/44376-16_bit-time-capsule-how-emulator-bsnes-makes-a-case-for-software-preservation/
19:21 πŸ”— nitro2k01 Ah, didn't expect a link to bsnes here
19:21 πŸ”— nitro2k01 That guy has the attitude that all optimizations are evil
19:22 πŸ”— nitro2k01 Or rather, he wants the source code to be readable. He's aiming for BSNES to be a reference implementation and a documentation of the hardware
19:23 πŸ”— Schbirid and that is fantastic!
19:23 πŸ”— mistym nitro2k01: pretty significant difference between "speedhacks" and "optimizations" ;)
19:24 πŸ”— SketchCow e's cute.
19:24 πŸ”— SketchCow I like the little shoutout
19:24 πŸ”— mistym I figured you were well known enough by now they'd have used your name.
19:53 πŸ”— shaqfu Pity some systems are nigh-impossible to accurately emulate, barring throwing some serious brainpower at it
19:53 πŸ”— shaqfu Saturn and its eight processors...
20:33 πŸ”— mistym shaqfu: Yeah, the Saturn is eccentric for sure.
20:37 πŸ”— SmileyG 8 o_O
20:39 πŸ”— SmileyG schbiridi: your inspiring me, maybe one day....
21:01 πŸ”— nitro2k01 Hahaha! http://pouet.net/topic.php?which=4792#c174835
21:01 πŸ”— nitro2k01 "Jason Scott once replaced some picture people ripped from his site to use on MySpace with Goatse.
21:01 πŸ”— nitro2k01 So he is ok in my book."
21:13 πŸ”— SmileyG LOL
21:35 πŸ”— godane uploaded it: http://archive.org/details/TechTVSept2002
21:35 πŸ”— godane :-D
