[04:34] woah
[04:34] http://btjunkie.org/goodbye.html
[04:34] 2005 - 2012
[04:34] This is the end of the line my friends. The decision does not come easy, but we've decided to voluntarily shut down. We've been fighting for years for your right to communicate, but it's time to move on. It's been an experience of a lifetime, we wish you all the best!
[04:58] http://tech.pnosker.com/2012/02/05/btjunkie-shuts-down-voluntarily/
[04:58] Wow.
[05:00] this megaupload thing is having a crazy ripple effect
[05:00] As it would be expected.
[05:07] i was likening this megaupload stuff to the piratebay trial, but i guess it's quite different since the piratebay trial was at least all in sweden, but this is basically extraordinary rendition to the US. some world copyright police level stuff
[05:15] Right.
[05:16] US is world police
[05:16] ;)
[05:29] America, fuck yeah
[05:53] Taxes are gonna hurrrrrrrrrrrrrrrt this year.
[05:53] Everyone find me couhes
[05:53] couches
[05:53] oh I bet they will :|
[05:53] Also, the advantages of this netbook are somewhat mitigated by how unpleasant it is to deal with.
[05:53] Tiny keyboard, screen, and gmail makes it go HURRRRRRRRRRRRRRRRRR for seconds at a time.
[05:54] wrrrrr
[06:17] http://t.co/hRTpgpwx - 270,000 images from an imageboard that went down in 2009.
[06:17] Saved! Thanks, DFJustin
[06:20] http://tracker.archive.org/
[06:20] Wheee
[06:21] there's actually a fair amount more where that came from, I now have like 70gb worth of konachan.com, 60gb of minitokyo.net, etc
[06:23] but that's the only one that's actually straight up gone
[06:23] DFJustin spends all his time on chans
[06:23] 8D
[06:24] (゚∀゚)
[06:24] hahaha
[06:24] but it's okay because he's `achiving` it all
[06:25] u18chan should be archived
[06:25] lolololol
[06:25] I'm up for all sorts of chan archiving.
[06:25] someone not in canada can do that shit :P
[06:26] u18chan is nsfw, btw, as a forewarning
[06:26] SketchCow: What is the archive's position on archiving porn chans?
[06:27] ha ha position
[06:27] reverse cowgirl
[06:27] best position
[06:27] lol
[06:28] the konachan rips were all safely uploaded on...megaupload, luckily the torrent is still active
[06:34] Poor Megaupload.
[06:34] I'd like to know how the megaupload recovery site is working out.
[06:36] DFJustin: CPM CD-ROMs going in
[06:37] yay
[06:37] the cd that doom guy uploaded at http://www.archive.org/details/D1000 seems to still be non-public
[06:38] http://www.archive.org/details/cdrom-1994-11-walnutcreek-cpm&reCache=1
[06:39] http://www.archive.org/details/D1000
[06:41] i got some linux format dvds
[06:41] i found some of the linux-format pdfs
[06:43] wow that walnutcreek disk is wild. always read about those guys.
[06:45] underscor: how long should those tests take? It's failing to change to `works!` text on chrome 18 dev
[06:46] Aranje: Like 2 seconds max
[06:46] hmm
[06:46] def givin me shit
[06:46] Which is, coincidentally, underscor's nickname in bed
[06:47] I'd call that convenient more than anything else
[06:47] SketchCow: Fuck you :D
[06:47] * SketchCow drives around town with the girl you love
[06:48] http://www.archive.org/details/cdrom-oakcpm-1999-cdrom by the way, DFJustin
[06:48] So both those are in
[06:48] Just working back the backlog
[06:51] Also, this week we're going to begin moving off batcave to the new machine.
[06:52] the fortress of solitude?
[06:53] I was thinking jokerslair
[06:56] nick
[06:56] nice*
[07:00] 2 seconds max
[07:12] whats the best set of commands to issue to Wget to crawl something like fortunecity on a windows box?
[07:12] crazy idea i know.
[07:13] Stick with doing it on Linux. or a unix variant.
[07:14] kin37ik: so you just want a page list, not to do anything?
[07:14] or rather, keep dled stuff
[07:16] i want it to keep DL'ed stuff
[07:17] @sketch: i would but my linux box is fried, and im waiting on a new board
[07:20] kin37ik: that would be more than crawling then
[07:20] arrith: okay
[07:20] kin37ik: could setup a linux dualboot and/or linux vm
[07:20] you can boot a linux partition that you also dualboot to
[07:21] something like wget-warc -r something something
[07:22] arrith: i have thought about dual booting a couple of times, but i thought if i have a dedicated linux box, then there wasnt much of a point, the board should be here this week
[07:23] kin37ik: ah yeah, just depends on how soon you want to get started
[07:23] Going "I'm all out of water bottles, so I'd like to drink donkey urine" is just not a question worth answering.
[07:23] Just wait until the board is back, we'll wait.
[07:24] also, uncontrolled wget -r on something like fortunecity is a bad idea
[07:24] it is likely that you will end up with (1) a ton of stuff and (2) nothing that you want
[07:24] SketchCow: fair call
[07:24] controlling wget by pointing it at specific URLs and using more controlled forms of recursive retrieval, like --page-requisites, is much better
[07:27] yipdw: which is what i intend to be doing first off, as ive found out a lot of old pages from back when fortunecity first started are still on the servers untouched for a long time
[07:28] I'd just work with other team members to find these pages.
[07:28] Remember also we want WARC formats, too.
[07:28] SketchCow: WARC?
[07:29] kin37ik: google wget-warc
[07:30] kin37ik: Web ARChive; it's a way to record not only response bodies, but also the headers associated with that body, as well as request bodies
[07:30] and headers
[07:31] kin37ik: http://bibnum.bnf.fr/WARC/warc_ISO_DIS_28500.pdf
[07:31] WARC can also store information about the retrieving tool, retriever, etc -- all information that you want when building an archive
[07:31] yipdw: aaaahhh i see, nifty
[07:32] more immediately useful, though, is that WARC is standard and there exist tools to read and present it
[07:32] e.g. Internet Archive's Wayback Machine, the stuff tef builds for Hanzo Archives
[07:33] hmmm
[07:33] interesting
[08:04] woah, just been a 4 car pile up just around the corner
[09:07] eek
[11:21] i think i have a local mirror of defcon.org
[11:21] its only 1.8gb
[11:21] i think
[11:46] sigh, just deleted 30 GiB of incomplete mobileme profiles
[12:13] got to love defcon-6 website
[12:13] all pictures are gone and was not hosted on defcon.org
[12:53] HEY SO NERD RESEARCH AND QUESTION
[12:53] I was told about this: http://git-annex.branchable.com/
[12:53] Anyone want to look at it? It's making waves.
[13:07] In theory, reading up on it, we could create an archiveteam GIT hub that spans ALL of archive.org's holding of archiveteam stuff, our other collections, you name it.
[13:28] imagine a torrent with all the btjunkie torrents
[13:32] time to archive isohunt, the pirate bay and friends ?
[13:47] Yes
[13:47] Yes, it was a while ago.
[13:47] I assumed someone was on that already
[13:52] Was http://www.publicbt.com/ archived, to start with?
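
As a reference point for the wget discussion above (07:24-07:32): a minimal sketch of a controlled, WARC-producing retrieval. This assumes a wget build with WARC support (the wget-warc patch, or wget 1.14 and later, where it is built in); the URL and output name are made-up examples, not commands anyone in the channel actually ran:

  wget --recursive --level=1 --page-requisites \
       --wait=1 --random-wait \
       --warc-file=fortunecity-sample \
       "http://www.fortunecity.com/some-neighborhood/some-page/1/"

--page-requisites pulls in the images and stylesheets each page needs, --level=1 keeps the recursion from wandering across the whole site, and --warc-file records the requests and responses, headers included, alongside the normally downloaded files.
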
[14:03] Nemo_bis: there's a link right there to download their database
[14:03] Ymgve, that's what i'm saying :)
[14:04] but infohash != magnet
[14:04] I don't think that's the whole DB though
[14:05] emijrp, do you want magnet links?
[14:05] did btjunkie have them? I don't remember
[14:06] i mean, what is the point of that publicbt database? it doesnt contain magnets nor torrents
[14:06] umm
[14:07] and what's the value of a torrents database? it doesn't contain magnets nor seeders
[14:07] the hash is basically all you need for magnet
[14:08] Ymgve: ok
[14:08] starting from that list of hashes you should be able to produce/download everything
[14:08] that's what btjunkie itself did
[14:08] just prepend magnet:?xt=urn:btih: to your infohash and you got a magnet URL
[14:15] $ wc -l all.txt
[14:15] 2907061 all.txt
[14:15] isohunt claims 8,427,266 torrents
[14:17] and TPB only 4,297,583
[14:33] ONLY.
[14:34] crowdsource the download of every torrent ever
[14:50] Think about the day seeding all those torrents is like sharing .txt in TEXTFILES.
[15:00] Yay, thousands of dead torrents
[15:00] that'll be awesome
[15:16] SketchCow: I know something about git-annex :)
[15:16] (since I wrote it)
[15:17] SketchCow: [08:07:26] In theory, reading up on it, we could create an archiveteam GIT hub that spans ALL of archive.org's holding of archiveteam stuff, our other collections, you name it.
[15:17] yep, it's doable
[16:05] You know nothing
[16:05] Get out of the way while the experts work on it
[16:05] Actually, it's kind of a strange idea.
[16:06] What's it use to verify? Not MD5 hashes, right?
[16:06] Also, word's come down. Torrents. Let's get all of them. ALL.
[16:19] I came.
[16:20] take down *all* the torrents
[16:20] (insert meme image)
[16:21] http://knowyourmeme.com/memes/x-all-the-y
[16:57] bittorrent uses SHA1, I think
[17:00] I made a script to download TPB. DO WANT?
[17:04] nah, got my own
[18:06] SketchCow: git-annex uses sha512 hashes by default, but can use any of the decent hashes
[18:06] er, sha256 actually
[18:18] I keep all my data in a git annex repo that spans many drives etc. I can run stats like this on my netbook:
[18:18] local annex keys: 9
[18:18] known annex keys: 41578
[18:18] known annex size: 7 terabytes
[18:18] local annex size: 952 megabytes
[21:31] i have a full backup of defcon website
[21:31] just the defcon.org part
[21:32] but thats about 3gb and most images from 1-18 are there
[21:32] also i found out that the audio for defcon 12 doesn't exist anymore
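
A small illustration of the "just prepend magnet:?xt=urn:btih:" point from 14:08 above, assuming all.txt holds one hex infohash per line (as the wc -l output suggests); the file names are only examples:

  sed 's/^/magnet:?xt=urn:btih:/' all.txt > magnets.txt

Each resulting line is a usable magnet URI; a client can then fetch the actual torrent metadata from the DHT or from peers on the public trackers.
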
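And a rough sketch of the kind of git-annex setup joeyh describes at 18:18, spread across a netbook and an external drive. The paths, file name, remote name, and repository descriptions are hypothetical; the per-repository stats quoted in the log come from `git annex status` in releases of that era (newer versions call it `git annex info`):

  # on the netbook: create a repository and check a large file into the annex
  mkdir ~/annex && cd ~/annex
  git init
  git annex init "netbook"
  git annex add big-archive.tar
  git commit -m "add big-archive.tar"

  # on an external drive: clone the repository and pull the file's content over
  git clone ~/annex /media/usb/annex
  cd /media/usb/annex
  git annex init "usb drive"
  git annex get big-archive.tar

  # back on the netbook: record the usb copy, then drop the local content
  cd ~/annex
  git remote add usb /media/usb/annex
  git annex sync usb
  git annex drop big-archive.tar

  # repository-wide stats ("local annex keys", "known annex size", ...)
  git annex status

Git tracks which repository holds which content by checksum (SHA-256 by default, per the 18:06 correction), so the metadata for all 41578 keys can live on the netbook even though only a fraction of the 7 TB of content does.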