[00:07] <_shane_> Did the file command against the file www.geocities.com and discovered it's a tar file :)
[00:07] <_shane_> I'm now on my way...
[00:07] :)
[01:09] Yahoo Video continues its devastating death march into archive.org
[01:09] I am still working on #jsmess
[01:13] wow, came across a reference to an obscure, 100-year-old book in my genealogy research, go to archive.org, BAM full pdf http://www.archive.org/details/recordlambton00beeruoft
[05:53] DFJustin: 's good shit innit
[05:55] nice DFJustin :)
[06:14] lol SketchCow, you're a crazy motherfucker :)
[06:15] how much space does archive.org occupy atm? it sounds like a lot of TBs are going up pretty much all the time
[06:45] There was an article somewhere that said archive.org loses two or three hard drives a day
[06:46] Archive.org is in the top single digits of petabytes.
[06:50] That's a lotta terabytes.
[06:50] Over 9000? :P
[07:15] maaaaybe
[07:18] Wow, so secretive
[13:53] anyone have experience with Catweasel cards?
[13:54] someone (awesome) gave me a minty fresh amiga 2000 and naturally i need to make some disks
[14:08] as I understand it kryoflux is better than catweasel
[14:13] excellent, ill look into that
[14:13] thanks
[14:16] Moooorning
[14:16] Kryoflux is much better than catweasel. Period.
[14:17] do you gentlemen have any interest in "scene" pirated amiga games, etc
[14:18] all legal/piracy concerns aside, are these early loaders/cracks (sweet music included) of interest? i find them fascinating :)
[14:19] Everything is interesting
[14:20] <_shane_> Sold my Catweasel, very unimpressed. Each firmware upgrade lost a feature I needed, and it could never read/write a C64 disk, only work with images.
[14:22] I think it's so cute how you say "all legal/piracy concerns aside"
[14:22] I think you need to stop saying that.
[14:23] I think you need to concentrate on acquiring material before it disappears forever.
[14:23] Also, I can save you time.
[14:24] When you say "Are these early ___digital_items____ of interest", you're asking the wrong question.
[14:25] The question is actually "I have ___size_and_quantity___ of early ___digital_items___ available - where do I put them?"
[14:28] understood, thanks :)
[14:29] Now, ask your question.
[14:30] im going to make an attempt to image a large stack of amiga disks, of mostly pirated games
[14:30] of the sample i tested last night, a majority of the disks still work
[14:31] After you buy a kryoflux.
[14:31] Be sure to buy a TEAC drive too
[14:31] FD-235-ish?
[14:34] i've got sony, alps, and epson drives kicking around, i need to look into this more
[14:45] http://forum.kryoflux.com/viewtopic.php?f=3&t=4 is a list
[15:49] http://code.google.com/apis/sidewiki/docs/2.0/reference_guide.html#Feeds
[15:49] Well now.
[15:52] Is Sidewiki closing?
[15:53] Yes
[15:53] December 1st
[16:20] Hmm. ipv6.google.com/sidewiki doesn't work; plus, you'd need a list of domains that have sidewiki entries, or do some random querying.
[21:50] Hi guys, let me quickly repeat this: Please run the me.com/mac.com download script, if you can. There's a lot of stuff to download. http://www.archiveteam.org/index.php?title=MobileMe
[21:56] okay!
[22:05] Actually, hmm, it doesn't always work.
[22:06] public.me.com, gallery.me.com, homepage.mac.com do, but web.me.com is hard.
[22:11] When you're truly ready, alard, let's talk about it and then I'll get the word out.
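(A minimal sketch of checking which of those MobileMe hosts answer for a given username; the URL layouts and the example username are assumptions, and the real download script does far more than this.)

    #!/bin/bash
    # Probe the main MobileMe hosts for one user and print the HTTP status codes.
    USER=exampleuser   # hypothetical username
    for url in \
        "http://public.me.com/$USER" \
        "http://gallery.me.com/$USER" \
        "http://homepage.mac.com/$USER/" \
        "http://web.me.com/$USER/"
    do
        code=$(curl -s -o /dev/null -w '%{http_code}' -L "$url")
        echo "$code  $url"
    done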
[22:13] Yeah, that's probably better. :) Just found out that most sites do okay, but some don't. iWeb is really tricky: you can ask for a file listing of the entire site, but for some reason that does not include the iWeb files.
[22:14] should I halt my process?
[22:15] No, most things do work.
[22:15] okay cool
[22:15] It's just the web.me.com that's sometimes missing.
[22:16] So you'll probably have to redo those bits later, but the bulk of the data is on public and gallery.me.com.
[22:16] And it's very helpful if you run it too, since that may produce new errors that I don't get.
[22:27] should the mac.com page be added to the news on the main page?
[22:29] alard: have there been any changes to the without-warctools branch? I've already got that built as of last week
[22:30] Yes, a few, but I didn't commit them (or push any commits to github). The latest tar.gz is the best version.
[22:30] One thing that's fixed in there is a memory leak if you have GnuTLS.
[22:31] uh...
[22:31] does that leak affect http?
[22:31] (with warc)
[22:31] No, but it is a problem (at least for me) for https://public.me.com/.
[22:32] i was just wondering because I managed to get oom'd on a fetch attempt last week
[22:32] When I tried downloading a user with a lot of public files, wget ran out of memory. It initializes the SSL library for each download but never de-initializes the previous one.
[22:32] I don't think that it has anything to do with http.
[22:33] Maybe your problems have more to do with the way wget stores the lists of files?
[22:33] most likely
[22:34] I've been trying to get wget to store these lists in a Berkeley DB database.
[22:34] i was considering hacking in a simple storage system
[22:35] That may be something. It stores the urls several times: in a queue, but there are also three or four lists with things it has done.
[22:36] I've added bdb storage for the queue, but it doesn't really solve the problem; most of the weight seems to be in the other lists.
[22:44] btw, du -hs won't always give you the actual size of the items. zfs with compression will show the compressed size, not the uncompressed size
[22:46] --apparent should fix that
[22:55] Ah, okay, didn't know that. I'll change that.
[22:56] (du shows actual disk size, so it would also mis-report sparse files, unless --apparent is used)
[22:57] The web.me.com download is becoming rather inefficient, by the way: first, do a pass with --mirror to see if there's anything iWeb-like. Second, look for feed.xml in every directory. Third, download everything again, now with the pages from feed.xml (and the files from the webdav index).
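(A loose sketch of those three passes, for illustration only: the hostname layout, paths, and feed.xml handling here are assumptions, and the actual script almost certainly differs.)

    #!/bin/bash
    # Illustrative three-pass fetch for an iWeb site on web.me.com.
    USER=exampleuser                  # hypothetical username
    BASE="http://web.me.com/$USER"

    # Pass 1: mirror whatever is directly linked, to see if anything iWeb-like exists.
    wget --mirror --no-parent "$BASE/"

    # Pass 2: probe every mirrored directory for an iWeb feed.xml.
    find "web.me.com/$USER" -type d | while read -r dir; do
        wget -q "http://$dir/feed.xml" -O "$dir/feed.xml" || rm -f "$dir/feed.xml"
    done

    # Pass 3: re-download, seeding wget with the page URLs listed in the feeds.
    find "web.me.com/$USER" -name feed.xml -exec grep -hoE 'https?://[^<" ]+' {} + | sort -u > urls.txt
    wget --input-file=urls.txt --page-requisites --no-parent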
[22:57] alard: do you have chroot /mnt/sdcard/foot /bin/sh
[22:57] cd /
[22:57] export PATH=$PATH:/bin
[22:57] ls
[22:57] erm crap
[22:57] wrong fucking clipboard
[22:58] alard: do you have username 'trickey'? if not, throw my friend trillian into the hopper
[22:58] chronomex: No. :)
[22:59] I do have a 'lancetrickey', but no 'trickey'.
[23:00] ok
[23:00] I can't think of anyone else I know with a mac.com account
[23:00] oh I know
[23:00] brb
[23:01] what about bdemoss?
[23:02] though he stopped using homepage.mac.com a couple of years ago
[23:02] bdemoss, yes.
[23:03] so much broken stuff on that site, including apple-provided hit counters and stuff
[23:04] hmm
[23:04] will this grab quicktime movies and stuff?
[23:04] It should, but please check.
[23:05] tossing brad's username at it
[23:06] hmm.. when I try manually in firefox, I get a 403 on this video :-\
[23:09] your cleanup script for the xml file list is missing a tag
[23:10] one of the urls.txt lines is: http://web.me.com/bdemoss
[23:17] bleh
[23:17] wget doesn
[23:17] er
[23:18] wget doesn't seem to handle object tags, so it didn't even try to fetch the quicktime files
[23:19] at least not from the homepage side. I don't know if the files would have shown up elsewhere if they are still there
[23:19] Ah, on a normal website, I thought you meant quicktime mov in the gallery.
[23:23] That should be fixed now.
[23:37] Hi Jason,
[23:37] I'm a friend of Matt Schwartz's working on a profile of the Archive Team for Technology Review. After reading an article about Gmail hackers in the Atlantic last week, I've developed a belated interest in the importance of backing up files locally/not trusting the cloud, so I'm definitely sympathetic to your cause. Also, my parents are hoarders (as in a house, car, and a warehouse packed with stuff), for whatever that's worth. Can we talk or meetu
[23:38] clipped at "Can we talk or meetu"
[23:44] ...some time.
[23:45] So there's that.
[23:45] Archive Team online presentation in Brussels
[23:48] Coderjoe, chronomex: The script should now download more of iWeb, so if you could do a git pull ...
[23:49] can I pull while running or should I stop first?
[23:51] does your heroku tracker have a means to release a username given for a request?
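(On the git pull question, a hedged sketch of one way to pick up script updates between downloads rather than mid-run; the checkout path, script name, and usernames.txt list are hypothetical, not the actual project layout.)

    #!/bin/bash
    # Illustrative update-then-run loop: pull script fixes (like the improved
    # iWeb handling) between users instead of while a download is in flight.
    cd "$HOME/mobileme-grab" || exit 1     # hypothetical checkout path
    while read -r user; do
        git pull --ff-only                 # pick up any committed fixes
        ./download-user.sh "$user"         # hypothetical per-user download script
    done < usernames.txt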