#archiveteam 2011-11-02,Wed

Time Nickname Message
00:07 🔗 _shane_ Ran the file command against the file www.geocities.com and discovered it's a tar file :)
00:07 🔗 _shane_ I'm now on my way...
00:07 🔗 chronomex :)
01:09 🔗 SketchCow Yahoo Video continues its devastating death march into archive.org
01:09 🔗 SketchCow I am still working on #jsmess
01:13 🔗 DFJustin wow, came across a reference to an obscure, 100-year old book in my genealogy research, go to archive.org, BAM full pdf http://www.archive.org/details/recordlambton00beeruoft
05:53 🔗 chronomex DFJustin: 's good shit innit
05:55 🔗 inv nice DFJustin :)
06:14 🔗 inv lol SketchCow, you're a crazy motherfucker :)
06:15 🔗 inv how much space does archive.org occupy atm? it sounds like a lot of TBs are going up pretty much all the time
06:45 🔗 BlueMax There was an article somewhere that said archive.org loses two or three hard drives a day
06:46 🔗 SketchCow Archive.org is in the top single digits of petabytes.
06:50 🔗 BlueMax That's a lotta terabytes.
06:50 🔗 BlueMax Over 9000? :P
07:15 🔗 chronomex maaaaybe
07:18 🔗 BlueMax Wow, so secretive
13:53 🔗 lowtekk anyone have experience with Catweasel cards?
13:54 🔗 lowtekk someone (awesome) gave me a minty fresh amiga 2000 and naturally i need to make some disks
14:08 🔗 DFJustin as I understand it kryoflux is better than catweasel
14:13 🔗 lowtekk excellent, ill look into that
14:13 🔗 lowtekk thanks
14:16 🔗 SketchCow Moooorning
14:16 🔗 SketchCow Kryoflux is much better than catweasel. Period.
14:17 🔗 lowtekk do you gentlemen have any interest in "scene" pirated amiga games, etc
14:18 🔗 lowtekk all legal/piracy concerns aside, are these early loaders/cracks (sweet music included) of interest? i find them fascinating :)
14:19 🔗 ersi Everything is interesting
14:20 🔗 _shane_ Sold my Catweasel, very unimpressed. each firmware upgrade lost a feature i needed, and it could never read or write a c64 disk, only work with images.
14:22 🔗 SketchCow I think it's so cute how you say "all legal/piracy concerns aside"
14:22 🔗 SketchCow I think you need to stop saying that.
14:23 🔗 SketchCow I think you need to concentrate on acquiring material before it disappears forever.
14:23 🔗 SketchCow Also, I can save you time.
14:24 🔗 SketchCow When you say "Are these early ___digital_items____ of interest", you're asking the wrong question.
14:25 🔗 SketchCow The question is actually "I have ___size_and_quantity___ of early ___digital_items___ available - where do I put them?"
14:28 🔗 lowtekk understood, thanks :)
14:29 🔗 SketchCow Now, ask your question.
14:30 🔗 lowtekk im going to make an attempt to image a large stack of amiga disks, of mostly pirated games
14:30 🔗 lowtekk of the sample i tested last night, a majority of the disks still work
14:31 🔗 SketchCow After you buy a kryoflux.
14:31 🔗 SketchCow Be sure to buy a TEAC drive too
14:31 🔗 lowtekk FD-235-ish?
14:34 🔗 lowtekk i've got sony's, alps, and epson drives kicking around, i need to look into this more
14:45 🔗 SketchCow http://forum.kryoflux.com/viewtopic.php?f=3&t=4 is a list
15:49 🔗 SketchCow http://code.google.com/apis/sidewiki/docs/2.0/reference_guide.html#Feeds
15:49 🔗 SketchCow Well now.
15:52 🔗 alard Is Sidewiki closing?
15:53 🔗 SketchCow Yes
15:53 🔗 SketchCow December 1st
16:20 🔗 alard Hmm. ipv6.google.com/sidewiki doesn't work; plus, you'd need a list of domains that have sidewiki entries, or do some random querying.
21:50 🔗 alard Hi guys, let me quickly repeat this: Please run the me.com/mac.com download script, if you can. There's a lot of stuff to download. http://www.archiveteam.org/index.php?title=MobileMe
21:56 🔗 chronomex okay!
22:05 🔗 alard Actually, hmm, it doesn't always work.
22:06 🔗 alard public.me.com, gallery.me.com, homepage.mac.com do, but web.me.com is hard.
22:11 🔗 SketchCow When you're truly ready, alard, let's talk about it and then I'll get the word out.
22:13 🔗 alard Yeah, that's probably better. :) Just found out that most sites do okay, but some don't. iWeb is really tricky: you can ask for a file listing of the entire site, but for some reason that does not include the iWeb files.
22:14 🔗 chronomex should I halt my process?
22:15 🔗 alard No, most things do work.
22:15 🔗 chronomex okay cool
22:15 🔗 alard It's just the web.me.com that's sometimes missing.
22:16 🔗 alard So you'll probably have to redo those bits later, but the bulk of the data is on public and gallery.me.com.
22:16 🔗 alard And it's very helpful if you run it too, since that may produce new errors that I don't get.
22:27 🔗 Coderjoe should the mac.com page be added to the news on the main page?
22:29 🔗 Coderjoe alard: have there been any changes to the without-warctools branch? I've already got that built as of last week
22:30 🔗 alard Yes, a few, I didn't commit (nor push any commits to github). The latest tar.gz is the best version.
22:30 🔗 alard One thing that's fixed in there is a memory leak if you have GnuTLS.
22:31 🔗 Coderjoe uh...
22:31 🔗 Coderjoe does that leak affect http?
22:31 🔗 Coderjoe (with warc)
22:31 🔗 alard No, but it is a problem (at least for me) for https://public.me.com/.
22:32 🔗 Coderjoe i was just wondering because I managed to get oom'd on a fetch attempt last week
22:32 🔗 alard When I tried downloading a user with a lot of public files, wget ran out of memory. It initializes the SSL library for each download but never de-initializes the previous one.
22:32 🔗 alard I don't think that it has anything to do with http.
22:33 🔗 alard Maybe your problems have more to do with the way wget stores the lists of files?
22:33 🔗 Coderjoe most likely
22:34 🔗 alard I've been trying to get wget to store these lists in a Berkeley DB database.
22:34 🔗 Coderjoe i was considering hacking in a simple storage system
22:35 🔗 alard That may be something. It stores the urls several times: in a queue, but there are also three or four lists with things it has done.
22:36 🔗 alard I've added bdb-storage to the queue, doesn't really solve the problem, most of the weight seems to be in the other lists.
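
The idea discussed above, keeping a crawler's URL queue and its "already done" list on disk rather than in memory, might look roughly like the sketch below. It is not wget's code and not alard's bdb-storage patch: it swaps in Python's sqlite3 module for Berkeley DB, and the class name UrlStore and its methods are hypothetical.

```python
import sqlite3


class UrlStore:
    """Disk-backed URL queue plus 'already seen' set (hypothetical helper)."""

    def __init__(self, path=":memory:"):
        # Pass a file path to keep the lists on disk; ":memory:" is for demos only.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS urls ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " url TEXT UNIQUE NOT NULL,"
            " done INTEGER NOT NULL DEFAULT 0)"
        )

    def add(self, url):
        """Queue a URL; return False if it was already known."""
        try:
            self.db.execute("INSERT INTO urls (url) VALUES (?)", (url,))
        except sqlite3.IntegrityError:
            # The UNIQUE constraint rejected a duplicate URL.
            return False
        self.db.commit()
        return True

    def next(self):
        """Return the oldest URL not yet fetched and mark it done, or None."""
        row = self.db.execute(
            "SELECT id, url FROM urls WHERE done = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self.db.execute("UPDATE urls SET done = 1 WHERE id = ?", (row[0],))
        self.db.commit()
        return row[1]
```

A single table serves as both the queue and the duplicate check, so each URL is stored once instead of in several separate lists.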
22:44 🔗 Coderjoe btw, du -hs won't always give you the actual size of the items. zfs with compression will show the compressed size, not the uncompressed size
22:46 🔗 chronomex --apparent should fix that
22:55 🔗 alard Ah, okay, didn't know that. I'll change that.
22:56 🔗 Coderjoe (du shows actual disk size, so it would also mis-report sparse files, unless --apparent is used)
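
The difference Coderjoe describes, allocated disk space versus logical file length, can be checked per file with a small sketch like this one. GNU du spells the full flag --apparent-size. The helper name sizes is hypothetical, and the 512-byte unit for st_blocks is assumed to hold as it does on Linux; st_blocks is not available on Windows.

```python
import os
import tempfile


def sizes(path):
    """Return (apparent_bytes, allocated_bytes) for one file."""
    st = os.stat(path)
    # st_size is the logical length, which is what du --apparent-size sums.
    # st_blocks counts 512-byte units actually allocated, which is what
    # plain du reports (smaller for sparse or compressed files).
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sparse = os.path.join(tmp, "sparse.bin")
        with open(sparse, "wb") as f:
            f.seek(10 * 1024 * 1024 - 1)  # leave a hole of almost 10 MiB
            f.write(b"\0")
        apparent, allocated = sizes(sparse)
        print("apparent:", apparent, "allocated:", allocated)
```

On a filesystem that supports sparse files, the allocated figure for the demo file will usually be far smaller than the apparent one.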
22:57 🔗 alard The web.me.com download is becoming rather inefficient, by the way: first, do a pass with --mirror to see if there's anything iWeb-like. Second, look for feed.xml in every directory. Third, download everything again, now with the pages from feed.xml (and the files from the webdav index).
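
The second pass alard mentions, looking for feed.xml in every directory found by the first pass, could be prepared with a sketch like the following. It only builds the list of candidate URLs and does no fetching. The function name feed_candidates and the example URLs are hypothetical; this is not taken from the actual download script.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def feed_candidates(urls, feed_name="feed.xml"):
    """Return sorted, de-duplicated <directory>/feed.xml URLs for the given URLs."""
    found = set()
    for url in urls:
        parts = urlsplit(url)
        path = parts.path or "/"
        # A path ending in "/" names a directory; otherwise use its parent.
        directory = path if path.endswith("/") else posixpath.dirname(path)
        if not directory.endswith("/"):
            directory += "/"
        # Query strings and fragments are dropped from the candidate URL.
        found.add(
            urlunsplit((parts.scheme, parts.netloc, directory + feed_name, "", ""))
        )
    return sorted(found)
```

The resulting list can then be handed to whatever tool does the third pass, so each directory is probed for a feed only once.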
22:57 🔗 chronomex alard: do you have chroot /mnt/sdcard/foot /bin/sh
22:57 🔗 chronomex cd /
22:57 🔗 chronomex export PATH=$PATH:/bin
22:57 🔗 chronomex ls
22:57 🔗 chronomex erm crap
22:57 🔗 chronomex wrong fucking clipboard
22:58 🔗 chronomex alard: do you have username 'trickey'? if not, throw my friend trillian into the hopper
22:58 🔗 alard chronomex: No. :)
22:59 🔗 alard I do have a 'lancetrickey', but no 'trickey'.
23:00 🔗 chronomex ok
23:00 🔗 chronomex I can't think of anyone else I know with a mac.com account
23:00 🔗 chronomex oh I know
23:00 🔗 chronomex brb
23:01 🔗 Coderjoe what about bdemoss?
23:02 🔗 Coderjoe though he's stopped using homepage.mac.com a couple years ago
23:02 🔗 alard bdemoss, yes.
23:03 🔗 Coderjoe so much broken stuff on that site, including apple-provided hit counters and stuff
23:04 🔗 Coderjoe hmm
23:04 🔗 Coderjoe will this grab quicktime movies and stuff?
23:04 🔗 alard It should, but please check.
23:05 🔗 Coderjoe tossing brad's username at it
23:06 🔗 Coderjoe hmm.. when I try manually in firefox, I get a 403 on this video :-\
23:09 🔗 Coderjoe your cleanup script for the xml file list is missing a tag
23:10 🔗 Coderjoe one of the urls.txt lines is: http://web.me.com/bdemoss</id>
23:17 🔗 Coderjoe bleh
23:17 🔗 Coderjoe wget doesn
23:17 🔗 Coderjoe er
23:18 🔗 Coderjoe wget doesn't seem to handle object tags, so it didn't even try to fetch the quicktime files
23:19 🔗 Coderjoe at least not from the homepage side. I don't know if the files would have shown up elsewhere if they are still there
23:19 🔗 alard Ah, on a normal website, I thought you meant quicktime mov in the gallery.
23:23 🔗 alard The </id> should be fixed now.
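
A cleanup step of the kind Coderjoe reported as incomplete, removing leftover XML tags from urls.txt lines, might be sketched like this. It is not alard's script: the function name clean_url_line and the regular expression are assumptions made for illustration.

```python
import re

# Matches an opening or closing XML tag such as <id> or </id>.
_TAG = re.compile(r"</?[A-Za-z][^<>]*>")


def clean_url_line(line):
    """Strip XML tags and surrounding whitespace from one urls.txt line."""
    return _TAG.sub("", line).strip()
```

Matching any tag, rather than a fixed list of tag names, avoids the situation where a single tag is left out of the cleanup.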
23:37 🔗 SketchCow Hi Jason,
23:37 🔗 SketchCow I'm a friend of Matt Schwartz's working on a profile of the Archive Team for Technology Review. After reading an article about Gmail hackers in the Atlantic last week, I've developed a belated interest in the importance of backing up files locally/not trusting the cloud, so I'm definitely sympathetic to your cause. Also, my parents are hoarders (as in a house, car, and a warehouse packed with stuff), for whatever that's worth. Can we talk or meetu
23:38 🔗 Coderjoe clipped at "Can we talk or meetu"
23:44 🔗 SketchCow ...some time.
23:45 🔗 SketchCow So there's that.
23:45 🔗 SketchCow Archive Team online presentation in Brussels
23:48 🔗 alard Coderjoe, chronomex: The script should now download more of iWeb, so if you could do a git pull ...
23:49 🔗 chronomex can I pull while running or should I stop first?
23:51 🔗 Coderjoe does your heroku tracker have a means to release a username given for a request?
