Time |
Nickname |
Message |
00:07
🔗
|
_shane_ |
Ran the file command against the file www.geocities.com and discovered it's a tar file :) |
00:07
🔗
|
_shane_ |
I'm now on my way... |
00:07
🔗
|
chronomex |
:) |
01:09
🔗
|
SketchCow |
Yahoo Video continues its devastating death march into archive.org |
01:09
🔗
|
SketchCow |
I am still working on #jsmess |
01:13
🔗
|
DFJustin |
wow, came across a reference to an obscure, 100-year old book in my genealogy research, go to archive.org, BAM full pdf http://www.archive.org/details/recordlambton00beeruoft |
05:53
🔗
|
chronomex |
DFJustin: 's good shit innit |
05:55
🔗
|
inv |
nice DFJustin :) |
06:14
🔗
|
inv |
lol SketchCow, you're a crazy motherfucker :) |
06:15
🔗
|
inv |
how much space does archive.org occupy atm? it sounds like a lot of TBs are going up pretty much all the time |
06:45
🔗
|
BlueMax |
There was an article somewhere that said archive.org loses two or three hard drives a day |
06:46
🔗
|
SketchCow |
Archive.org is in the top single digits of petabytes. |
06:50
🔗
|
BlueMax |
That's a lotta terabytes. |
06:50
🔗
|
BlueMax |
Over 9000? :P |
07:15
🔗
|
chronomex |
maaaaybe |
07:18
🔗
|
BlueMax |
Wow, so secretive |
13:53
🔗
|
lowtekk |
anyone have experience with Catweasel cards? |
13:54
🔗
|
lowtekk |
someone (awesome) gave me a minty fresh amiga 2000 and naturally i need to make some disks |
14:08
🔗
|
DFJustin |
as I understand it kryoflux is better than catweasel |
14:13
🔗
|
lowtekk |
excellent, ill look into that |
14:13
🔗
|
lowtekk |
thanks |
14:16
🔗
|
SketchCow |
Moooorning |
14:16
🔗
|
SketchCow |
Kryoflux is much better than catweasel. Period. |
14:17
🔗
|
lowtekk |
do you gentlemen have any interest in "scene" pirated amiga games, etc |
14:18
🔗
|
lowtekk |
all legal/piracy concerns aside, are these early loaders/cracks (sweet music included) of interest? i find them fascinating :) |
14:19
🔗
|
ersi |
Everything is interesting |
14:20
🔗
|
_shane_ |
Sold my Catweasel, very unimpressed. Each firmware upgrade lost a feature I needed, and it could never read or write a C64 disk, only work with images. |
14:22
🔗
|
SketchCow |
I think it's so cute how you say "all legal/piracy concerns aside" |
14:22
🔗
|
SketchCow |
I think you need to stop saying that. |
14:23
🔗
|
SketchCow |
I think you need to concentrate on acquiring material before it disappears forever. |
14:23
🔗
|
SketchCow |
Also, I can save you time. |
14:24
🔗
|
SketchCow |
When you say "Are these early ___digital_items____ of interest", you're asking the wrong question. |
14:25
🔗
|
SketchCow |
The question is actually "I have ___size_and_quantity___ of early ___digital_items___ available - where do I put them?" |
14:28
🔗
|
lowtekk |
understood, thanks :) |
14:29
🔗
|
SketchCow |
Now, ask your question. |
14:30
🔗
|
lowtekk |
im going to make an attempt to image a large stack of amiga disks, of mostly pirated games |
14:30
🔗
|
lowtekk |
of the sample i tested last night, a majority of the disks still work |
14:31
🔗
|
SketchCow |
After you buy a kryoflux. |
14:31
🔗
|
SketchCow |
Be sure to buy a TEAC drive too |
14:31
🔗
|
lowtekk |
FD-235-ish? |
14:34
🔗
|
lowtekk |
i've got sony's, alps, and epson drives kicking around, i need to look into this more |
14:45
🔗
|
SketchCow |
http://forum.kryoflux.com/viewtopic.php?f=3&t=4 is a list |
15:49
🔗
|
SketchCow |
http://code.google.com/apis/sidewiki/docs/2.0/reference_guide.html#Feeds |
15:49
🔗
|
SketchCow |
Well now. |
15:52
🔗
|
alard |
Is Sidewiki closing? |
15:53
🔗
|
SketchCow |
Yes |
15:53
🔗
|
SketchCow |
December 1st |
16:20
🔗
|
alard |
Hmm. ipv6.google.com/sidewiki doesn't work; plus, you'd need a list of domains that have sidewiki entries, or do some random querying. |
21:50
🔗
|
alard |
Hi guys, let me quickly repeat this: Please run the me.com/mac.com download script, if you can. There's a lot of stuff to download. http://www.archiveteam.org/index.php?title=MobileMe |
21:56
🔗
|
chronomex |
okay! |
22:05
🔗
|
alard |
Actually, hmm, it doesn't always work. |
22:06
🔗
|
alard |
public.me.com, gallery.me.com, homepage.mac.com do, but web.me.com is hard. |
22:11
🔗
|
SketchCow |
When you're truly ready, alard, let's talk about it and then I'll get the word out. |
22:13
🔗
|
alard |
Yeah, that's probably better. :) Just found out that most sites do okay, but some don't. iWeb is really tricky: you can ask for a file listing of the entire site, but for some reason that does not include the iWeb files. |
22:14
🔗
|
chronomex |
should I halt my process? |
22:15
🔗
|
alard |
No, most things do work. |
22:15
🔗
|
chronomex |
okay cool |
22:15
🔗
|
alard |
It's just the web.me.com that's sometimes missing. |
22:16
🔗
|
alard |
So you'll probably have to redo those bits later, but the bulk of the data is on public and gallery.me.com. |
22:16
🔗
|
alard |
And it's very helpful if you run it too, since that may produce new errors that I don't get. |
22:27
🔗
|
Coderjoe |
should the mac.com page be added to the news on the main page? |
22:29
🔗
|
Coderjoe |
alard: have there been any changes to the without-warctools branch? I've already got that built as of last week |
22:30
🔗
|
alard |
Yes, a few, I didn't commit (nor push any commits to github). The latest tar.gz is the best version. |
22:30
🔗
|
alard |
One thing that's fixed in there is a memory leak if you have GnuTLS. |
22:31
🔗
|
Coderjoe |
uh... |
22:31
🔗
|
Coderjoe |
does that leak affect http? |
22:31
🔗
|
Coderjoe |
(with warc) |
22:31
🔗
|
alard |
No, but it is a problem (at least for me) for https://public.me.com/. |
22:32
🔗
|
Coderjoe |
i was just wondering because I managed to get oom'd on a fetch attempt last week |
22:32
🔗
|
alard |
When I tried downloading a user with a lot of public files, wget ran out of memory. It initializes the SSL library for each download but never de-initializes the previous one. |
22:32
🔗
|
alard |
I don't think that it has anything to do with http. |
22:33
🔗
|
alard |
Maybe your problems have more to do with the way wget stores the lists of files? |
22:33
🔗
|
Coderjoe |
most likely |
22:34
🔗
|
alard |
I've been trying to get wget to store these lists in a Berkeley DB database. |
22:34
🔗
|
Coderjoe |
i was considering hacking in a simple storage system |
22:35
🔗
|
alard |
That may be something. It stores the urls several times: in a queue, but there are also three or four lists with things it has done. |
22:36
🔗
|
alard |
I've added bdb-storage to the queue, doesn't really solve the problem, most of the weight seems to be in the other lists. |
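A minimal sketch of the idea discussed above: keep the "already done" URL list on disk so it doesn't sit in memory. This is not wget's code and not alard's patch. It swaps in Python's built-in sqlite3 module in place of Berkeley DB, and the class name `SeenUrls` and the table name `seen` are made up for illustration.

```python
import sqlite3


class SeenUrls:
    """On-disk set of URLs, so the 'already done' list does not live in RAM."""

    def __init__(self, path=":memory:"):
        # ":memory:" is only for trying this out; pass a file path for real use.
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS seen (url TEXT PRIMARY KEY)")

    def add(self, url):
        """Record url; return True if it was new, False if already stored."""
        cur = self.db.execute(
            "INSERT OR IGNORE INTO seen (url) VALUES (?)", (url,)
        )
        self.db.commit()
        return cur.rowcount == 1

    def __contains__(self, url):
        row = self.db.execute(
            "SELECT 1 FROM seen WHERE url = ?", (url,)
        ).fetchone()
        return row is not None
```

One store like this could stand in for several of the in-memory lists, since membership checks and inserts both go through the primary key. Whether that would actually fix the memory use described here is untested.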
22:44
🔗
|
Coderjoe |
btw, du -hs won't always give you the actual size of the items. zfs with compression will show the compressed size, not the uncompressed size |
22:46
🔗
|
chronomex |
--apparent-size should fix that |
22:55
🔗
|
alard |
Ah, okay, didn't know that. I'll change that. |
22:56
🔗
|
Coderjoe |
(du shows actual disk size, so it would also mis-report sparse files, unless --apparent-size is used) |
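The difference between apparent size and on-disk size can be seen with a sparse file. The sketch below assumes a POSIX system (`st_blocks` is not available on Windows); the helper names `sizes` and `make_sparse` are hypothetical. The full GNU du option is `--apparent-size`.

```python
import os
import tempfile


def sizes(path):
    """Return (apparent_bytes, on_disk_bytes) for one file."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on POSIX systems.
    return st.st_size, st.st_blocks * 512


def make_sparse(path, length):
    """Create a file of the given apparent length without writing its data."""
    with open(path, "wb") as f:
        f.truncate(length)


tmpdir = tempfile.mkdtemp()
sparse_path = os.path.join(tmpdir, "sparse.bin")
make_sparse(sparse_path, 1 << 20)

apparent, on_disk = sizes(sparse_path)
print(apparent, on_disk)
```

The first number is what `du --apparent-size` would count. The second is what plain `du` reports, and it depends on the filesystem (holes, compression), so it may come out much smaller than the first.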
22:57
🔗
|
alard |
The web.me.com download is becoming rather inefficient, by the way: first, do a pass with --mirror to see if there's anything iWeb-like. Second, look for feed.xml in every directory. Third, download everything again, now with the pages from feed.xml (and the files from the webdav index). |
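The second pass ("look for feed.xml in every directory") could be sketched roughly like this. It is not alard's script: the function names are invented, and it assumes the first pass has already produced a plain list of fetched URLs.

```python
from urllib.parse import urljoin, urlsplit


def directories(urls):
    """Return every directory URL that contains one of the given URLs, sorted."""
    dirs = set()
    for url in urls:
        parts = urlsplit(url)
        # Drop the leading empty segment and the trailing file name (or "").
        segments = parts.path.split("/")[1:-1]
        base = f"{parts.scheme}://{parts.netloc}/"
        for i in range(1, len(segments) + 1):
            dirs.add(base + "/".join(segments[:i]) + "/")
    return sorted(dirs)


def feed_candidates(urls):
    """Build one feed.xml URL to try for each directory seen in the first pass."""
    return [urljoin(d, "feed.xml") for d in directories(urls)]
```

The resulting list would go into the third pass as extra seed URLs. Most candidates would presumably 404, which is part of why this approach is inefficient.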
22:57
🔗
|
chronomex |
alard: do you have chroot /mnt/sdcard/foot /bin/sh |
22:57
🔗
|
chronomex |
cd / |
22:57
🔗
|
chronomex |
export PATH=$PATH:/bin |
22:57
🔗
|
chronomex |
ls |
22:57
🔗
|
chronomex |
erm crap |
22:57
🔗
|
chronomex |
wrong fucking clipboard |
22:58
🔗
|
chronomex |
alard: do you have username 'trickey'? if not, throw my friend trillian into the hopper |
22:58
🔗
|
alard |
chronomex: No. :) |
22:59
🔗
|
alard |
I do have a 'lancetrickey', but no 'trickey'. |
23:00
🔗
|
chronomex |
ok |
23:00
🔗
|
chronomex |
I can't think of anyone else I know with a mac.com account |
23:00
🔗
|
chronomex |
oh I know |
23:00
🔗
|
chronomex |
brb |
23:01
🔗
|
Coderjoe |
what about bdemoss? |
23:02
🔗
|
Coderjoe |
though he stopped using homepage.mac.com a couple of years ago |
23:02
🔗
|
alard |
bdemoss, yes. |
23:03
🔗
|
Coderjoe |
so much broken stuff on that site, including apple-provided hit counters and stuff |
23:04
🔗
|
Coderjoe |
hmm |
23:04
🔗
|
Coderjoe |
will this grab quicktime movies and stuff? |
23:04
🔗
|
alard |
It should, but please check. |
23:05
🔗
|
Coderjoe |
tossing brad's username at it |
23:06
🔗
|
Coderjoe |
hmm.. when I try manually in firefox, I get a 403 on this video :-\ |
23:09
🔗
|
Coderjoe |
your cleanup script for the xml file list is missing a tag |
23:10
🔗
|
Coderjoe |
one of the urls.txt lines is: http://web.me.com/bdemoss</id> |
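For a stray tag like the `</id>` above, one defensive approach is to strip anything tag-shaped from each line before it goes into urls.txt. This is only a sketch; the actual cleanup script is not shown in the log, and `clean_url_line` is a made-up name.

```python
import re

# Matches any XML/HTML tag, opening or closing.
TAG = re.compile(r"<[^>]*>")


def clean_url_line(line):
    """Strip any XML tags and surrounding whitespace from one urls.txt line."""
    return TAG.sub("", line).strip()
```

This works on the assumption that URLs in the list never contain a literal `<`, which should hold for properly escaped XML.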
23:17
🔗
|
Coderjoe |
bleh |
23:17
🔗
|
Coderjoe |
wget doesn |
23:17
🔗
|
Coderjoe |
er |
23:18
🔗
|
Coderjoe |
wget doesn't seem to handle object tags, so it didn't even try to fetch the quicktime files |
23:19
🔗
|
Coderjoe |
at least not from the homepage side. I don't know if the files would have shown up elsewhere if they are still there |
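If wget really does skip object tags, a separate pass over the saved HTML could collect those URLs and feed them back in. A rough sketch with Python's standard html.parser follows; the class and function names are hypothetical, and the choice of `object data`, `embed src`, and `param name="src"/"movie"` is a guess at how QuickTime embeds of that era were typically marked up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MediaLinks(HTMLParser):
    """Collect media URLs from object, embed, and param tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "object" and attrs.get("data"):
            self._add(attrs["data"])
        elif tag == "embed" and attrs.get("src"):
            self._add(attrs["src"])
        elif tag == "param" and attrs.get("value"):
            # Only params that name a media source are of interest.
            if (attrs.get("name") or "").lower() in ("src", "movie"):
                self._add(attrs["value"])

    def _add(self, href):
        # Resolve relative links against the page URL and skip duplicates.
        url = urljoin(self.base_url, href)
        if url not in self.urls:
            self.urls.append(url)


def media_links(html, base_url):
    """Return absolute media URLs found in one HTML page."""
    parser = MediaLinks(base_url)
    parser.feed(html)
    parser.close()
    return parser.urls
```

The returned URLs could be appended to the list given to wget for a follow-up fetch. A 403 like the one seen in Firefox would still apply to anything the server refuses to serve.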
23:19
🔗
|
alard |
Ah, on a normal website, I thought you meant quicktime mov in the gallery. |
23:23
🔗
|
alard |
The </id> should be fixed now. |
23:37
🔗
|
SketchCow |
Hi Jason, |
23:37
🔗
|
SketchCow |
I'm a friend of Matt Schwartz's working on a profile of the Archive Team for Technology Review. After reading an article about Gmail hackers in the Atlantic last week, I've developed a belated interest in the importance of backing up files locally/not trusting the cloud, so I'm definitely sympathetic to your cause. Also, my parents are hoarders (as in a house, car, and a warehouse packed with stuff), for whatever that's worth. Can we talk or meetu |
23:38
🔗
|
Coderjoe |
clipped at "Can we talk or meetu" |
23:44
🔗
|
SketchCow |
...some time. |
23:45
🔗
|
SketchCow |
So there's that. |
23:45
🔗
|
SketchCow |
Archive Team online presentation in Brussels |
23:48
🔗
|
alard |
Coderjoe, chronomex: The script should now download more of iWeb, so if you could do a git pull ... |
23:49
🔗
|
chronomex |
can I pull while running or should I stop first? |
23:51
🔗
|
Coderjoe |
does your heroku tracker have a means to release a username given for a request? |