#archiveteam 2012-12-29,Sat


Time Nickname Message
09:20 🔗 kennethre is there any way to upload something to archive.org without creative commons?
09:21 🔗 DFJustin sure
09:21 🔗 kennethre i see, it's optional
09:27 🔗 kennethre there's no generic 'data' category?
09:27 🔗 kennethre has to be audio, movie, or text?
09:29 🔗 Coderjoe you're using the form, aren't you?
09:29 🔗 kennethre yes
09:29 🔗 kennethre is there an api?
09:29 🔗 kennethre sorry, i've never really investigated this before :)
09:29 🔗 Coderjoe there are other categories, just not available through the web form
09:30 🔗 kennethre ah excellent
09:30 🔗 Coderjoe http://archive.org/help/abouts3.txt
09:30 🔗 kennethre oh god, perfect
09:30 🔗 kennethre thank you
09:32 🔗 kennethre i'm building a 'blackbox' system for everything i ever create
09:32 🔗 kennethre and the goal is for it to be as permanent as possible
09:33 🔗 Coderjoe however, unless you are an admin, you can only upload to one of a few collections
09:33 🔗 kennethre Coderjoe: wonder if i can get a collection added for myself
09:33 🔗 Coderjoe (which the web form picked via the category you chose)
09:34 🔗 kennethre that'd be ideal
09:49 🔗 kennethre ideally i'll have a warc for everything too
09:49 🔗 kennethre but we'll see
10:10 🔗 chronomex Coderjoe: you can be added to the approve list for a collection, of course
10:27 🔗 Nemo_bis mediatype can be set to anything by anyone
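A rough sketch of what an upload through the S3-like API mentioned above (http://archive.org/help/abouts3.txt) might look like, with the mediatype set explicitly instead of picked by the web form. The header names are written from memory of that document and should be checked against it; the endpoint constant, the function names, and the example identifier and collection are placeholders for illustration, not anything confirmed in this channel.

```python
import os
import urllib.request
from urllib.parse import quote

# Assumed endpoint of the S3-like API; verify against abouts3.txt.
S3_ENDPOINT = "https://s3.us.archive.org"


def build_upload(identifier, filename, access_key, secret_key,
                 mediatype="data", collection=None):
    """Return (url, headers) for a PUT that uploads one file into an item.

    identifier is the item (bucket) name, filename the name inside it.
    """
    url = f"{S3_ENDPOINT}/{quote(identifier)}/{quote(filename)}"
    headers = {
        # IA's "LOW" scheme sends the key pair directly, with no request signing.
        "authorization": f"LOW {access_key}:{secret_key}",
        # Create the item on first upload if it does not exist yet.
        "x-archive-auto-make-bucket": "1",
        # Item metadata travels as x-archive-meta-* headers.
        "x-archive-meta-mediatype": mediatype,
    }
    if collection:
        # Only honoured for collections the account is allowed to write to.
        headers["x-archive-meta-collection"] = collection
    return url, headers


def upload(path, identifier, access_key, secret_key, **kwargs):
    """Read a local file and PUT it into the item; returns the HTTP status."""
    url, headers = build_upload(identifier, os.path.basename(path),
                                access_key, secret_key, **kwargs)
    with open(path, "rb") as fh:
        body = fh.read()
    request = urllib.request.Request(url, data=body, headers=headers,
                                     method="PUT")
    with urllib.request.urlopen(request) as response:
        return response.status
```

Something like upload("backup.warc.gz", "my-blackbox-2012", KEY, SECRET, mediatype="web") would then send the file. As noted above, which collections the upload is accepted into still depends on the account's permissions.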
10:27 🔗 godane i'm starting to hate the speed of ftp
10:27 🔗 Nemo_bis godane: only now?
10:28 🔗 godane it normally works fine
10:28 🔗 Nemo_bis No. It doesn't.
10:28 🔗 godane for me it does
10:28 🔗 godane but every so often the speed becomes very slow
10:28 🔗 Nemo_bis Maybe you're the only user left. https://archive.org/~tracey/mrtg/ftp.html
10:29 🔗 Nemo_bis Every time a single other person tries to use it, you're both ruined. ;)
10:29 🔗 Famicoman I'm using it
10:30 🔗 godane i'm not that good with scripting uploads to s3
10:30 🔗 Famicoman I kept getting errors that the drive was full earlier
10:30 🔗 kennethre is there anyone here i should bother for a 'kennethreitz' collection, or should i go through the normal process?
10:30 🔗 kennethre /cc @chronomex
10:30 🔗 chronomex hi
10:31 🔗 chronomex I think underscor or SketchCow are the people to ask
10:31 🔗 kennethre /cc underscor :)
11:40 🔗 godane i think s3 is very slow too
11:41 🔗 godane not just ftp
11:41 🔗 SketchCow What does this collection have?
11:42 🔗 GLaDOS WARCs of everything he's done.
12:14 🔗 Coderjoe what the
12:14 🔗 Coderjoe the ia donate page no longer has the 3-to-1 match blurb
12:15 🔗 ersi That's unfortunate, because maybe there's a few holding out to the absolute last day for some reason
12:15 🔗 Coderjoe the amounts reflect it, and the blog post about it says it goes to the 31st
12:16 🔗 Coderjoe but the progress meter is gone
12:17 🔗 Famicoman maybe the goal was reached?
12:18 🔗 ersi It was lacking 17k yesterday
12:18 🔗 Famicoman ah, doubtful then
12:55 🔗 * SmileyG looks in
13:13 🔗 kennethre SketchCow: i'm working on a continual archive of everything i create, including articles, tweets, photos, music, etc
13:13 🔗 kennethre SketchCow: the plan is to have it back itself up to archive.org in case I have an untimely demise :)
13:18 🔗 kennethre it's coming along quite nicely so far
13:18 🔗 kennethre http://blackbox.kennethreitz.org/records/1e7f3c62-96e4-4be4-a26a-f62c61ce939d
13:18 🔗 kennethre http://blackbox.kennethreitz.org/records/1e7f3c62-96e4-4be4-a26a-f62c61ce939d/download
14:03 🔗 * Nemo_bis has 1200 tasks waiting for admin. :/
15:45 🔗 push i think web archive should open up old 90s versions of sites, it sucks now that some domains seem to be totally gone due to a NEW robots.txt put on the active site?
15:45 🔗 ersi bla bla bla whine old bla bla
15:45 🔗 ersi It's been iterated over a billion times already.
15:47 🔗 push ah sorry, didnt think about that
15:48 🔗 ersi But I agree that it's unfortunate that some new owner of a domain can make the previous owner's data hidden in the Wayback Machine.
15:49 🔗 ersi There's a lot of data public from what I know, look in the crawldata collection @ IA. It's not everything though, I think. And besides, the data will continue to exist - it's just hidden/darkened (until it's public again, if IA undarks or robots.txt goes away)
15:51 🔗 push yeah, theres still a chance to see some of it some time later i guess
15:51 🔗 push it hasnt been a huge thing or anything, only a few sites
15:51 🔗 ersi Yeah, but it comes up so often it makes me almost angry every time it comes up
15:52 🔗 push i have had a similar reaction :P
15:52 🔗 ersi ^_^
15:53 🔗 push it's hard to solve though i would think, sometimes a legitimate owner wants to block the whole history and i reckon he should be able to
15:53 🔗 push i think other times they dont even know about IA maybe
15:54 🔗 push some have forbidden everything by default and it seems senseless
15:54 🔗 ersi I know that the Wayback Machine does an HTTP GET on the robots.txt when it's going to serve something from a crawled domain - every time
15:54 🔗 push ah
15:55 🔗 ersi Maybe I'm wrong, but I have a faint memory of that from fiddling with the code and trying to set Wayback Machine up (http://github.com/internetarchive/wayback/)
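The behaviour ersi describes (consult the site's current robots.txt before serving an old capture) can be illustrated with a small sketch. This is not the Wayback Machine's code, which is a separate Java project. It only uses Python's standard urllib.robotparser, and the function name and the "ia_archiver" user agent are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser


def capture_is_visible(robots_txt, captured_url, user_agent="ia_archiver"):
    """Decide whether an archived URL would be shown, given the text of the
    robots.txt that the live site serves right now.

    A real service would fetch robots_txt over HTTP at request time; here it
    is passed in as a string so the rule check can be tried offline.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, captured_url)


# A new domain owner who blocks everything hides every old capture too.
block_all = "User-agent: *\nDisallow: /\n"
print(capture_is_visible(block_all, "http://example.com/1998/index.html"))

# With no rules at all, the same capture stays visible.
print(capture_is_visible("", "http://example.com/1998/index.html"))
```

Because the check runs against today's robots.txt rather than the one in force when the page was crawled, a rule added years later applies to the whole history of the domain, which is the effect being discussed here.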
15:57 🔗 push guess it can also be tested, i have a couple old domains indexed i could set them up again and do before/after robots.txt
15:57 🔗 push but it does feel that way
15:57 🔗 push it was restrictive just earlier, a site is blocked and i was totally excited to see it
15:57 🔗 push some very old site
15:57 🔗 push brb
15:57 🔗 push ehe
15:59 🔗 ersi Yeah, sucks when you run into the problem
16:43 🔗 SketchCow That's an interesting tactic, kennethre
16:43 🔗 kennethre SketchCow: thanks, i like it more the longer i think about it
17:47 🔗 tef push: archive should have old copies of robots.txt ?
19:25 🔗 balrog_ anyone here familiar with archiving yahoo groups?
19:25 🔗 balrog_ I found this tool: http://grabyahoogroup.sourceforge.net
20:12 🔗 balrog_ it's giving me error 500s though
