Time |
Nickname |
Message |
09:20
🔗
|
kennethre |
is there any way to upload something to archive.org without creative commons? |
09:21
🔗
|
DFJustin |
sure |
09:21
🔗
|
kennethre |
i see, it's optional |
09:27
🔗
|
kennethre |
there's no generic 'data' category? |
09:27
🔗
|
kennethre |
has to be audio, movie, or text? |
09:29
🔗
|
Coderjoe |
you're using the form, aren't you? |
09:29
🔗
|
kennethre |
yes |
09:29
🔗
|
kennethre |
is there an api? |
09:29
🔗
|
kennethre |
sorry, i've never really investigated this before :) |
09:29
🔗
|
Coderjoe |
there are other categories, just not available through the web form |
09:30
🔗
|
kennethre |
ah excellent |
09:30
🔗
|
Coderjoe |
http://archive.org/help/abouts3.txt |
09:30
🔗
|
kennethre |
oh god, perfect |
09:30
🔗
|
kennethre |
thank you |
09:32
🔗
|
kennethre |
i'm building a 'blackbox' system for everything i ever create |
09:32
🔗
|
kennethre |
and the goal is for it to be as permanent as possible |
09:33
🔗
|
Coderjoe |
however, unless you are an admin, you can only upload to one of a few collections |
09:33
🔗
|
kennethre |
Coderjoe: wonder if i can get a collection added for myself |
09:33
🔗
|
Coderjoe |
(which the web form picked via the category you chose) |
09:34
🔗
|
kennethre |
that'd be ideal |
09:49
🔗
|
kennethre |
ideally i'll have a warc for everything too |
09:49
🔗
|
kennethre |
but we'll see |
10:10
🔗
|
chronomex |
Coderjoe: you can be added to the approve list for a collection, of course |
10:27
🔗
|
Nemo_bis |
mediatype can be set to anything by anyone |
10:27
🔗
|
godane |
i'm starting to hate the speed of ftp |
10:27
🔗
|
Nemo_bis |
godane: only now? |
10:28
🔗
|
godane |
it normally works fine |
10:28
🔗
|
Nemo_bis |
No. It doesn't. |
10:28
🔗
|
godane |
for me it does |
10:28
🔗
|
godane |
but every so often the speed becomes very slow |
10:28
🔗
|
Nemo_bis |
Maybe you're the only user left. https://archive.org/~tracey/mrtg/ftp.html |
10:29
🔗
|
Nemo_bis |
Every time a single other person tries to use it, you're both ruined. ;) |
10:29
🔗
|
Famicoman |
I'm using it |
10:30
🔗
|
godane |
i'm not that good with scripting uploads to s3 |
10:30
🔗
|
Famicoman |
I kept getting errors that the drive was full earlier |
10:30
🔗
|
kennethre |
is there anyone here i should bother for a 'kennethreitz' collection, or should i go through the normal process? |
10:30
🔗
|
kennethre |
/cc @chronomex |
10:30
🔗
|
chronomex |
hi |
10:31
🔗
|
chronomex |
I think underscor or SketchCow are the people to ask |
10:31
🔗
|
kennethre |
/cc underscor :) |
11:40
🔗
|
godane |
i think s3 is very slow too |
11:41
🔗
|
godane |
not just ftp |
11:41
🔗
|
SketchCow |
What does this collection have? |
11:42
🔗
|
GLaDOS |
WARCs of everything he's done. |
12:14
🔗
|
Coderjoe |
what the |
12:14
🔗
|
Coderjoe |
the ia donate page no longer has the 3-to-1 match blurb |
12:15
🔗
|
ersi |
That's unfortunate, because maybe there's a few holding out to the absolute last day for some reason |
12:15
🔗
|
Coderjoe |
the amounts reflect it, and the blog post about it says it goes to the 31st |
12:16
🔗
|
Coderjoe |
but the progress meter is gone |
12:17
🔗
|
Famicoman |
maybe the goal was reached? |
12:18
🔗
|
ersi |
It was lacking 17k yesterday |
12:18
🔗
|
Famicoman |
ah, doubtful then |
12:55
🔗
|
* |
SmileyG looks in |
13:13
🔗
|
kennethre |
SketchCow: i'm working on a continual archive of everything i create, including articles, tweets, photos, music, etc |
13:13
🔗
|
kennethre |
SketchCow: the plan is to have it back itself up to archive.org in case I have an untimely demise :) |
13:18
🔗
|
kennethre |
it's coming along quite nicely so far |
13:18
🔗
|
kennethre |
http://blackbox.kennethreitz.org/records/1e7f3c62-96e4-4be4-a26a-f62c61ce939d |
13:18
🔗
|
kennethre |
http://blackbox.kennethreitz.org/records/1e7f3c62-96e4-4be4-a26a-f62c61ce939d/download |
14:03
🔗
|
* |
Nemo_bis has 1200 tasks waiting for admin. :/ |
15:45
🔗
|
push |
i think web archive should open up old 90s versions of sites, it sucks now that some domains seem to be totally gone due to a NEW robots.txt put on the active site? |
15:45
🔗
|
ersi |
bla bla bla whine old bla bla |
15:45
🔗
|
ersi |
It's been iterated over a billion times already. |
15:47
🔗
|
push |
ah sorry, didnt think about that |
15:48
🔗
|
ersi |
But I agree that it's unfortunate that some new owner of a domain can make the previous owner's data hidden in the Wayback Machine. |
15:49
🔗
|
ersi |
There's a lot of data public from what I know, look in the crawldata collection @ IA. It's not everything though, I think. And besides, the data will continue to exist - it's just hidden/darkened (until it's public again, if IA undarks or robots.txt goes away) |
15:51
🔗
|
push |
yeah, theres still a chance to see some of it some time later i guess |
15:51
🔗
|
push |
it hasnt been a huge thing or anything, only a few sites |
15:51
🔗
|
ersi |
Yeah, but it comes up so often it makes me almost angry every time it comes up |
15:52
🔗
|
push |
i have had a similar reaction :P |
15:52
🔗
|
ersi |
^_^ |
15:53
🔗
|
push |
it's hard to solve though i would think, sometimes a legitimate owner wants to block the whole history and i reckon he should be able to |
15:53
🔗
|
push |
i think other times they dont even know about IA maybe |
15:54
🔗
|
push |
some have forbidden everything by default and it seems senseless |
15:54
🔗
|
ersi |
I know that the Wayback Machine does an HTTP GET on the robots.txt when it's going to serve something from a crawled domain - every time |
15:54
🔗
|
push |
ah |
15:55
🔗
|
ersi |
Maybe I'm wrong, but I have a faint memory of that from fiddling with the code and trying to set Wayback Machine up (http://github.com/internetarchive/wayback/) |
15:57
🔗
|
push |
guess it can also be tested, i have a couple old domains indexed i could set them up again and do before/after robots.txt |
15:57
🔗
|
push |
but it does feel that way |
15:57
🔗
|
push |
it was restrictive just earlier, a site is blocked and i was totally excited to see it |
15:57
🔗
|
push |
some very old site |
15:57
🔗
|
push |
brb |
15:57
🔗
|
push |
ehe |
15:59
🔗
|
ersi |
Yeah, sucks when you run into the problem |
16:43
🔗
|
SketchCow |
That's an interesting tactic, kennethre |
16:43
🔗
|
kennethre |
SketchCow: thanks, i like it more the longer i think about it |
17:47
🔗
|
tef |
push: archive should have old copies of robots.txt ? |
19:25
🔗
|
balrog_ |
anyone here familiar with archiving yahoo groups? |
19:25
🔗
|
balrog_ |
I found this tool: http://grabyahoogroup.sourceforge.net |
20:12
🔗
|
balrog_ |
it's giving me error 500s though |