| Time |
Nickname |
Message |
|
00:07
🔗
|
godane |
uploaded: http://archive.org/details/dltv_068_episode |
|
00:07
🔗
|
godane |
:-D |
|
03:04
🔗
|
shaqfu |
Does Efnet have a good browser client, a la mibbit? |
|
03:04
🔗
|
shaqfu |
Might need to log on from work to catch up with Schbirid; we're on opposite shifts :( |
|
03:32
🔗
|
chronomex |
shaqfu: http://chat.efnet.org/ |
|
03:34
🔗
|
shaqfu |
chronomex: Thanks; even has a snazzy new interface |
|
03:34
🔗
|
shaqfu |
Hopefully it doesn't get caught in the NetNanny :( |
|
04:05
🔗
|
godane |
SketchCow: what the hell happened here: http://archive.org/details/20040229-bbs-lcrumbling |
|
04:05
🔗
|
godane |
the ogg video and 512kb mpeg4 is ~12mb |
|
04:05
🔗
|
godane |
when the source mpeg2 is 1.3gb |
|
04:06
🔗
|
godane |
i think those files need to be recreated |
|
04:18
🔗
|
chronomex |
looks like |
|
04:31
🔗
|
godane |
maybe a fundraiser for jason scott to get all the bbs interviews up |
|
04:31
🔗
|
godane |
think maybe it will get him to do it |
|
05:00
🔗
|
Coderjoe |
wow |
|
05:00
🔗
|
Coderjoe |
that interview item has quite a long task history |
|
05:14
🔗
|
godane |
first 70 episodes of dl.tv are up |
|
05:33
🔗
|
DFJustin |
the video derivatives can be a lot smaller if the source is HD because it downscales |
|
05:34
🔗
|
DFJustin |
that said it doesn't seek properly in this case so something probably went wrong |
|
06:30
🔗
|
closure |
awesome, I just managed to use _Creating GeoCities Websites_ to round out an Amazon order and get free shipping |
|
06:32
🔗
|
Coderjoe |
yes, something went horribly wrong in the last derive.php process 3.6 years ago. it deleted a lot of thumbnails and the other derivatives and then, for some reason (ffmpeg bug? I don't see an error message in the log file) only created 4 thumbnail images and the short incomplete video files. |
|
06:32
🔗
|
chronomex |
yeah, I didn't see anything indicative of badness either |
|
06:35
🔗
|
godane |
Coderjoe: May explain why most mp3s for bbs docs are under 1kbyte too |
|
06:35
🔗
|
godane |
the ogg files look fine |
|
06:35
🔗
|
godane |
for the size |
|
06:39
🔗
|
underscor |
SketchCow will have to tap that item. I don't have permissions for that collection to trigger a rederive |
|
06:40
🔗
|
chronomex |
I'd tap that |
|
06:41
🔗
|
godane |
there is more that needs tapping |
|
06:42
🔗
|
underscor |
Actually. |
|
06:42
🔗
|
underscor |
I just got it through the secret backdoor way |
|
06:42
🔗
|
underscor |
http://archive.org/catalog.php?history=1&identifier=20040229-bbs-lcrumbling |
|
06:42
🔗
|
underscor |
Didn't know that worked. Neat. |
|
06:42
🔗
|
chronomex |
the internet archive glory hole system |
|
06:43
🔗
|
godane |
bad mp3: http://archive.org/details/1993-bbs-tbbstape |
|
06:44
🔗
|
godane |
mp3 file is 816.0b |
|
06:44
🔗
|
godane |
i want that compress if it worked |
|
06:44
🔗
|
chronomex |
haha |
|
06:44
🔗
|
underscor |
ok, let me make sure this derive runs properly first |
|
06:44
🔗
|
godane |
we all cause have the internet archive audio version on a hard drive |
|
06:45
🔗
|
underscor |
I don't want to fuck more than one item if the method I queued it with is fucked |
|
06:45
🔗
|
chronomex |
godane: what does that mean? I can't understand you |
|
06:45
🔗
|
underscor |
if mp3 really compressed that small |
|
06:45
🔗
|
underscor |
we could fit all of ia's audio on a HD |
|
06:45
🔗
|
godane |
maybe usb stick |
|
06:45
🔗
|
underscor |
lol |
|
06:46
🔗
|
godane |
even though i think IA will still need a big fat usb stick even at that compression |
|
06:46
🔗
|
underscor |
yay, 12 tasks from running |
|
06:47
🔗
|
underscor |
almost there |
|
06:47
🔗
|
godane |
same problem: http://archive.org/details/20040229-bbs-schinnell |
|
06:47
🔗
|
godane |
http://archive.org/details/20040128-bbs-willing |
|
06:47
🔗
|
underscor |
Running |
|
06:47
🔗
|
underscor |
http://www.us.archive.org/log_show.php?task_id=110466442 |
|
06:48
🔗
|
underscor |
Cross your fingers |
|
06:48
🔗
|
Coderjoe |
i was tempted to try mashing a button I can see for the item |
|
06:48
🔗
|
godane |
http://archive.org/details/22020525-bbs-milkyliz3 |
|
06:49
🔗
|
underscor |
it looks like the source file may be corrupt |
|
06:49
🔗
|
underscor |
but we'll see |
|
06:50
🔗
|
Coderjoe |
does the files.xml info match the file? |
|
06:50
🔗
|
underscor |
even if it is, our deriver is much better at recovering from shit than it was 5 years ago |
|
06:50
🔗
|
underscor |
ffmpeg improvements, etc |
|
06:50
🔗
|
godane |
i hope not |
|
06:50
🔗
|
underscor |
hm? |
|
06:51
🔗
|
godane |
jason scott has not updated anything in the bbs docs for 5 years |
|
06:51
🔗
|
godane |
it would be one thing if all the interviews were there |
|
06:51
🔗
|
godane |
but they're not |
|
06:51
🔗
|
godane |
i just fear the full interviews will be lost |
|
06:51
🔗
|
godane |
thats all |
|
06:53
🔗
|
underscor |
https://internetarchive.etherpad.mozilla.org/5 |
|
06:53
🔗
|
underscor |
Please list bad identifiers there |
|
06:53
🔗
|
underscor |
IDENTIFIERS ONLY |
|
06:54
🔗
|
underscor |
I will trigger a rederive |
|
06:54
🔗
|
Coderjoe |
this derive seems to be taking longer than the last one initiated by tracey, which is almost certainly a good thing. |
|
06:54
🔗
|
underscor |
absolutely |
|
06:54
🔗
|
Coderjoe |
(for the thumbnail generation step alone) |
|
06:55
🔗
|
underscor |
It's also much bigger than the TV she's deriving |
|
06:55
🔗
|
underscor |
so it makes sense |
|
06:55
🔗
|
Coderjoe |
no, I meant the last derive on this interview item |
|
06:56
🔗
|
Coderjoe |
back on one of the (now apparently defunct) ia3* nodes |
|
06:56
🔗
|
underscor |
oic |
|
06:56
🔗
|
underscor |
yes, ia3* is (was?) the thumper farm |
|
06:57
🔗
|
chronomex |
thumper? |
|
06:57
🔗
|
underscor |
http://archive.org/images/petabox-via.jpg |
|
06:57
🔗
|
underscor |
^ thumpers |
|
06:57
🔗
|
underscor |
they still exist/are in service, but repurposed |
|
06:57
🔗
|
underscor |
godane: Identifiers only please in that list |
|
06:57
🔗
|
underscor |
so not archive.org/details/blah |
|
06:57
🔗
|
underscor |
just blah |
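The "identifiers only" request above can be sketched as a small helper that reduces a pasted details URL to the bare identifier. This is only an illustration: the name `to_identifier` is hypothetical and not part of any archive.org tooling, and it assumes the identifier is the path segment that follows `/details/`.

```python
from urllib.parse import urlparse


def to_identifier(text):
    """Return the bare item identifier from a details URL, or the input unchanged."""
    text = text.strip()
    # Accept URLs pasted without a scheme, e.g. "archive.org/details/blah".
    parsed = urlparse(text if "://" in text else "http://" + text)
    parts = [p for p in parsed.path.split("/") if p]
    if "details" in parts:
        idx = parts.index("details")
        if idx + 1 < len(parts):
            return parts[idx + 1]
    # No /details/ segment: assume the input is already a bare identifier.
    return text
```

Running each pasted line through a helper like this before adding it to the list would keep the list in the "just blah" form.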
|
06:57
🔗
|
Coderjoe |
I know of the old via red boxes |
|
06:58
🔗
|
chronomex |
oh. thumper. |
|
06:58
🔗
|
ersi |
thump thump |
|
06:59
🔗
|
Coderjoe |
thumbnail stage complete. thumbnail on the detail page looks a heck of a lot better |
|
06:59
🔗
|
underscor |
I actually have our "display" rack right next to me |
|
07:00
🔗
|
underscor |
guarded by sharkive |
|
07:00
🔗
|
underscor |
http://i.imgur.com/FizaK.jpg |
|
07:00
🔗
|
underscor |
(which is our remote controlled mylar balloon) |
|
07:01
🔗
|
chronomex |
nice photo |
|
07:01
🔗
|
underscor |
haha |
|
07:01
🔗
|
underscor |
it's my shit camera phone |
|
07:02
🔗
|
chronomex |
it could do with some aiming next time |
|
07:02
🔗
|
underscor |
I was taking it blind |
|
07:02
🔗
|
underscor |
hard to get a good angle |
|
07:03
🔗
|
godane |
i believe i got all of them |
|
07:03
🔗
|
underscor |
ok |
|
07:03
🔗
|
godane |
these only have mp3 problems |
|
07:03
🔗
|
underscor |
kicking it off shortly |
|
07:03
🔗
|
underscor |
All of them only need the mp3s rederived? |
|
07:03
🔗
|
underscor |
(cause I can set that) |
|
07:04
🔗
|
godane |
also i got this i want to add to shareware cds: http://archive.org/details/cdrom-3d-world-119 |
|
07:04
🔗
|
underscor |
I can't do that |
|
07:04
🔗
|
godane |
and this: http://archive.org/details/cdrom-3d-world-150 |
|
07:04
🔗
|
underscor |
jason will be mad |
|
07:04
🔗
|
godane |
ok |
|
07:04
🔗
|
underscor |
queuing the derives is something within my scope |
|
07:04
🔗
|
godane |
ok |
|
07:04
🔗
|
godane |
sorry |
|
07:05
🔗
|
underscor |
hacking the permissions system, while possible, is not in my job description |
|
07:05
🔗
|
underscor |
and the last thing I want is more talkings-to from SketchCow |
|
07:05
🔗
|
underscor |
np :) |
|
07:05
🔗
|
underscor |
just want you to know why |
|
07:05
🔗
|
chronomex |
oh really, I bet I could come up with stuff that you're less interested in |
|
07:06
🔗
|
godane |
i maybe getting a bigger hard drive soon |
|
07:06
🔗
|
underscor |
The derive thing isn't really a circumvention. Moving collections is me going in and changing items and manually updating mysql tables. Lots of room for error and fuckery, so better to let a superadmin take care of it |
|
07:06
🔗
|
godane |
i hope |
|
07:06
🔗
|
godane |
ok |
|
07:06
🔗
|
underscor |
(where they have access to the easy web page that just works (tm) |
|
07:06
🔗
|
underscor |
) |
|
07:06
🔗
|
chronomex |
plus, circumventing perms would not create the correct audit trail |
|
07:06
🔗
|
underscor |
^ |
|
07:06
🔗
|
chronomex |
all kinds of wrong |
|
07:07
🔗
|
underscor |
which is why I'm not doing it. I'm slowly learning. |
|
07:07
🔗
|
godane |
me too |
|
07:07
🔗
|
underscor |
SketchCow's lessons aren't all going to waste |
|
07:07
🔗
|
chronomex |
that's reassuring |
|
07:07
🔗
|
godane |
of course i'm backing up like everything |
|
07:07
🔗
|
underscor |
fuck it, doing a full rederive of all those items |
|
07:07
🔗
|
underscor |
The other derivatives may be fucked |
|
07:07
🔗
|
underscor |
and I don't feel like dealing with it |
|
07:07
🔗
|
godane |
agree |
|
07:08
🔗
|
chronomex |
sounds reasonable |
|
07:08
🔗
|
godane |
but i played with most of them and most of the videos are fine |
|
07:08
🔗
|
underscor |
So I have a script that's supposed to list the statuses of torrents in deluge |
|
07:08
🔗
|
underscor |
and this is what it just gave me |
|
07:08
🔗
|
underscor |
1 Jeremy |
|
07:08
🔗
|
underscor |
325 Seeding |
|
07:08
🔗
|
underscor |
63 Downloading |
|
07:08
🔗
|
underscor |
hahahahaha |
|
07:09
🔗
|
underscor |
something's obviously borked there |
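The log does not show the script, so the cause of the stray "Jeremy" and "Holy" rows is unknown. One plausible way a word from a torrent name could end up in a status tally is counting by word position instead of by labelled field. The sketch below counts only the first word on lines that start with `State:`. It assumes the input resembles `deluge-console info` output with one `State:` line per torrent; the sample text and torrent names are invented for illustration.

```python
from collections import Counter


def count_states(info_text):
    """Tally torrent states, reading only lines labelled 'State:'."""
    counts = Counter()
    for line in info_text.splitlines():
        line = line.strip()
        if line.startswith("State:"):
            # The state is the first word after the label; the rest of the
            # line (speeds and so on) is ignored.
            fields = line[len("State:"):].split()
            if fields:
                counts[fields[0]] += 1
    return counts


# Invented sample in the assumed format, not real output.
SAMPLE = """\
Name: Holy example torrent
ID: 0123
State: Downloading Down Speed: 10.0 KiB/s
Name: Jeremy example torrent
ID: 4567
State: Seeding Up Speed: 0.0 KiB/s
"""

if __name__ == "__main__":
    for state, n in count_states(SAMPLE).items():
        print(n, state)
```

Because names are never inspected, a torrent called "Jeremy ..." cannot show up as a state under this approach.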
|
07:10
🔗
|
underscor |
ok, rederives queued |
|
07:10
🔗
|
underscor |
Priority 0 tasks |
|
07:10
🔗
|
underscor |
So they should start immediately. |
|
07:10
🔗
|
underscor |
All running. |
|
07:11
🔗
|
underscor |
holy fuck |
|
07:11
🔗
|
underscor |
http://archive.org/details/diggnation |
|
07:11
🔗
|
underscor |
that should be a collection with each video in an item |
|
07:11
🔗
|
underscor |
goddamn, Famicoman hehehe |
|
07:11
🔗
|
underscor |
http://archive.org/catalog.php?history=1&identifier=diggnation |
|
07:11
🔗
|
underscor |
21 day derive hahahahaha |
|
07:12
🔗
|
underscor |
that poor worker |
|
07:16
🔗
|
ersi |
underscor: So you got one torrent which is Jeremying? |
|
07:16
🔗
|
ersi |
Pretty awesome if you ask me |
|
07:16
🔗
|
underscor |
:D |
|
07:17
🔗
|
underscor |
and on the other host |
|
07:17
🔗
|
underscor |
1 Holy |
|
07:17
🔗
|
underscor |
51 Seeding |
|
07:17
🔗
|
underscor |
701 Downloading |
|
07:17
🔗
|
underscor |
must be a copy of the bible or something |
|
07:21
🔗
|
godane |
underscor: Thats what Famicoman does |
|
07:21
🔗
|
godane |
its meant to make it one stop shop |
|
07:22
🔗
|
underscor |
yeah, but it breaks our system |
|
07:22
🔗
|
Coderjoe |
i was told to do one item per video for the stage6 collection |
|
07:22
🔗
|
Coderjoe |
most likely for this (and related) reason(s) |
|
07:23
🔗
|
Coderjoe |
plus, darking one video is easier when each is a separate item |
|
07:23
🔗
|
godane |
some of the rev3 stuff it was ok i think |
|
07:23
🔗
|
underscor |
I dunno |
|
07:23
🔗
|
underscor |
Not my call |
|
07:23
🔗
|
godane |
like unboxing porn |
|
07:23
🔗
|
godane |
http://archive.org/details/unboxingporn |
|
07:23
🔗
|
underscor |
But I'll bring it up tomorrow when alexis is back. |
|
07:24
🔗
|
godane |
i would have preferred diggnation by year |
|
07:24
🔗
|
godane |
just so it would be easier on archive.org |
|
07:26
🔗
|
Coderjoe |
sweet jesus that item is wrong on so many levels. on the plus side, it shouldn't have a need to add new files to it (and thus possibly triggering another couple month long derive) |
|
07:27
🔗
|
Coderjoe |
(the diggnation one is the one I am talking about) |
|
07:28
🔗
|
Coderjoe |
over 400 episodes in one item. I don't think the system was intended to work like this. |
|
07:29
🔗
|
Coderjoe |
and I don't feel like writing the xml parsing crap I would need to do in order to sum up the original file sizes. I suspect this item is much larger than desired, even before derivatives |
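The XML parsing Coderjoe mentions can be fairly short. A minimal sketch, assuming the item's file list is shaped like `<files><file name="..." source="original"><size>N</size></file>...</files>` with sizes in bytes; the function name and the sample data are hypothetical, and the real file list should be checked against that layout before trusting the total.

```python
import xml.etree.ElementTree as ET


def sum_original_sizes(xml_text):
    """Sum the <size> of every <file> whose source attribute is 'original'."""
    root = ET.fromstring(xml_text)
    total = 0
    for f in root.iter("file"):
        if f.get("source") != "original":
            continue  # skip derivatives and metadata files
        size = f.findtext("size")
        if size and size.strip().isdigit():
            total += int(size)
    return total


# Invented sample in the assumed layout, not taken from a real item.
SAMPLE = """<files>
  <file name="ep001.mp4" source="original"><size>1000</size></file>
  <file name="ep001.ogv" source="derivative"><size>400</size></file>
  <file name="ep002.mp4" source="original"><size>2500</size></file>
</files>"""

if __name__ == "__main__":
    print(sum_original_sizes(SAMPLE), "bytes of originals")
```

Entries without a numeric size are skipped rather than guessed, so the result is a lower bound if the list is incomplete.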
|
07:30
🔗
|
Coderjoe |
woopwoop? |
|
07:30
🔗
|
underscor |
size: 236,398,442 KB |
|
07:30
🔗
|
underscor |
yowch |
|
07:32
🔗
|
godane |
i'm not doing it with dl.tv and crankygeeks luckily |
|
07:32
🔗
|
Coderjoe |
oh yeah. I forgot about that method |
|
07:32
🔗
|
godane |
dl.tv is like 50gb of video |
|
07:32
🔗
|
godane |
crankygeeks is like 30gb |
|
07:37
🔗
|
underscor |
I am off to bed |
|
07:38
🔗
|
underscor |
please PM me if any of those derives explode |
|
07:38
🔗
|
underscor |
rather |
|
07:38
🔗
|
underscor |
email me |
|
07:38
🔗
|
underscor |
abuie@archive.org |
|
07:38
🔗
|
Coderjoe |
oh god! a derive exploded. I have pieces of mpeg2 file stuck in my leg! |
|
07:38
🔗
|
mutoso |
xD |
|
07:38
🔗
|
Coderjoe |
s/pieces/bits/ ? |
|
07:39
🔗
|
mutoso |
Yeah, better. |
|
07:39
🔗
|
Coderjoe |
sleep sounds like an excellent idea |
|
07:40
🔗
|
underscor |
have to get up early tomorrow |
|
07:40
🔗
|
underscor |
going to redwood city with mario to work on the ia7* datacenter |
|
07:40
🔗
|
underscor |
replace disks, rack new hardware, install switches, etc |
|
07:40
🔗
|
underscor |
\o/ |
|
07:40
🔗
|
Coderjoe |
sounds fun |
|
07:41
🔗
|
underscor |
For me it is. |
|
07:41
🔗
|
underscor |
Probably not a lot of people |
|
07:41
🔗
|
underscor |
haha |
|
07:43
🔗
|
chronomex |
wtf, rsync, why aren't you happy |
|
07:43
🔗
|
chronomex |
it found and skipped the already transferred gigabytes, but seems to have hung |
|
07:44
🔗
|
chronomex |
(this is for yet another multigigabyte memac user) |
|
07:44
🔗
|
Deewiant |
It should continue at some point, I don't know what it's doing there |
|
07:44
🔗
|
Deewiant |
It's happened to me, too; eventually they finished |
|
07:45
🔗
|
chronomex |
eventually = ? |
|
07:46
🔗
|
Deewiant |
By the next day? :-P Dunno, didn't watch them, just noticed them getting stuck |
|
07:47
🔗
|
chronomex |
well, shit, okay, I'll just have to deal with my connection not getting fucked |
|
07:47
🔗
|
chronomex |
hah |
|
07:47
🔗
|
Deewiant |
It might only be some minutes, I really didn't pay attention |
|
07:47
🔗
|
chronomex |
ah, there it goes |
|
07:53
🔗
|
omf_ |
I am interested in helping out with the fanfiction.net archiving. The other channel is dead right now. |
|
07:56
🔗
|
bsmith093 |
omf_: we're pretty much done with that for now. there's presumably some project to do a continuous scrape to stay on top of it, but, we've probably got everything before the purge hit |
|
07:56
🔗
|
bsmith093 |
stupid admins finally decided to enforce a long ignored content rating rule, and start deleting stories |
|
07:57
🔗
|
omf_ |
aah |
|
07:57
🔗
|
bsmith093 |
to clarify, THE SITE ITSELF IS NOT GOING ANYWHERE!!! anytime soon, that I know of, I just thought it would be a good idea to grab it, cause it's the biggest |
|
07:58
🔗
|
bsmith093 |
at the time i wasnt even aware of the upcoming purge |
|
07:58
🔗
|
omf_ |
yeah I was interested in it to do some nlp work |
|
07:58
🔗
|
bsmith093 |
nlp? |
|
07:58
🔗
|
omf_ |
natural language processing |
|
07:58
🔗
|
omf_ |
it is hard to find large chunks of modern text |
|
07:58
🔗
|
bsmith093 |
ooooooh, Google would love that! |
|
07:58
🔗
|
bsmith093 |
project gutenberg, and wikipedia work well? |
|
07:59
🔗
|
chronomex |
pg is not really modern |
|
07:59
🔗
|
omf_ |
project gutenberg has a format consistency problem |
|
07:59
🔗
|
omf_ |
and wikipedia is fine for non-fiction writing |
|
08:00
🔗
|
omf_ |
Well onto my next idea. I am going to update the deathwatch page for berlios.de and delicious |
|
08:00
🔗
|
bsmith093 |
ive got dozens of stories in a consistent format downloaded off fanfiction.net for my own library, by a tool i got from google code |
|
08:01
🔗
|
bsmith093 |
also you might want to add dead dying damned to the dead column :) |
|
08:01
🔗
|
omf_ |
berlios got saved and the new delicious does not have any of the old content |
|
08:03
🔗
|
bsmith093 |
anyway i have 1705 stories, saved in text format, ( _italics_ *bold*) with metadata at the beginning of each file, 281mb if you want it |
|
08:03
🔗
|
bsmith093 |
the archive we saved will be up soon, presumably, ask SketchCow |
|
08:08
🔗
|
Cameron_D |
I'm unable to edit wiki pages :/ "Call to undefined method Article::getSection()" |
|
08:08
🔗
|
ersi |
That's a feature, not a bug |
|
08:09
🔗
|
ersi |
nah, but SketchCow has said the wiki needs some techlove ^ |
|
08:10
🔗
|
Cameron_D |
Well I suppose the feature stops the bots |
|
08:45
🔗
|
omf_ |
Are there any criteria for adding sites to "dead as a doornail" other than it being dead |
|
08:47
🔗
|
godane |
uploading screen savers from may 2004 |
|
09:01
🔗
|
omf_ |
So what projects need help? The apple one sounds like it is almost complete |
|
09:09
🔗
|
chronomex |
omf_: well, it's probably best to stick to sites that archiveteam has or would have archived. |
|
09:09
🔗
|
chronomex |
re: the wiki |
|
09:10
🔗
|
omf_ |
chronomex, that is what I mean. I only have one example gameart.org It had almost 10 years worth of gaming art |
|
09:10
🔗
|
omf_ |
then it went down |
|
09:11
🔗
|
chronomex |
ok |
|
09:11
🔗
|
chronomex |
/me zzzz |
|
09:13
🔗
|
omf_ |
I have been getting more and more into big data projects so I am excited at what is out there. |
|
09:29
🔗
|
omf_ |
I saw a note on the site about doing comparisons to remove duplicate images from the geocities data. Anyone know what the status of that is? |
|
09:37
🔗
|
omf_ |
Also the tracker for FortuneCity is offline so I cannot find out the status of the archiving that took place before it closed earlier this year. |
|
09:44
🔗
|
ersi |
omf_: What information were you looking for? At the FoCity tracker I mean |
|
10:22
🔗
|
omf_ |
ersi, just to see if the project got completed or things were missed. I had a few sets of pages on there I wouldn't mind seeing again |
|
10:29
🔗
|
ersi |
omf_: AFAIK all of the users/pages that we were able to crawl were downloaded. That does however not say anything about the completeness in general I guess. |
|
10:30
🔗
|
ersi |
alard: Know if we got the dataset of all user/url's left anywhere? |
|
10:36
🔗
|
alard |
http://archive.org/details/archiveteam-fortunecity-list |
|
10:37
🔗
|
alard |
and http://archive.org/details/archiveteam-fortunecity |
|
10:38
🔗
|
alard |
omf_/ersi: and http://archive.org/download/test-memac-index-test/fortunecity.html |
|
10:40
🔗
|
omf_ |
that is cool |
|
10:42
🔗
|
ersi |
Ah, yeah - that was what I was looking for. |
|
10:42
🔗
|
alard |
Good things come to those who search. :) |
|
10:42
🔗
|
ersi |
:D |
|
11:24
🔗
|
omf_ |
Is there a mailing list to follow or is an ear to the irc channel and an eye on the wiki the way to go. |
|
11:26
🔗
|
Famicoman |
underscor, you can kill the derive if that would help. |
|
11:33
🔗
|
ersi |
omf_: No mailing list, this is where the magic happens |
|
11:34
🔗
|
ersi |
and in the sub-channels of course (like #wikiteam, #urlteam) and project specific channels |
|
12:34
🔗
|
omf_ |
Is there a sub-group for maintaining the wiki |
|
13:14
🔗
|
fLoo |
hi all |
|
13:14
🔗
|
fLoo |
just found your project and i love it |
|
13:14
🔗
|
fLoo |
:) |
|
13:21
🔗
|
fLoo |
is there a tool for windows available too ? |
|
13:22
🔗
|
fLoo |
so i can integrate some workstations in the process ? |
|
13:45
🔗
|
ersi |
fLoo: Yeah, there's the "Archiveteam Warrior" - which is a Virtual Machine you boot up and can help out with projects |
|
13:45
🔗
|
ersi |
Link to "AT-warrior" is in /topic |
|
13:46
🔗
|
fLoo |
already found it, thanks |
|
13:46
🔗
|
fLoo |
contributing 12 gbit now |
|
13:46
🔗
|
fLoo |
hope that helps |
|
13:48
🔗
|
ersi |
Whoa man |
|
13:48
🔗
|
ersi |
You at some university or something? |
|
13:49
🔗
|
fLoo |
nope |
|
13:49
🔗
|
fLoo |
datacenter |
|
13:49
🔗
|
fLoo |
got some nice machines here |
|
13:52
🔗
|
omf_ |
is anyone working on the usenet history dump? |
|
13:52
🔗
|
omf_ |
currently |
|
13:55
🔗
|
omf_ |
I just remembered about this: http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html |
|
13:56
🔗
|
omf_ |
which is 28M posts collected from 2005-2009 |
|
13:56
🔗
|
fLoo |
is there a way to check how many packages at all + gigabytes i uploaded ? |
|
13:57
🔗
|
omf_ |
it is 40gb |
|
14:06
🔗
|
fLoo |
mmmh |
|
14:06
🔗
|
fLoo |
i dont find anything to see how much i contributed |
|
14:06
🔗
|
fLoo |
:( |
|
14:08
🔗
|
fLoo |
gogo guys |
|
14:17
🔗
|
fLoo |
ersi: http://memac.heroku.com/ looks good |
|
14:17
🔗
|
fLoo |
:) |
|
14:21
🔗
|
DFJustin |
hmm looks like they stripped a lot of stuff such that it's useful as a corpus but not so much for history |
|
14:21
🔗
|
mistym |
DFJustin: What's that? |
|
14:22
🔗
|
DFJustin |
<omf_> I just remembered about this: http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html |
|
14:22
🔗
|
omf_ |
It is a starting point for that time |
|
14:22
🔗
|
mistym |
Thanks |
|
14:22
🔗
|
omf_ |
the old archives were easier to find |
|
14:23
🔗
|
omf_ |
it is going to be the more recent posts that are going to be the problem |
|
14:23
🔗
|
omf_ |
why did google have to buyout dejanews |
|
14:33
🔗
|
omf_ |
over 700 million posts in google groups. Subtract 1981-1991 and the last 5 years |
|
14:33
🔗
|
omf_ |
and then consider that not every post is a thread. |
|
14:34
🔗
|
omf_ |
it is still going to take some clever work |
|
14:46
🔗
|
mistym |
https://twitter.com/archiveteam/status/212618778968731649 So hey, what's this about? |
|
14:54
🔗
|
omf_ |
I found someone else trying to do a usenet archive like google. That at least eases the field a little |
|
15:18
🔗
|
DFJustin |
http://www.fortunecity.ws/ |
|
15:32
🔗
|
Schbirid |
http://www.heise.de/newsticker/meldung/Freiwillige-legen-Archiv-oeffentlicher-MobileMe-Daten-an-1626844.html |
|
15:33
🔗
|
ersi |
cool |
|
15:33
🔗
|
Schbirid |
the comments are full of angry trolling about how private data should be forbidden to be used "like this" |
|
15:33
🔗
|
Schbirid |
:D |
|
15:34
🔗
|
mistym |
You keep using that word. I do not think it means what you think it means. |
|
15:35
🔗
|
ersi |
Yeah yeah, people can whine as much as they want - what matters is that the data is not totally gone though :) |
|
15:35
🔗
|
Schbirid |
mistym: me? what word? |
|
15:36
🔗
|
DFJustin |
it's not trolling if they genuinely believe it |
|
15:36
🔗
|
mistym |
Schbirid: It's a movie quote. :V Was talking about "private" |
|
15:36
🔗
|
Schbirid |
right |
|
15:37
🔗
|
ersi |
DFJustin: What about fortunecity.ws? >_> |
|
15:38
🔗
|
fLoo |
Schbirid |
|
15:38
🔗
|
fLoo |
due to this article i came here |
|
15:38
🔗
|
fLoo |
and now i'm contributing ;) |
|
15:38
🔗
|
Schbirid |
sweet |
|
15:38
🔗
|
Schbirid |
welcome |
|
15:38
🔗
|
fLoo |
thanks |
|
15:39
🔗
|
DFJustin |
just noting its existence |
|
15:40
🔗
|
ersi |
DFJustin: ah, alright |
|
15:44
🔗
|
fLoo |
guys |
|
15:44
🔗
|
fLoo |
i just saw that there is a seesaw-s3 script |
|
15:44
🔗
|
fLoo |
which is perfect for my bandwidth |
|
15:44
🔗
|
fLoo |
but it requires access-tokens |
|
15:44
🔗
|
fLoo |
howto acquire them ? |
|
15:45
🔗
|
Schbirid |
fLoo: i asked about that a while ago and was told it was not worth using at the moment |
|
15:45
🔗
|
fLoo |
ok |
|
15:45
🔗
|
fLoo |
currently i run 20 instances of seesaw per machine |
|
15:45
🔗
|
fLoo |
still only 25 % bw used |
|
15:45
🔗
|
fLoo |
:( |
|
15:46
🔗
|
Schbirid |
well, stick around. the next bandwidth eating project will come |
|
15:46
🔗
|
alard |
fLoo: The seesaw-s3 script doesn't download any faster, it's just the upload that's different. |
|
15:46
🔗
|
fLoo |
ok, then its np |
|
15:46
🔗
|
alard |
The problem with the normal seesaw script is that everything ends up on one machine, which fills up. |
|
15:46
🔗
|
fLoo |
uploading is fine here with ~ 60-70 mbit per connection |
|
15:46
🔗
|
alard |
The seesaw-s3 script uploads directly to archive.org, so that was very helpful for the bulk downloaders. |
|
15:47
🔗
|
fLoo |
i understand, thanks for the information |
|
15:47
🔗
|
alard |
Thanks for helping! |
|
15:47
🔗
|
fLoo |
its a cool project |
|
15:47
🔗
|
fLoo |
:) |
|
15:47
🔗
|
fLoo |
is there a way to get my own statistics ? |
|
15:48
🔗
|
fLoo |
i mean my complete contribution-stats ? |
|
15:48
🔗
|
alard |
http://memac.heroku.com/ |
|
15:48
🔗
|
fLoo |
yea, but howto query for a single user ? |
|
15:48
🔗
|
fLoo |
AHHH |
|
15:49
🔗
|
fLoo |
there is a freaking '+' button |
|
15:49
🔗
|
fLoo |
seriously .. couldnt see it |
|
15:52
🔗
|
alard |
fortunecity.ws is a bit strange: pages full of disclaimers and pseudo-legal texts, but nothing at all about the nature of the site, the source of the data etc. |
|
15:54
🔗
|
ersi |
Indeed |
|
16:13
🔗
|
fLoo |
schbirid what do you use to crawl ? |
|
16:13
🔗
|
fLoo |
i dont think its your hansenet account ;) |
|
16:14
🔗
|
Schbirid |
ovh server(s) |
|
16:14
🔗
|
fLoo |
just saw it |
|
16:15
🔗
|
fLoo |
another question: why does archiveteam announce that we're finished with the project |
|
16:15
🔗
|
fLoo |
but there are still 20k missing |
|
16:15
🔗
|
Schbirid |
where was that announced? |
|
16:17
🔗
|
fLoo |
heise etc |
|
16:17
🔗
|
fLoo |
twitter |
|
16:17
🔗
|
DFJustin |
I think the deal is we've been through them all once now and this is the stuff that didn't work right on the first pass |
|
16:17
🔗
|
Schbirid |
oh, SketchCow tweeted that indeed https://twitter.com/archiveteam/status/217665111895191552 |
|
16:18
🔗
|
Schbirid |
fLoo: do you have old pc/game mag cover discs? rip them! |
|
16:19
🔗
|
DFJustin |
for example stuff was added to the script to handle infinitely recursive folders |
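The log does not say how the script handles infinitely recursive folders. One common guard, sketched here purely as an assumption, is to stop following a path once it gets too deep or once the same folder name repeats too often. The function name and both thresholds are hypothetical and are not taken from the seesaw script.

```python
from collections import Counter


def looks_recursive(path, max_repeats=3, max_depth=40):
    """Heuristic: flag paths that are very deep or repeat one segment many times."""
    segments = [s for s in path.split("/") if s]
    if len(segments) > max_depth:
        return True
    counts = Counter(segments)
    # A folder that links back into itself tends to produce the same
    # segment name over and over in the crawled path.
    return any(n > max_repeats for n in counts.values())
```

A crawler would call this before queueing a folder and skip anything flagged. The heuristic can misfire on legitimately deep trees, so the thresholds would need tuning against real data.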
|
16:19
🔗
|
fLoo |
Schbirid: lol why ? |
|
16:20
🔗
|
Schbirid |
fLoo: to archive them at archive.org |
|
16:20
🔗
|
fLoo |
gonna see what i still got |
|
16:20
🔗
|
mistym |
So hey, speaking of tweets, who was this / what is this about? https://twitter.com/archiveteam/status/212618778968731649 |
|
16:20
🔗
|
fLoo |
playstation 1 games too ? |
|
16:21
🔗
|
DFJustin |
fLoo: one of our side projects is http://archive.org/details/cdbbsarchive |
|
16:22
🔗
|
DFJustin |
old magazine and shareware discs are a great source of stuff that isn't available online anymore |
|
16:22
🔗
|
Schbirid |
yeah |
|
16:23
🔗
|
DFJustin |
playstation I'm not sure because I think they need to be ripped using special procedures to be correct |
|
16:24
🔗
|
mistym |
Playstation is not exactly endangered material either. Shareware is much more ephemeral. |
|
16:26
🔗
|
Schbirid |
playstation demo discs had weird demos on them |
|
16:29
🔗
|
mistym |
Oh yeah, demo discs, that's true. |
|
16:30
🔗
|
mistym |
I know a guy who spent years tracking down a Japanese demo disc with the only playable copy of an obscure cancelled game he was obsessed with. |
|
16:30
🔗
|
Nemo_bis |
did he find it? |
|
16:30
🔗
|
mistym |
He did! |
|
16:30
🔗
|
Nemo_bis |
so it's already on archive.org isn't it? :) |
|
16:30
🔗
|
mistym |
Turns out: pretty good game. Even the demo is very incomplete though. |
|
16:31
🔗
|
mistym |
Amazingly - no. It is not. |
|
16:31
🔗
|
mistym |
Maybe it should be? |
|
16:31
🔗
|
Nemo_bis |
very very bad |
|
16:31
🔗
|
Nemo_bis |
of course it must |
|
16:32
🔗
|
DFJustin |
I put a whole bunch of sega saturn demo discs in there but I haven't looked at other systems yet |
|
16:37
🔗
|
mistym |
saturn, yay! |
|
16:39
🔗
|
DFJustin |
http://archive.org/search.php?query=sega%20saturn%20AND%20collection%3Acdbbsarchive |
|
16:39
🔗
|
DFJustin |
there's apparently a couple hundred that were made though |
|
16:40
🔗
|
mistym |
I wonder if the Panzer Azel is like the UK magazine demo, which was just the whole first disc |
|
17:46
🔗
|
Famicoman |
I should learn how to burn saturn and sega cd discs |
|
18:18
🔗
|
DFJustin |
I don't think they have any copy protection |
|
18:23
🔗
|
mistym |
Famicoman, DFJustin: Saturn has copy protection, need a swap trick or modded system to play. Sega CD has no protection |
|
18:52
🔗
|
godane |
may 2004 screen savers is uploaded: http://archive.org/details/TechTV_TSS_2004_05_Full_Episodes |
|
19:41
🔗
|
fLoo |
mmh |
|
19:41
🔗
|
fLoo |
why isnt the list updated ? |
|
21:58
🔗
|
SketchCow |
Morning. |
|
21:58
🔗
|
SketchCow |
Or afternoon. |
|
21:58
🔗
|
SketchCow |
Who rocked the opening keynote? This guy. |
|
21:58
🔗
|
SketchCow |
What keynote video becomes available this evening? That one. |
|
21:58
🔗
|
SketchCow |
The guy recording the keynote videos is amazing. |
|
21:58
🔗
|
chronomex |
fuqyea, you talked at google IO! |
|
22:06
🔗
|
SketchCow |
https://www.youtube.com/watch?v=_OczqFEcUTA#t=6m1s |
|
22:09
🔗
|
Nemo_bis |
fLoo, what list? |