#archiveteam 2012-05-05,Sat

Time Nickname Message
02:09 Wyatt|Wor Found another mobileme user recursing on homepage.mac.com
02:11 Wyatt|Wor Username is tauran; haven't looked at why; busy with the CGI thing.
02:41 Zuu- Hello
02:42 Zuu- Was someone here archiving revision 3 shows?
04:59 Coderjoe mm
05:00 Coderjoe a bunch of curl connection failures. i suspect either my network or the s3 endpoint hiccupped
05:35 chronomex christ, mobileme user has been uploading for a day now
05:47 Wyatt|Wor chronomex: How big is it?
05:47 Wyatt|Wor (And it's not something that got into a big loop or mirrored some other users?)
05:48 chronomex -rw-r--r-- 1 duncan duncan 2.3G Nov 3 11:48 data/c/cr/cra/craig.schmidt/public.me.com/public.me.com-craig.schmidt.warc.gz
05:49 Wyatt|Wor Ah, I see. Sorry, thinking in terms of 100Mb connections. ^^;;
11:09 Schbirid shaqfu: ah, shame. :)
11:09 Schbirid runs pretty well here
11:11 Schbirid how could we get the fileplanet stuff nicely to archive.org? i guess it would be 100k+ items.
11:11 Schbirid so it might be a bad idea to upload them individually
11:22 Schbirid so far i have an average of <3MB per item. but then i am still <50000 and they will just get bigger and bigger
12:07 Schbirid http://www.gamefront.com/breaking-ign-to-close-fileplanet/
12:07 Schbirid "We have decided to archive FilePlanet and will eventually stop operating the site"
12:07 Schbirid so "archived" is a misleading term
12:08 Schbirid “While the site will no longer be updated,” IGN told us, “for now users can still use the site as a repository of file content. If/when we remove all site content completely, we’ll be sure to communicate that to users before it happens.”
12:29 Nemo_bis Schbirid, why will they get bigger and bigger? you're downloading them chronologically and recent files are bigger?
12:29 Nemo_bis 10k files per item should be ok anyway
12:31 Nemo_bis soo, I'm at about 1700 wikis downloaded for #wikiteam, but nobody is working on the uploading script
12:32 Nemo_bis aka https://code.google.com/p/wikiteam/source/browse/trunk/uploader.py
12:38 Schbirid yeah, i go IDs upwards and that means later files are bigger (guessing but i am 100% sure)
12:39 Nemo_bis yep
12:40 Schbirid ok, 10k parts sounds like a good idea
12:40 Nemo_bis tarred maybe
12:41 SketchCow Hey, people.
12:41 SketchCow Anything I need to know about?
12:42 Nemo_bis That I'm eager to see ISO images flooding archive.org?
12:42 Schbirid tar would mean that the files would not be accessible easily
12:43 Nemo_bis Well, if the tar archive is under 5-10 GB there's the tar viewer.
12:43 Nemo_bis But if you have to load an item description with 10k elements the HTML will take forever.
12:43 Schbirid tar viewer! i never heard of that
12:43 Schbirid oh, that is true
12:43 Nemo_bis Probably it's better to download everything and then ask SketchCow with real data
12:44 Nemo_bis take the /download link and add a / at the end of the URL
12:44 Schbirid is that indexed by crawlers?
12:45 Nemo_bis only if you put links I guess; or maybe not even in that case because there's nofollow even for internal links?
12:45 Nemo_bis anyway, for instance: http://archive.org/download/mobileme-hero-1335947007/mobileme-full-1335947007.tar/
12:46 dnova that is awesome
12:46 Schbirid nice
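
The tar viewer tip above (take the /download link and add a / at the end) could be sketched roughly as follows. This is only an illustration: the helper name tar_listing_url is made up for this note, and the only URL shape assumed is the one Nemo_bis pasted.

```python
def tar_listing_url(download_url: str) -> str:
    """Turn an archive.org /download link to a .tar into its tar viewer URL.

    Hypothetical helper: it only encodes the tip from the log, i.e. the
    viewer URL is the /download link with a trailing slash appended.
    """
    if "/download/" not in download_url:
        raise ValueError("expected an archive.org /download/ link")
    # Appending the slash is the whole trick; avoid adding it twice.
    if download_url.endswith("/"):
        return download_url
    return download_url + "/"


# The example item quoted in the log, without the trailing slash.
example = (
    "http://archive.org/download/"
    "mobileme-hero-1335947007/mobileme-full-1335947007.tar"
)
print(tar_listing_url(example))
```

Per the log, this only helps when the tar archive is under roughly 5-10 GB.
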
12:47 Coderjoe ugh
12:47 Coderjoe I fear I am uploading a bunch of tv show episodes to IA now >_<
12:47 SketchCow I do agree that fos is getting a little slow.
12:48 SketchCow Not sure why.
12:57 SketchCow I'm cleaning up uploaded mobileme sets right now.
12:57 Nemo_bis wow
12:57 Nemo_bis what about splinder?
12:57 SketchCow What about it?
12:57 Nemo_bis is fos slowed down by the splinder tidying up?
12:57 SketchCow No, no.
12:57 Nemo_bis or did it finish
12:57 SketchCow It's at a halt point, has been.
13:03 SketchCow Just verified and removed 1.7tb of mobileme from the machine.
13:04 SketchCow Another 2tb is being uploaded now.
18:03 chronomex e.g. by me
18:07 emijrp do you know how to change the spotlight item (on the left sidebar)? http://archive.org/details/spanishrevolution
18:46 chronomex there's either a thing in the metadata for the collection or the item
18:47 chronomex I think the collection
20:27 jojo56 hello everyone
20:38 Schbirid hi
20:39 jojo56 why
20:49 shaqfu ...why?
20:49 Schbirid WHY!
20:52 shaqfu Schbirid: Is there any useful metadata coming down with the FP files?
20:52 Schbirid yeah, i save the download page too
20:52 Schbirid and the url the file comes from
20:52 shaqfu Awesome
20:52 Schbirid eg http://www.fileplanet.com/224884/download
20:53 Schbirid has a full title "Gas Guzzlers: Combat Carnage Beta Client"
20:53 Schbirid and their category "Home / Gaming / RPG / Massively Multiplayer / Gas Guzzlers: Combat Carnage / Game Clients"
20:54 Schbirid perfect would be to save http://www.fileplanet.com/224884/220000/fileinfo/Gas-Guzzlers:-Combat-Carnage-Beta-Client too i guess. but i could not find a way to easily find those URLs so i just do the numeric id increments
20:54 Schbirid their older files have informative download urls like http://download.direct2drive.com/ftp2/bgchronicles/agportraits/vance/celeb.zip
20:55 shaqfu So long as the script grabs the page with basic metadata, it should be good
20:55 Schbirid yeah
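
The numeric id increments Schbirid describes, combined with the "10k parts" idea from earlier in the log, might look something like the sketch below. It is not Schbirid's actual script. The function names are hypothetical, the URL pattern is copied from the single example in the log, and grouping by ID range (rather than by count of files that actually exist) is an assumption. The code only builds URLs and makes no requests.

```python
from typing import Iterator, Tuple

BASE = "http://www.fileplanet.com"  # host as it appears in the log
BATCH_SIZE = 10_000  # "10k parts"; batching by ID range is an assumption


def download_page_url(file_id: int) -> str:
    """Build the download-page URL for one numeric file ID."""
    return f"{BASE}/{file_id}/download"


def batch_bounds(file_id: int, size: int = BATCH_SIZE) -> Tuple[int, int]:
    """Return the inclusive ID range of the batch this ID falls into."""
    start = (file_id // size) * size
    return start, start + size - 1


def walk_ids(first: int, last: int) -> Iterator[Tuple[int, str, Tuple[int, int]]]:
    """Walk IDs upward, yielding (id, download page URL, batch range)."""
    for file_id in range(first, last + 1):
        yield file_id, download_page_url(file_id), batch_bounds(file_id)


for file_id, url, bounds in walk_ids(224884, 224886):
    print(file_id, url, bounds)
```

Because IDs with no file behind them still take up a slot, a range-based batch can hold fewer than 10k files; counting saved files instead would give even batch sizes at the cost of less predictable boundaries.
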
