#archiveteam 2012-05-07,Mon


Time Nickname Message
01:02 🔗 pronoiac Nuts. I was looking into a problem from yesterday, and the server now incorrectly believes that one's done - it's tauran, which Wyatt|Wor had problems with.
07:44 🔗 ersi Coderjoe: ah, well - I lost the conversation in the backlog so I just thought you were asking what it meant :)
07:56 🔗 Nemo_bis does the archive.org flash/javascript interface use chunked uploading?
11:58 🔗 Schbirid i can never remember how to redirect stderr to devnull
11:58 🔗 Schbirid 2>/dev/null
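For reference, a minimal sketch of the `2>/dev/null` redirect mentioned above. The `noisy` function is a hypothetical stand-in for any command that writes to both streams; it is not part of the fileplanet script.

```shell
# Hypothetical helper, for illustration only: writes one line to each stream.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

# 2>/dev/null discards file descriptor 2 (stderr) only;
# stdout is untouched and is still captured by $(...).
out=$(noisy 2>/dev/null)
echo "$out"
```

To silence both streams, `>/dev/null 2>&1` is the usual portable form.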
12:01 🔗 Schbirid https://github.com/SpiritQuaddicted/fileplanet-file-download/blob/master/download_pages_and_files_from_fileplanet.sh is much nicer now
12:12 🔗 Schbirid http://www.pastie.org/3867284
12:15 🔗 Schbirid working well
12:35 🔗 ersi Schbirid: that's not #AT bizniz
12:36 🔗 ersi IMO
12:46 🔗 underscor Schbirid: need more people to help download fileplanet?
12:46 🔗 Schbirid yes, definitely. i am just deciding on the final packaging then i would have asked
12:46 🔗 Schbirid how big can one archive.org item become?
12:48 🔗 ersi AFAIK it can be any size
12:49 🔗 ersi preferably it should be smaller though
12:58 🔗 Schbirid argh, got a bug with 's
13:04 🔗 Schbirid i am too dumb to figure out how to remove the last character from a string in bash or gnu coreutils
13:07 🔗 Schbirid | rev | cut -c 2- | rev
13:07 🔗 Schbirid heh
13:07 🔗 Schbirid well, why not
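The `rev | cut -c 2- | rev` pipeline above works; as a sketch of two common alternatives, shell parameter expansion and `sed` can each drop the last character in a single step. The sample filename is hypothetical, chosen to mimic a trailing single quote.

```shell
# Hypothetical value with a stray trailing single quote.
name="example.zip'"

# ${var%?} removes the shortest suffix matching "?" (any one character).
trimmed=${name%?}
echo "$trimmed"

# Equivalent inside a pipeline: sed deletes the final character of each line.
via_sed=$(printf '%s\n' "$name" | sed 's/.$//')
echo "$via_sed"
```

The parameter expansion avoids spawning any extra processes, which may matter inside a loop over many filenames.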
13:37 🔗 underscor Schbirid: We aim for 10GB
13:37 🔗 underscor bigger than that and you can run into task issues, as there is only ~10GB guaranteed to be free on a datanode drive at any point
14:23 🔗 Schbirid hm, anyone able to download http://www.fileplanet.com/52249/download ?
14:24 🔗 Schbirid i always get a 403 forbidden
14:25 🔗 DFJustin same
14:26 🔗 Schbirid please refresh, check the source for the link (grep for default-file-download-link) and try pasting that into your address bar
14:27 🔗 DFJustin same
14:28 🔗 Schbirid cheers
14:28 🔗 Schbirid (i like how they have single quotes in filenames and use single quotes in their javascript as well)
14:29 🔗 DFJustin it's available at http://www.gamefront.com/files/13625/GRIST_MILLBY
14:30 🔗 Schbirid 50000-54999 is 24G already, ugh
15:16 🔗 Schbirid 20k-30k: ~7-8GB
15:16 🔗 Schbirid 30k-40k: 10GB
15:16 🔗 Schbirid 40k-50k: 18G
15:16 🔗 Schbirid 50k-55k: 25G
15:17 🔗 Schbirid i am scared. might mean that we'd need to do 1-2k increments. the end would be at 250k or something
15:17 🔗 Schbirid bbl
15:22 🔗 Nemo_bis Schbirid, put 5k files per item then
15:37 🔗 chronomex don't be scared!
16:36 🔗 codebear mobileme news: http://arstechnica.com/apple/news/2012/05/free-20gb-cloud-storage-for-mobileme-subscribers-extended-to-sept-30.ars
16:40 🔗 DFJustin http://archive.org/post/419499/chumbycom-is-going-away-request-for-archiving
16:44 🔗 yipdw oh neat
16:44 🔗 yipdw Github added organization display to user profiles
16:44 🔗 yipdw Archive Team needs a snazzy gravatar now
16:44 🔗 yipdw maybe we can reuse the unicorn one
16:46 🔗 yipdw (for those who don't know: http://archiveteam.org/images/0/05/Rejectedatlogo.jpg)
16:47 🔗 mistym I vote yes!
16:48 🔗 mistym I still wish Github let you follow organizations.
17:07 🔗 Schbirid alright, who wants to download some fileplanet!
17:07 🔗 Schbirid right now you will need to tar it manually in the end
17:08 🔗 Schbirid i guess we will just upload each tar separately and have someone put them together into a collection?
17:18 🔗 Nemo_bis yes
17:18 🔗 * Nemo_bis is already downloading thousands of wikis
17:19 🔗 Schbirid ok
17:19 🔗 Schbirid https://github.com/SpiritQuaddicted/fileplanet-file-download/blob/master/download_pages_and_files_from_fileplanet.sh
17:20 🔗 Schbirid run: download_pages_and_files_from_fileplanet.sh 55000 59999
17:20 🔗 Schbirid will be about 30G i guess
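Since the script takes a start ID and an end ID, splitting the work among volunteers comes down to generating non-overlapping ranges. A minimal sketch follows; the `ranges` helper and the 5000 step are assumptions for illustration and are not part of the linked repository.

```shell
# Hypothetical helper: print "start end" pairs covering first..last
# in chunks of at most `step` IDs each.
ranges() {
  first=$1
  last=$2
  step=$3
  start=$first
  while [ "$start" -le "$last" ]; do
    end=$((start + step - 1))
    # Clamp the final chunk so it never runs past the last ID.
    if [ "$end" -gt "$last" ]; then
      end=$last
    fi
    echo "$start $end"
    start=$((end + 1))
  done
}

# Each output line could be handed to one volunteer as the two
# arguments to download_pages_and_files_from_fileplanet.sh.
ranges 55000 64999 5000
```

A smaller step (for example 1000 or 2000) would keep each resulting tar closer to the roughly 10GB item size discussed earlier, at the cost of more items.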
17:28 🔗 Schbirid registering on the forums is still not possible? do we have a shared account i could use?
18:10 🔗 dnova Apple is extending its free storage offer to paid MobileMe subscribers from June 30 to September 30, 2012
18:11 🔗 yipdw http://venturebeat.com/2012/05/06/brazil-facebook-lies/ <-- fucking Black Mirror
18:16 🔗 yipdw also, props to whatever generated the URL slug, because it's totally apropos
18:46 🔗 Schbirid Nemo_bis: you running it?
18:47 🔗 Nemo_bis Schbirid, what?
18:48 🔗 Nemo_bis if you mean your script I'm not, as I said I'm already busy with wikis, load was at 60 a few min ago
18:48 🔗 Schbirid oh, i guess i misunderstood you
18:48 🔗 Nemo_bis 7 now but still no disk space :)
18:48 🔗 Schbirid heh
18:48 🔗 Schbirid nice
19:59 🔗 ersi The Swedish site http://www.resdagboken.se is closing down, from a press release by their owners (The large Norwegian(?) company/media conglomerate Schibsteds). The site is a "travel journey diary" for travelers, so it's mostly if not totally only user-made content
20:00 🔗 ersi Unsure if the content is going to be deleted, but.. if something's on its deathbed, most likely. They've disabled the ability to create new users/logins as well as new 'journey diaries'. But existing diaries can be updated until 15 June 2012
20:04 🔗 ersi There's at least 17 million images and 2 million "journey diaries" from users according to their stats in the press release
20:05 🔗 shaqfu Think it'll be better to sweep through now, or wait for last call?
20:06 🔗 ersi not sure, but earlier is always better
20:07 🔗 shaqfu Might be better to wait a bit - it looks like people are doing final entries now
20:09 🔗 ersi hm, true. But starting out finding users diaries and such might be good
20:19 🔗 shaqfu Doesn't look nicely structured, sadly
20:44 🔗 shaqfu http://archiveteam.org/index.php?title=Fileplanet
20:46 🔗 shaqfu Until account creation's back up, I'll probably give Schbirid my credentials to keep the page count updated
20:47 🔗 shaqfu Or it might be better not to, so someone's keeping track of all the downloads
21:51 🔗 underscor If someone sees Schbirid, tell him I have a list of all the valid ids
21:51 🔗 underscor it's much more efficient than brute-forcing every number between 1 and 220000
21:53 🔗 shaqfu underscor: Spiffy; how many are there?
21:53 🔗 underscor wc -l valid
21:53 🔗 underscor 87190 valid
21:53 🔗 * shaqfu updates the page
21:54 🔗 shaqfu Any estimates on size?
21:56 🔗 underscor nope
21:56 🔗 underscor I just extracted them from the sitemap XML files
21:56 🔗 shaqfu Gotcha
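A rough sketch of how a list of valid IDs might be pulled out of sitemap XML. The exact commands underscor used are not shown in the log, and the sample entries below are hypothetical: the sketch assumes each `<loc>` sits on its own line and uses the `/ID/download` URL shape seen earlier in the channel.

```shell
# Hypothetical sitemap fragment; the real files' layout is an assumption.
sample='<urlset>
<url><loc>http://www.fileplanet.com/52249/download</loc></url>
<url><loc>http://www.fileplanet.com/52250/download</loc></url>
<url><loc>http://www.fileplanet.com/52249/download</loc></url>
</urlset>'

# Keep only the numeric ID from each matching line, then sort
# numerically and drop duplicates.
valid=$(printf '%s\n' "$sample" |
  sed -n 's|.*fileplanet\.com/\([0-9][0-9]*\)/.*|\1|p' |
  sort -n -u)
printf '%s\n' "$valid"

# Same idea as "wc -l valid": count the distinct IDs found.
count=$(printf '%s\n' "$valid" | wc -l)
echo "$count"
```

Iterating over such a list skips every ID that does not exist, which is where the saving over trying each number in turn comes from.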
22:24 🔗 Nemo_bis underscor, email him?
23:56 🔗 SketchCow HUZZAH ARCHIVETEAM
23:56 🔗 dashcloud hello!
23:59 🔗 shaqfu Con go well?
