#archiveteam 2012-05-08,Tue


Time Nickname Message
00:02 πŸ”— SketchCow Yeah, I gave a good talk.
00:02 πŸ”— SketchCow And the con itself was good.
00:02 πŸ”— SketchCow I was pretty sick, though.
00:38 πŸ”— underscor SketchCow: was it recorded?
00:39 πŸ”— SketchCow All my talks are recorded.
00:39 πŸ”— SketchCow I'll have a copy up at some point.
00:39 πŸ”— SketchCow it's not a funny talk
00:44 πŸ”— shaqfu I'll catch it when it's up; the site doesn't even have a program anywhere right now
01:29 πŸ”— dnova are the mobileme grabs being stored on s3?
01:34 πŸ”— SketchCow Hard drives, mostly
01:34 πŸ”— SketchCow http://archive.org/details/mobileme-hero-1336159314 example
01:35 πŸ”— dnova ok
01:45 πŸ”— LordNlptp mobileme hero, 'zat some sort of video game where you smash files coming down 5 uplinks?
01:48 πŸ”— SketchCow Yes
02:19 πŸ”— godane looks like torrentfreak.com returns a lot of 405s
02:19 πŸ”— godane even when you can still download them through normal wget
05:19 πŸ”— ersi shaqfu: what, it's pretty logically built. /m/username and then /m/username/travel_diary/ ;o There's an "All diaries" list view in all user profiles, and an "All posts in diary" list view linked on all diary detail views
05:19 πŸ”— ersi going to look into it more today though
08:13 πŸ”— * SmileyG_ wibbles
08:13 πŸ”— SmileyG_ i left it running all weekend. Hopefully it got quite a bit done.
08:17 πŸ”— * SmileyG_ wibbles
08:58 πŸ”— ivan` $270/4TB http://www.amazon.com/Hitachi-Deskstar-0S03364-Frustration-Free-Packaging/dp/B005TEU2TQ/
12:04 πŸ”— Schbirid fileplanet is done until id 54999, http://www.quaddicted.com/forum/viewtopic.php?pid=251#p251
12:05 πŸ”— Nemo_bis Schbirid, did you get the id list from underscor?
12:05 πŸ”— Schbirid if you want to help, just pick a ~5k increment (x0000-x4999 or x5000-x9999).
12:05 πŸ”— Schbirid nope
12:05 πŸ”— Schbirid not really needed though
12:06 πŸ”— Schbirid since my script tries each id. and if it finds one, it immediately has a valid download link available
12:07 πŸ”— Schbirid after finishing a part, put the pages_ and files_ logs plus the www.fileplanet.com/ directory into a directory named after your start and end IDs and tar that, e.g. 40000-49999.tar
12:07 πŸ”— Nemo_bis ehm, I'm still busy with wikis
12:07 πŸ”— Schbirid just posting this for anyone ;)
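A minimal shell sketch of the packaging step Schbirid describes above, assuming the pages_/files_ logs and the www.fileplanet.com/ mirror sit in the current directory, with 40000-49999 as the example range:

    # bundle one finished slice: logs plus mirror dir, named after the ID range
    mkdir 40000-49999
    mv pages_* files_* www.fileplanet.com/ 40000-49999/
    tar -cf 40000-49999.tar 40000-49999/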
12:08 πŸ”— Schbirid hm, do i need to make an item before i can s3cmd add to it? or can i go full commandline?
12:08 πŸ”— Nemo_bis you can do everything from CL
12:09 πŸ”— Nemo_bis auto-make-bucket or so
12:09 πŸ”— Schbirid x-archive-auto-make-bucket:1
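A minimal curl sketch of the bucket-creating upload being worked out here, using the header Schbirid names; the access key, secret, and metadata values are placeholders, and the item name matches the one linked below:

    # PUT a tar to archive.org's S3-style endpoint, creating the item on the fly
    curl --location \
         --header "authorization: LOW $IA_ACCESS_KEY:$IA_SECRET_KEY" \
         --header "x-archive-auto-make-bucket:1" \
         --header "x-archive-meta-mediatype:data" \
         --upload-file 00000-09999.tar \
         http://s3.us.archive.org/FileplanetFiles_00000-09999/00000-09999.tar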
12:15 πŸ”— Schbirid uploading the first one
12:15 πŸ”— Schbirid at 1 megabyte/s :)
12:18 πŸ”— Schbirid yeah, sweet. item creation worked. http://archive.org/details/FileplanetFiles_00000-09999 (still uploading)
12:30 πŸ”— Schbirid http://archive.org/details/FileplanetFiles_00000-09999 is done
12:40 πŸ”— SmileyG :O
12:40 πŸ”— SmileyG :D
13:22 πŸ”— Schbirid ugh, tar really could use some modernisation (i am a heretic)
13:23 πŸ”— Schbirid i seriously cannot make multivolume archives non-interactively (without a script)? :(
13:26 πŸ”— ersi you what
13:27 πŸ”— Schbirid i want to tar some files. i want to have not one big tar, but several smaller (to ease uploading).
13:29 πŸ”— ersi What supports that? zip? rar?
13:31 πŸ”— nobody1 use split
13:32 πŸ”— ersi Not really the same thing, since that can split in the middle of a header or some shizzle like that
13:32 πŸ”— ersi but it's certainly something to consider using instead
13:32 πŸ”— nobody1 tar something | split -b 2MB
13:32 πŸ”— nobody1 and then cat * > file.tar
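Spelled out, nobody1's split approach with a bigger piece size; the range name is an example, and the two-character suffixes split generates sort correctly for reassembly:

    # stream the tar straight into 500 MB pieces (40000-49999.tar.aa, .ab, ...)
    tar -cf - 40000-49999/ | split -b 500M - 40000-49999.tar.
    # on the receiving end, concatenate the pieces in glob order
    cat 40000-49999.tar.?? > 40000-49999.tar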
13:34 πŸ”— Schbirid it seems like what i really want to use is http://aws.typepad.com/aws/2010/11/amazon-s3-multipart-upload.html
13:34 πŸ”— Schbirid http://archive.org/help/abouts3.txt says "We support multipart uploads."
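The multipart flow abouts3.txt refers to is the standard S3 one; a compressed curl sketch, where BUCKET is a placeholder and UPLOAD_ID plus the part ETags come back in the XML responses:

    # 1. initiate; the response XML carries an UploadId
    curl -X POST --header "authorization: LOW $IA_ACCESS_KEY:$IA_SECRET_KEY" \
         "http://s3.us.archive.org/BUCKET/big.tar?uploads"
    # 2. upload each piece as a numbered part
    curl -X PUT --header "authorization: LOW $IA_ACCESS_KEY:$IA_SECRET_KEY" \
         --upload-file piece.001 \
         "http://s3.us.archive.org/BUCKET/big.tar?partNumber=1&uploadId=UPLOAD_ID"
    # 3. complete with an XML manifest of part numbers and ETags
    curl -X POST --header "authorization: LOW $IA_ACCESS_KEY:$IA_SECRET_KEY" \
         --data @parts.xml \
         "http://s3.us.archive.org/BUCKET/big.tar?uploadId=UPLOAD_ID"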
13:35 πŸ”— nobody1 what about rsync?
13:35 πŸ”— Schbirid wait, i totally did this before, didn't i
13:36 πŸ”— Schbirid nah, don't think i can use that for archive.org
13:40 πŸ”— Schbirid or not
13:40 πŸ”— Schbirid wasn't there something where archive.org would automatically join split files? i vaguely remember something
14:08 πŸ”— Nemo_bis Schbirid, yes that's it
14:08 πŸ”— Nemo_bis but it's a way to upload a single file faster
14:09 πŸ”— Nemo_bis not to join tars or so
14:23 πŸ”— Schbirid yeah, noticed that :\
14:23 πŸ”— Schbirid guess i will move the big ones to my server first
14:25 πŸ”— Nemo_bis Schbirid, what's the point?
14:27 πŸ”— Schbirid that i dont want my poor connection saturated for 6 days
14:29 πŸ”— Schbirid heh, i should just redownload that chunk on the server
14:34 πŸ”— Schbirid next slice is up http://archive.org/details/FileplanetFiles_10000-19999
14:57 πŸ”— lrkj I've been grabbing images from nuke messages (from the warez scene): http://www.mediafire.com/?3br2369bs58td - these are the ones that were still online and for which I had a URL (from doopes.com)
15:09 πŸ”— LordNlptp http://nesdev.parodius.com/bbs/viewtopic.php?p=92701#92701 <- argh is that someone from here? i hope not...
15:09 πŸ”— LordNlptp you can just ASK koitsu for a site image
15:18 πŸ”— DFJustin someone asked him and he got butthurt
15:46 πŸ”— shaqfu Schbirid: Wiki page for Fileplanet is up
15:47 πŸ”— shaqfu If you'd like, I can let you use my wiki login to keep the file mark updated
15:47 πŸ”— balrog_ DFJustin: *sigh* really?
15:48 πŸ”— shaqfu balrog_: You weren't around for that episode?
15:48 πŸ”— balrog_ no?
15:49 πŸ”— shaqfu I tried to wget a list of x.parodius.com domains and was blacklisted
15:49 πŸ”— balrog_ that was you that he was complaining about in that post? :/
15:49 πŸ”— shaqfu Someone asked the admin for help, and he got "ethically offended" or something
15:49 πŸ”— shaqfu balrog_: Yeah :P
15:50 πŸ”— balrog_ first of all, did anyone use the wiki downloading tools to properly mirror any wiki content?
15:50 πŸ”— balrog_ "such utilities get stuck indefinitely on forums/boards"
15:50 πŸ”— balrog_ I can see that happening ...
15:50 πŸ”— balrog_ what I'm afraid is that some of the domain/host maintainers are no longer around
15:50 πŸ”— balrog_ and therefore won't move their sites :/
15:51 πŸ”— yipdw I thought people who were downloading that site were pacing themselves
15:51 πŸ”— shaqfu My connection is slow enough that it auto-paces :P
15:51 πŸ”— balrog_ shaqfu: um no
15:51 πŸ”— yipdw obviously not slow enough for that guy
15:51 πŸ”— balrog_ it's not about speed as much as connection rate
15:51 πŸ”— yipdw his graphs show outbound traffic at around 400-600 kbit/s
15:51 πŸ”— balrog_ or "hammering"
15:51 πŸ”— yipdw that too
15:52 πŸ”— balrog_ use --random-wait
15:52 πŸ”— shaqfu Yeah, that's what I didn't use
15:52 πŸ”— balrog_ together with --wait
15:52 πŸ”— shaqfu It was the sustained connection that did me in
15:52 πŸ”— balrog_ yeah
15:52 πŸ”— balrog_ are you still blocked?
15:52 πŸ”— shaqfu Yep
15:52 πŸ”— balrog_ :[
15:53 πŸ”— balrog_ and why did he get "ethically offended"?
15:53 πŸ”— balrog_ because that's "other people's stuff"?
15:53 πŸ”— shaqfu Yep
15:53 πŸ”— balrog_ bleh
15:53 πŸ”— balrog_ well do we have a list of what's archived and what isn't
15:54 πŸ”— shaqfu IIRC we're going to wait until the domain's about to close, then go through again
15:54 πŸ”— balrog_ I wouldn't wait that long
15:54 πŸ”— balrog_ since from what it looks, he'll block anyone who causes above-normal usage
15:55 πŸ”— balrog_ you'd have to scale it so that the extra usage is undetectable
15:55 πŸ”— balrog_ like 1 request per 20-40 seconds
15:55 πŸ”— balrog_ which means, painfully slow archiving
15:55 πŸ”— balrog_ also some sites are already taking stuff down
15:56 πŸ”— yipdw also, if you haven't been there before, it's probably not worth doing
15:56 πŸ”— balrog_ I've been there now and again
15:56 πŸ”— yipdw from that post I gather that that admin is crazy enough to go through his access logs
15:56 πŸ”— yipdw and check for access patterns in that
15:56 πŸ”— balrog_ yeah :|
15:56 πŸ”— yipdw in other words, fuck him
15:56 πŸ”— balrog_ as I said, if you keep usage down enough so that it doesn't cause a spike in the graphs... he won't see it
15:57 πŸ”— balrog_ some wget speed limits and wait times should prevent this
15:58 πŸ”— balrog_ limit to 20 KB/s and 20-40 second wait
15:58 πŸ”— balrog_ will take weeks to archive
15:58 πŸ”— balrog_ also hit the smaller sites first
15:58 πŸ”— balrog_ the ones that don't have forums
15:58 πŸ”— balrog_ leave the worst (ones with forums) for last
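Putting balrog_'s numbers into one wget invocation, as a sketch; the host is a placeholder, and --random-wait varies each pause between 0.5x and 1.5x the --wait value, so --wait=30 lands in roughly the 15-45 second range:

    # slow, polite mirror: ~20 KB/s cap plus long randomized pauses
    wget --mirror --page-requisites \
         --wait=30 --random-wait \
         --limit-rate=20k \
         http://somehost.parodius.com/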
15:59 πŸ”— yipdw it'll probably have to be you or someone else who has a pattern of previously visiting that site
15:59 πŸ”— balrog_ also, check like http://www.foxhack.net
15:59 πŸ”— balrog_ "So please don't go around making backups of my site's contents. Doing it without permission isn't cool." - basically, don't release mirrors of sites that are not going away
15:59 πŸ”— balrog_ yipdw: I don't think he'll check if he doesn't see a spike in the graphs
16:00 πŸ”— balrog_ shaqfu: can you send me the *.parodius.com list?
16:00 πŸ”— balrog_ is it what appears as "Hosts" on parodius.com?
16:05 πŸ”— shaqfu balrog_: Plus others
16:05 πŸ”— shaqfu aggro may have put it on the wiki; let's see...
16:05 πŸ”— shaqfu Yeah, that's the list
16:06 πŸ”— balrog_ basically, people have to go to each one and see what's there
16:09 πŸ”— shaqfu I think SketchCow may have gotten in touch with koitsu; you may want to ask him what the plan is right now
16:09 πŸ”— balrog_ imho the most important ones are the static sites whose maintainers have disappeared
16:12 πŸ”— shaqfu Anyway, I gotta run for a bit
16:12 πŸ”— shaqfu Schbirid: Let me know if there's anything you need updated on the wiki
16:12 πŸ”— shaqfu (I think we're going to need a more efficient way to do this, esp. at higher numbers)
16:55 πŸ”— mw_gt free comic book kh43.com
16:56 πŸ”— Schbirid shaqfu: so far it is just me so i am comfortable using my own site to track it atm
16:56 πŸ”— Schbirid but yeah, if we want to get it all, then it will be hell
17:04 πŸ”— brandonbe http://www.reddit.com/r/AnythingGoesVideos/comments/tbby1/sabrina_johnson_fucked_by_2000_cocks_for_2_days/
17:06 πŸ”— SketchCow I assume he wants us to archive all the cocks
17:07 πŸ”— SketchCow All these cocks will be lost in time, like tears in rain
17:07 πŸ”— SketchCow Wait, this isn't rain
17:07 πŸ”— green3 that's not milk
17:07 πŸ”— SketchCow Oh goddamnit it isn't rain IT ISN'T RAIN GETITOFFGETITOFF
17:08 πŸ”— DFJustin I don't think getting off is the problem here
17:08 πŸ”— mistym SketchCow: And I'll never find that recipe again
17:09 πŸ”— mistym I've got to say, these spambots are pretty weak. I really have to wonder who would be convinced by the "free comic book kh43.com" one earlier
17:10 πŸ”— green3 why was the reddit link spammed?
17:10 πŸ”— SketchCow So you'd go to it and click on the link of the story.
17:15 πŸ”— green3 oh
17:17 πŸ”— Coderjoe is it calpis?
17:17 πŸ”— Schbirid i kinda want to watch that video ¬_¬
17:19 πŸ”— SketchCow http://www.youtube.com/watch?v=a4NCnH7RPZY watch that instead
17:21 πŸ”— Schbirid nice
17:21 πŸ”— Schbirid vocals are super quiet though
17:37 πŸ”— SketchCow They're prominent here. It's your fault, somehow
17:39 πŸ”— oli what a fantastic video SketchCow
17:39 πŸ”— oli i love it :D
17:45 πŸ”— Schbirid 55k-60k here i come
17:46 πŸ”— shaqfu I dread to think of how large the entire FP set is
17:46 πŸ”— shaqfu 500GB, at least
17:46 πŸ”— Schbirid ha
17:47 πŸ”— Schbirid well, unless there is a gap somewhere, there are ~200k to go. the last 5k were almost 25G. so if that size doesn't increase it would be 1TB
17:48 πŸ”— Schbirid but the previous 10k were <20G, so this might make a big jump
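For the record, that estimate works out as: ~200,000 IDs remaining at 5,000 per slice is 40 slices; at ~25G per slice, 40 x 25G = 1000G, i.e. about 1TB.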
17:48 πŸ”— shaqfu Schbirid: How many legit files do you have?
17:49 πŸ”— Schbirid http://www.quaddicted.com/forum/viewtopic.php?pid=251#p251
17:49 πŸ”— shaqfu Okay, about a quarter done
17:49 πŸ”— shaqfu Wasn't sure if it was denser at the end or beginning
17:54 πŸ”— Schbirid i wonder if i can manage to mount the tar files from archive.org with fuse
17:56 πŸ”— Schbirid that would be stupid, i could just link to the tar viewer instead
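For local copies there is a ready-made FUSE route: archivemount exposes a tar as a read-only filesystem; the mount point here is an example:

    # mount a slice tar and browse it like a directory
    archivemount FileplanetFiles_00000-09999.tar /mnt/fp
    ls /mnt/fp
    fusermount -u /mnt/fp   # unmount when done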
18:51 πŸ”— shaqfu Sigh; barring a miracle, the Arneson papers will end up split in private collections...
18:56 πŸ”— shaqfu So much for effective research into early game design
18:59 πŸ”— Schbirid http://archive.org/details/FileplanetFiles_40000-49999 done
19:07 πŸ”— shaqfu Schbirid: Do you plan on updating that forum thread regularly?
19:07 πŸ”— Schbirid sure
19:08 πŸ”— shaqfu I'll link to it then
19:08 πŸ”— Schbirid people can post without registering too, just need to know/google some fact about quake (spam protection)
19:08 πŸ”— Schbirid where?
19:09 πŸ”— shaqfu http://archiveteam.org/index.php?title=Fileplanet
19:09 πŸ”— Schbirid cheers
19:09 πŸ”— Schbirid 87,190 is underscor's number i guess?
19:10 πŸ”— shaqfu Yeah
19:10 πŸ”— Schbirid nice
19:10 πŸ”— shaqfu Kinda surprised that ~60% of IDs are dead
19:13 πŸ”— Schbirid yeah
19:13 πŸ”— Schbirid that often happens though
19:21 πŸ”— Nemo_bis Schbirid, add a fileplanet keyword at least :)
19:35 πŸ”— bbot_ SAN FRANCISCO (MarketWatch) - Patti Hart, the Yahoo Inc. board member who chaired the committee that led to the hiring of Chief Executive Scott Thompson, is stepping down, AllThingsD reported Tuesday, citing unnamed sources.
19:40 πŸ”— Nemo_bis ?
21:05 πŸ”— SketchCow Wow
21:05 πŸ”— SketchCow That's crazy
22:09 πŸ”— shaqfu This may be problematic; the .html pages being pulled off FP don't contain metadata...
22:12 πŸ”— shaqfu It needs to grab /fileinfo/ also
