[01:54] I haven't looked at what is written in github yet, but this might be useful: http://multimedia.cx/eggs/multiprocess-fate-revisited/
[03:42] grr
[03:44] hi
[03:44] sup
[04:17] hi
[04:26] heyy
[04:31] what you up to S[h]O[r]T
[06:05] not much, was playing battlefield 3, sleeping now
[06:05] started doing some memac stuff again
[06:47] MORNING
[06:49] morning
[06:49] SketchCow, quick pm?
[08:08] good morning SketchCow
[13:44] http://archive.org/stats/s3.php some beta s3 stats
[13:44] pretty colors :D
[13:49] mr leaky man
[13:49] leakyscor
[13:50] pretty colours indeed
[14:11] 11 episodes of crankygeeks to go
[14:12] haha, great name
[14:14] * SmileyG ponders if he can watch at work
[16:46] does wget-bzr save the commandline used to the warc file?
[16:46] might be handy information
[16:47] If you're saving to a WARC, then yes. It adds a WARC header; if you do less on the WARC you'll see it. (There's a lot of text. Each piece of data you save has a header, which is plain text - even if the data is gzipped.)
[16:48] awesome
[16:49] heh, eeek. now it will be forever known that it was me who mirrored the gamespy forums
[16:49] i used a user-agent with contact address
[16:50] Well, the WARC-header part isn't displayed publicly if you use a tool like Wayback, warc-proxy or such
[16:51] yeah, i am actually fine with it
[16:51] I bet it's scrubbable - but I recommend always setting the user agent yourself - just in case :)
[16:51] the only worry would be gamespy being angry
[16:51] but IGN is so dumb, they probably won't even notice
[17:38] HEY GANG
[17:38] OK, great talk/keynote given
[17:38] Did us well.
[17:38] Favorite line: "We are not hackers, we're just very, very energetic."
[17:39] Anyone look at Meeblo yet?
[17:42] Meebo, I mean.
[17:44] SketchCow: legalize was looking for you in #discferret
[18:14] SketchCow: I uploaded all crankygeeks onto archive.org
[19:50] night
[20:48] Hmm, anyone actually gotten their OVH giveaway server?
[20:50] underscor: yeah, Schbirid actually has one
[20:51] hmm
[20:51] wonder why mine's taking so long
[20:56] they know what you're intending to do with it
[21:21] WHERE'S THE HUGS
[21:25] The hugs are in a bag over there, not many left though, grab em while you can.....
[21:25] also, "site:fileplanet.com inurl:hosteddl" returns 6400 results, seems small, is that really all we need...
[21:25] Oh god, bag hugs
[21:27] 4300*, hmm
[21:35] http://lanlsource.lanl.gov/hello is interesting
[21:42] hrm
[21:56] Anyone here working on the fileplanet archive?
[22:36] SketchCow: did you ever do a post on your workflow for capturing VHS tapes for GDC? I'm doing some now, and I would be interested to know what your workflow was
[22:46] how about using: inurl:fileplanet.com inurl:hosteddl
[22:46] unfortunately, google tailors search results to the user, as well as gives different results depending on what cluster you ask
[22:49] Coderjoe, that gives roughly the same, give or take a few results
[22:50] i was given "about 5460" results. they might vary a bit compared to yours, however
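Following up on the WARC discussion above (around [16:47]-[16:50]): each WARC record starts with a plain-text header block even when the payload is gzipped, which is why paging through the file with less shows it. Below is a minimal sketch of inspecting those headers programmatically, assuming the warcio Python library; the filename and the idea of pulling the User-Agent out of request records are illustrative, not something the chat prescribes.

```python
# Minimal sketch, assuming the warcio library (pip install warcio).
# The filename below is hypothetical.
from warcio.archiveiterator import ArchiveIterator

with open('gamespy-forums.warc.gz', 'rb') as stream:
    for record in ArchiveIterator(stream):
        # Every record carries a plain-text WARC header, even if its payload is gzipped.
        print(record.rec_type, record.rec_headers.get_header('WARC-Target-URI'))
        # Request records also preserve the HTTP headers that were sent,
        # e.g. a custom User-Agent with a contact address.
        if record.rec_type == 'request' and record.http_headers is not None:
            print('  User-Agent:', record.http_headers.get_header('User-Agent'))
```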
[22:52] yeah that threw me a little too, see http://support.google.com/webmasters/bin/answer.py?hl=en&answer=70920
[22:53] i managed to get only 550 actual results without dupes; seems they're no longer indexing as many as 4000+. may have to find another way to get the list, as I'm sure there are more than 550 files hosted
[22:53] it would be nice if there were a way to tell google search that you don't care about speed
[22:54] yeah, they don't let you list more than 100 results per page, and that's only due to speed, silly google
[22:55] whee
[22:55] their help no longer matches actual behavior, it seems
[22:56] http://support.google.com/websearch/bin/answer.py?hl=en&p=adv_sitespecific&answer=1734233
[22:56] link: does not appear to work. google asks me "did you mean?" and shows me other crap
[22:59] oh, and NOW it works
[22:59] but doesn't show anything
[22:59] google overall is becoming a little too user friendly, I haven't really taken notice of the changes in functionality until now.
[23:00] link:fileplanet.com/hosteddl shows no results
[23:00] yes
[23:01] nope, none
[23:01] I have been hating on google for over a year. it seems impossible to find anything I'm looking for anymore
[23:01] hmm, we'll see what Schbirid says when he gets up, i sent him the results but I'm still convinced there should be more. coffee is needed, brb
[23:01] *slams head* argghhhh
[23:02] drives me nuts when i run into stuff like this
[23:03] entire news backlog increments backwards in 2-day iterations, no pagination, and only a selectbox form for navigating the archive
[23:03] so now i have to build a 2-day incrementing list of links that dates back 13 years
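A minimal sketch of how the 2-day-incrementing link list mentioned at [23:03] could be generated: step a date backwards two days at a time for roughly 13 years and format each step into an archive URL. The base URL pattern and the starting date are placeholders, since the site in question isn't named in the chat.

```python
# Minimal sketch: build a 2-day-incrementing list of archive links going back ~13 years.
# BASE and the starting date are hypothetical placeholders.
from datetime import date, timedelta

BASE = 'http://news.example.test/archive?date={:%Y-%m-%d}'

def two_day_steps(start, years=13):
    """Yield dates from `start` backwards in 2-day steps, stopping `years` years earlier."""
    stop = start.replace(year=start.year - years)
    d = start
    while d >= stop:
        yield d
        d -= timedelta(days=2)

urls = [BASE.format(d) for d in two_day_steps(date(2012, 6, 1))]
print(len(urls), 'links, oldest:', urls[-1])
```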
[23:05] https://duckduckgo.com/?q=site%3Afileplanet.com%20inurl%3Ahosteddl&kp=-1
[23:06] fyi on fileplanet: most of the results are uncrawlable
[23:07] so what's coming up on search engines or fileplanet itself is just a fraction of what's really there
[23:07] because they tried to redo fileplanet but just abandoned the project
[23:08] if you try to navigate to Q3, in fileplanet itself, there's barely anything that comes up
[23:08] just a few maps, a few skins etc
[23:08] it's like the DB is incomplete
[23:09] but, the files are still accessible, if you can find the links to them
[23:10] it's a real mess
[23:11] instence_, my thought exactly, but I'm still not sure why... gahh, looks like I landed on a frustrating project =/
[23:11] the general files have been downloaded, afaicr. what we're trying to get are hosted downloads
[23:11] general files have been handled by just incrementing through the fileID space
[23:13] another thing, other than user content, that we could probably help with as a whole: drivers. for example, abit is gone. if someone should need drivers for one of their boards, where do they turn?
[23:14] okay so fileplanet.com/1000/download
[23:14] you just incremented through that?
[23:14] (typing with just my left hand, right wrist broken)
[23:15] yes, that's what they did, iirc
[23:15] ok cool
[23:15] i'm going to try and poke at this via another route to have some more specific archives with more context
[23:16] like, if you go to planetquake, the files section, those are all hosted in fileplanet, but the files section at PQ has juicy info about each file
[23:16] actually, I think they hit the fileinfo pages as well for the metadata
[23:17] how does one view that off FP
[23:17] ok cool
[23:17] er sorry
[23:17] ok cool, sounds good
[23:19] i haven't had much time to look into the FP contect, since i'm barely picking up where i left off on a project i started last january, but got derailed due to life issues
[23:19] FP content*
[23:20] life sucks
[23:21] always getting in the way
[23:21] yea, it's hard to even motivate myself to work on tasks. i should be working on fixing life problems, but my arm is broken and there's jack shit i can do *sigh*
[23:22] but i'm forcing myself to work on some projects anyway, since i know eventually i would want to finish them when my state of mind is in a better place
[23:23] and the data may not be around by the time that happens
[23:23] so i'm sitting here archiving, cursing the universe at the same time
[23:25] so let's say there is this file:
[23:25] http://www.fileplanet.com/37595/download/
[23:25] what would the file info url be for it?
[23:26] oh wait i think i see
[23:26] bingo got it
[23:26] http://www.fileplanet.com/37595/30000/fileinfo/
[23:27] so if the id is 25689, the second number has to be 20000
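Pulling together the FilePlanet URL patterns discussed above: general files were reportedly grabbed by incrementing through the fileID space (fileplanet.com/ID/download/), and the fileinfo page appears to use the ID rounded down to the nearest 10000 as its second path segment (37595 -> 30000, and per the last message, 25689 -> 20000). A minimal sketch under those assumptions; the example ID range is arbitrary, and the real extent of the ID space isn't stated in the chat.

```python
# Minimal sketch of the FilePlanet URL patterns described above.
# The example ID range is arbitrary; the true bounds of the ID space are unknown here.

def download_url(file_id):
    return 'http://www.fileplanet.com/{}/download/'.format(file_id)

def fileinfo_url(file_id):
    # Second path segment looks like the ID rounded down to the nearest 10000:
    # 37595 -> 30000, 25689 -> 20000.
    bucket = (file_id // 10000) * 10000
    return 'http://www.fileplanet.com/{}/{}/fileinfo/'.format(file_id, bucket)

# Enumerating part of the ID space, e.g. to feed a URL list to a downloader:
for file_id in range(37593, 37598):
    print(download_url(file_id))
    print(fileinfo_url(file_id))
```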