[00:46] http://www.fileformat.info/info/unicode/char/1f4a9/index.htm
[00:52] Who's extending their Unicode coverage today!? OOOOH YOU BET I AM!
[00:55] hah
[00:55] that's one I hadn't seen
[01:01] Anyone in here have a fitbit?
[01:01] I'm considering getting one, as it seems like it would actually get me off my ass
[01:13] underscor: for the nerdy types, stuff like that are p.cool
[01:13] :D
[01:13] I don't have a fitbit, but I've used runkeeper on my phone for my various run attempts.. kind of neat graphing that sort of stuff
[01:13] yeah
[01:14] I also have a Zeo sleep tracker, chart my sleep patterns online
[01:14] that's cool
[01:15] I've been using sites like sparkpeople (which gives you points for logging food tracking and logging in), and dailyburn (which integrates with the sleep tracking stuff)
[01:15] but I've also just started messing with fitocracy, which is an exercise tracker that gives you levels and achievements
[01:15] Woah!
[01:15] Zeo looks awesome!!!!!!!!!!!!!!!!!!!!
[01:15] hey, i just got invited to that site too.
[01:15] haha
[01:15] what the hell...
[01:15] You need an invite? >:(
[01:15] bizarre
[01:16] underscor: yeah, still beta. I can send you one if you want
[01:16] yeah, you want one?
[01:16] <3
[01:16] or MorbusIff can
[01:16] abuie@kwdservices.com
[01:16] I don't know either of you, but yuo can join my.. fit.. team.. thing
[01:16] i need your credit car... dammit.
[01:16] friends? fit friends? idkwtf it's called
[01:16] Jofo on there
[01:16] lol
[01:16] MorbusIff: you gonna send it or should I?
[01:16] i will. it's up.
[01:16] k
[01:16] This zeo thing has got to be expensive
[01:16] * underscor checks
[01:17] sent
[01:17] thx
[01:17] underscor: $200 / $300
[01:17] Aw man
[01:17] haha
[01:17] underscor: $300 comes with "coaching" that's supposed to help identify what you're doing wrong with the sleeping
[01:17] There's always christmas!
[01:18] Oh cool, I have to be 18 for fitocracy
[01:18] i swear, this is the last time im rebuilding this damn website
[01:18] I love lying about my age :D
[01:18] if it fails again, im sending a fat stack of DVDs to you archive guys :)
[01:18] hardware docs, ITT
[01:20] the site being thehardwareproject.org (beware of it's infancy, this iteration is a day old)
[01:20] I HAVE A NEW FITNESS FRIEND.
[01:20] heh.
[01:20] :3
[01:20] you wouldn't happen to be stiletto would you lowtekk
[01:21] MorbusIff: :3
[01:21] negative
[01:21] underscor: so you lied about your age, how old are you? heh
[01:21] docs are being merged into the system as we speak
[01:21] 17
[01:21] word
[01:21] (I'm old)
[01:22] also: followed
[01:22] haha
[01:22] I can hook you up with some docs
[01:23] DFJustin: the doors are always open, although the upload bit of the site has yet to be resurrected
[01:23] should be up in a day or two :)
[01:24] Yay, I'm level 3 with the mountain biking I did earlier!
[01:25] beat me with my hurried bodyweight exercise, only lev2
[01:25] anyway stiletto is the main doc hunting guy for mamedev/messdev and it might be worth getting in touch with him at some point http://www.mameworld.info/ubbthreads/showprofile.php?Cat=&User=2148
[01:26] ahh, thanks for the tip, i'll have to do that
[01:26] Man, I reallllllly want one of these now
[01:26] I think he was starting a website but the url escapes me at the moment
[01:26] do you know what the scope of his doc hunting is? types of docs/companies/etc i mean?
[01:27] mainly datasheets for chips (cpu, sound, ...) used in arcade hardware, but there is a lot of crossover with computers of the same era
[01:27] underscor: what's that, fitbit?
[01:27] A zeo
[01:27] oh, haha
[01:28] yeah. it's p.cool. I mostly use it for the "wake me during a light sleep cycle" thing vs. a normal alarm
[01:28] I've wanted something like that since I was like 10
[01:28] he uploaded some stuff to our wiki at http://mess.redump.net/datasheets
[01:28] no joke
[01:28] Jofo: WAIT, IT DOES THAT?!?!?!?
[01:28] omfgt
[01:28] omfg*
[01:29] but there's a lot more on the project ftps
[01:29] I love it when I wake up at a light cycle
[01:29] underscor: quite. you set a window and a wakeup time. It tries to wake you up prior to the wakuptime during your window, else it just wakes you up @ time
[01:29] there's other, cheaper products that do it.. I think
[01:29] But probably not as effective
[01:29] DFJustin: cool, im having a look right now
[01:30] underscor is living in the tron mainframe
[01:30] my site keeps exploding on the backend, i wish i had more to show
[01:30] Because the reading-the-brain-waves thing is pretty advanced
[01:31] DFJustin: definately some overlap
[01:31] stiletto may get a kick out of THP when it's back up
[01:34] underscor: well, there's a watch that does it, but it uses an accelerometer to measure your stages
[01:34] movement = light
[01:34] oic
[01:35] standard is $100
[01:35] "elite" is $150
[01:35] DFJustin: is stiletto in here much?
[01:35] http://www.sleeptracker.com/buy_sleeptracker_s/42.htm
[01:35] war5102_: I've only ever used public transit in los angeles twice. I've been there fewer than 30 days total. I'm a general believer in mass transit.
[01:35] Jofo: Not as cool as something that reads your brain :D
[01:35] I haven't seen him in here but he did mention sketchcow recently so he may be hiding
[01:36] underscor: quite.
[01:36] My roommate has a Zeo. Uses it to track his polyphasic stuff.
[01:36] we've got schematics and things for obscure european computers too, I gotta run out the door now though
[01:36] Ah, Stiletto.
[01:36] I want to become a bi- or tri-phasic sleeper
[01:36] DFJustin: cool, later
[01:36] OK, so how much of this ongoing conversation is archiveteam related.
[01:36] 100%
[01:37] Well my roommate has an archive of sleep data...
[01:37] They need to invent a way to archive dreams
[01:37] woop woop woop off-topic siren
[01:37] And if people are awake for more of the day, more archiving can get done!
[01:38] I disagree.
[01:38] Most of that phasic bullshit turns you into an 80% functioning zombie
[01:39] And miss one of your little dream naps and you're fucked for 3 days.
[01:39] What if you're already an 80% functioning zombie?
[01:39] Also, and don't cry a little tear, you're kind of committing suicide
[01:39] Studies show not getting full rest kills you
[01:39] And all this phasic stuff is like taping down the door jamb so the door seems locked but you're going in and out.
[01:39] Total hack.
[01:39] * underscor whispers, "SketchCow should go to bed earlier then"
[01:40] I go to bed late, and wake up late.
[01:40] Usually noon or 1pm.
[01:40] oic
[01:40] I go to bed late and wake up early
[01:40] :(
[01:40] I don't get up at 9am after bedding at 5am
[01:40] I tried that
[01:40] I don't have jobs that require that. They're all achievement based now.
[01:40] Didn't work very well
[01:40] Oh, you can do it in an emergency.
[01:41] Not for a week straight
[01:41] No.
[01:41] changing phase the stupid way
[01:41] anyway Lowtekk feel free to lurk in #messdev since it's kinda off-topic
[01:41] NONE OF THIS IS ABOUT ARCHIVETEAM
[01:41] * underscor hides
[01:41] I've got something somewhat archive related- I've stumbled onto an old DOS BBS-hacking game, and hoping someone can tell me more about it- it appears you really need to hack it in order to get anywhere
[01:41] SketchCow: Here, have a deepfried oreo
[01:41] That's not archive related.
[01:41] That's just finding something and needing tech support
[01:41] oh well
[01:41] I'm in the process of moving the data off flophouse and blindtiger wholesale.
[01:42] We have a lot of data.
[01:42] I am about to start uploading friendster fun-paks to the archive, too.
[01:42] http://www.archive.org/search.php?query=collection%3Ablip-magazine&sort=-publicdate
[01:42] Ooh, those will be a good stress test
[01:42] here.
[01:43] Every one a gem.
[01:43] If someone felt like sending me the table of contents for each issue, they'd be my hero.
[01:43] There's only 8 issues.
[01:43] Wait, 7.
[01:44] SketchCow's got a lovely bunch of data... http://i.imgur.com/8Z59f.png
[01:44] I'd volunteer, but I'm busy with my Linux Magazine and Linux Format metadata :(
[01:45] You do enough as it is.
[01:45] :D
[01:45] Besides, I'm already your hero!
[01:45] Where's jch, he's always whining about nothing to do.
[01:45] ;_
[01:45] ;)
[01:48] grazie
[01:48] y'welcom
[01:48] Yay, only 18 hours left on the first file in this rsync
[01:48] I should have2TB with m to the archive
[01:49] lol
[01:49] should have brought a*
[01:56] Adding in Commodore DiskUser now.
[02:02] It's amazing how many commdodore magazines survived into the 1990s
[02:02] SketchCow: I actually had a thought that may be answered elsewhere, but: if someone's friendster page was archived, would you delete it upon their request?
[02:02] No
[02:03] Do you need me to sugarcoat that?
[02:03] Man, you look great today.
[02:03] No
[02:03] not even the pictures, which if taken by them, are copyrighted?
[02:04] not trying to start shit, just curious
[02:04] Nope.
[02:04] Want to walk through a dozen scenarios?
[02:04] Pictures of abusive husband
[02:04] Notes in which they give their home address
[02:04] Political leanings revealed in shout outs
[02:04] Checking...
[02:04] ...no
[02:06] well, generally on sites like this, agreements are signed that generally forfeit copyright claims against the site itself.. but how would that hold up for a third party?
[02:06] <- not a lawyer, just curious.
[02:06] Yeah, here's the thing.
[02:06] There's really no point in discussing this.
[02:06] There's no benefit.
[02:07] So let me go back to the original answer.
[02:07] Man, you look great today.
[02:07] No.
[02:07] I do, thank you :)
[02:07] no amount of sugar coating can fix you, Jofo
[02:08] and the benefit could be just thinking about what could happen if someone intent on having their content removed decides to raise hell
[02:08] although you missed the donuts sitting on your desk this morning
[02:08] So, are you here to help archive team, Jofo, or are you just hanging around?
[02:08] Because otherwise, get the fuck out. Now.
[02:09] I plan on helping once I get through my basic python self study I'm doing, and/or there's something I notice in here that I could help with
[02:09] Then take my delicious, helpful advice.
[02:09] Shut the fuck up
[02:09] Don't walk through 'scenarios"
[02:09] Help the people who can't be helped
[02:10] 407GB 1:32:20 [75.2MB/s] [ <=> ]
[02:10] Now that's a tarfile
[02:15] http://www.youtube.com/watch?v=y-AXTx4PcKI&feature=related
[02:15] Instead of "Closing", imagine "Archiving"
[04:12] Nightmarish metadata adding now. My own fault
[04:52] Ops, please.
[05:25] are you aware of this SketchCow http://www.retromags.com/
[05:34] Yes
[05:45] OK, who has a megaupload account.
[05:46] I had one, but I think it expired or whatever
[05:47] yeah
[05:48] i do
[05:50] need stuff downloaded?
[05:53] Well, yes.
[05:53] retromags.com has some stuff.
[05:53] But now that I'm doing stuff, I am worried about sending someone to piecemeal download
[05:53] When I might have full archives from elsewhere.
[05:54] Because to be honest, I am POURING in magazines
[05:56] You are raining magazines?
[05:56] I mean, I am not sure where I am now, but it's going to be a thousand before long.
[05:56] issues, not individual runs.
[06:00] damn
[06:00] That's awesome!
[06:01] I've been given allowance to blow most of the week on this.
[06:02] I an do an awful lot in a week
[06:02] Allowance?
[06:02] oh
[06:02] I got a job, kid
[06:02] I thought your job was to do stuff like this
[06:25] Uh oh, addiction has set in
[06:25] Now I'm adding magazines like a junkie
[06:25] The most nerdy, bookish junkie ever
[06:33] what's your recommended solution for scraping asp.net sites?
[06:34] Right now I'm using a hacky solution with a mix of php, powershell and Burp Suite
[06:34] but it's pretty tiresome :(
[06:34] I can't answer. Can someone answer?
[06:34] Bear in mind it is late, not everyone's here.
[06:43] inv: lib-www-mechanize, in your favorite language, + random shit to make the posts happen
[06:44] asp.net is assballs to scrape
[06:44] I feel your pain
[06:46] another possibility might be selenium driving a browser
[06:47] Yeah I'm using internet explorer + powershell to manually click on things now
[06:47] that's working pretty well
[06:47] (controlling it via COM, using internetexplorer.application)
[06:47] so it's not visible on screen etc
[06:47] but I'm having some issues reading the DOM of iframes from powershell for some reason.
[06:49] semi-related question: does anybody know about an http library that supports pipelining AND multiple concurrent connections? I've found one for erlang called ibrowse, but erlang is not exactly easy to use :)
[06:53] There's a ruby one called typheous or something that does that, iirc
[06:54] http://www.pauldix.net/2009/05/breath-fire-over-http-in-ruby-with-typhoeus.html
[07:29] Speaking of annoying mandatory interactivity, is there a way to make mediafire bleed files without a captcha getting in the way occasionally?
[07:34] I mean, I guess it's not too much to pay for a month as long as I make a plan of attack in advance.
[07:39] Jofo: We don't care a shitflying fuck about copyfuckyrightandwrong, just so you know
[07:40] SketchCow: That's an amazing clip, I should watch that movie again
[10:34] anyone have experience with scanning microfiche?
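On the asp.net scraping question at 06:33: the "random shit to make the posts happen" is mostly the hidden form fields that ASP.NET WebForms pages carry (`__VIEWSTATE`, `__EVENTVALIDATION`), plus the `__EVENTTARGET` / `__EVENTARGUMENT` pair that a `__doPostBack()` link fills in. A minimal sketch of replaying such a click, assuming a WebForms page; the sample HTML, the field values and the `grid$next` control name are made up for illustration and do not come from the log.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback(page_html, event_target, event_argument=""):
    """Return a urlencoded POST body that imitates a __doPostBack() click."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    fields = dict(parser.fields)
    # These two fields are normally set by the page's JavaScript.
    fields["__EVENTTARGET"] = event_target
    fields["__EVENTARGUMENT"] = event_argument
    return urlencode(fields)


# Hypothetical page fragment, invented for this example.
SAMPLE_PAGE = """
<form method="post" action="list.aspx">
<input type="hidden" name="__VIEWSTATE" value="dDwtMTIz" />
<input type="hidden" name="__EVENTVALIDATION" value="AbC123" />
<a href="javascript:__doPostBack('grid$next','')">Next</a>
</form>
"""

body = build_postback(SAMPLE_PAGE, "grid$next")
print(body)
```

The body would then be POSTed back to the same page with whatever HTTP client is at hand, and the hidden fields re-read from each response before the next request.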
[11:05] SketchCow: megaupload is easily downloaded with jdownloader
[13:02] chronomex: SketchCow might know someone at the archive
[13:02] There's a whole team of people
[13:02] just for microfiche
[15:01] Last day to download Google Groups stuff!!!!
[15:02] Is there any groups stuff left?
[15:02] I'm getting errors from the tracker, but a few days ago when it was working, I was pulling stuff down.
[15:06] I'm getting error 444 from the download tracker too. The discovery tracker still has work left. (The second or third round, I guess.) I'm running a few of those now.
[15:06] Same here.
[15:12] alard: approx how much stuff do you have on-disk from the ggroups attack?
[15:13] Here: 12GB. The rest is with SketchCow.
[15:14] That's probably more, but I don't know how much it is. I've rsynced that during downloading, when I had a vps with unlimited bandwidth but very little disk space.
[15:15] I've got about 600GB sitting on my drobo at home.
[15:15] That's totally amazing! Thanks!
[15:15] That's a lot.
[15:16] Ok, then I'm shutting down all of my ggroups crawlers.
[15:17] What's next on the list? :)
[15:18] Well, you could do laps of honor ;-)
[15:18] The next thing would be to pull the info from the about pages and compile an index for files out of it
[15:19] ... preferably in TXT and HTML format
[15:19] ha ha
[15:19] I'm just too busy ATM
[15:19] I'll check out the indexing.
[15:20] :-)
[15:21] Hurray, I just downloaded a few more groups!
[15:21] Apparently there are still a few left.
[15:23] Yes, new ones that have been founded since we have started.
[15:26] I thought google had immediately turned off new file uploads though?
[15:27] nice http://netlabelism.com/hosting-netlabel-releases-at-the-internet-archive
[15:36] * ndurner_o shrugs
[15:42] swebb1: someone mentioned Drobo to me just the other day
[15:43] db48x2: Yea, I like the ability to raid using mixed-size drives.
[15:44] db48x2: also, I run an rsync server on it, so I just rsync to/from it most of the time.
[15:44] I've been looking for a way to do iSCSI (or something similar)
[15:44] db48x2: The larger models do iSCSI, but not the 4 or 5-bay versions.
[15:46] swebb1: do you know if you can turn off the RAID that they advertise?
[15:46] I would want to use ZFS, which wants to access the disks directly, not a slice of a RAID
[15:48] chronomex: People scan microfiche at the archive.
[16:33] db48x2: ZFS is neat and all, but doesn't allow for mixed-size drives, but you can roll your own solution with opensolaris and ZFS that's very capable.
[16:36] true, but I don't really mind that
[16:37] it can do mixed-size vdevs which is good enough for me
[16:39] what I really want is the smallest possible setup that can expose a set of drives as iSCSI targets
[16:57] Yea, I use vmware's ESXi server for my crawler VMs (and other stuff) and have 600GB of storage per VM host, but it would be nice to just grab a pool of storage over iSCSI for them.
[16:57] I don't have iSCSI set up at my place at all at the moment.
[16:57] Just NFS. :)
[16:58] swebb1: what kind of space are you looking to end up with?
[16:59] and do you prefer an appliance or would a small/cheap box with a bunch of drives suffice?
[17:00] err, db48x2 i mean
[17:02] I've got 600GB on both of my vmware ESXi hosts. I've got about 2.7TB usable (4TB raw) on my drobo. I've got a few more linux machines with 40GB-ish each that I use for misc purposes and a web host with good bandwidth. Also, amazon is offering free incoming bandwidth, so I'm figuring that for the next big pull, I'll just spin up a couple of spot instances on AWS and be done with it.
[17:30] Lowtekk: I want to end up with lots of space that I can expand forever
[17:30] Lowtekk: ZFS lets me add vdevs to an existing pool whenever I want, and all the filesystems in the pool can then expand to use the new vdevs
[17:31] I've been looking at fiber channel and iscsi because they both would let me add new drives without taking the others offline to change the wiring
[17:32] iSCSI and linux's volume groups can make storage management really nice and flexible. I agree.
[17:35] In practice I might end up changing the topology of my hubs, or going from gigabit to 10 gigabit ethernet or whatever, but in theory I'd never have to change technologies because I ran out of pci slots or sata channels or room in the rack or got past the maximum length of a cable
[17:39] amen
[17:39] iSCSI overview on youtube for those of you who don't know what know iSCSI is about: http://www.youtube.com/watch?v=QCDiz9C8Vvw&feature=related
[17:40] my webserver on ESX pulling from a Dell Powervault, it's a couple years old but adequate for it's task
[17:40] FreeNAS might be an option for an iSCSI server.
[17:41] cant beat the price
[17:44] FreeNAS offers ZFS, so you get snapshotting and all of the OpenSolaris goodness and don't have to figure out how to manage Solaris, instead you have a web gui for management of it all.
[17:45] meh, web guis are more pain than they're worth
[17:48] but yea
[17:48] now that opensolaris is dead, I'll be sticking with linux
[18:05] Hey
[18:05] Going to be spwnding some time out back in the info cube for the afternoon
[18:09] SketchCow: Have fun and don't get crushed
[20:06] FreeNAS is based on freeBSD, so might be easier to support, I'm not sure. Personally, I think that iSCSI config under linux is not that easy to do either, but on ESXi is way easy, then just mount in linux as a device.
[20:14] Wasn't crushed!!
[20:18] http://www.benpurdy.com/2011/08/minecraft-in-real-life/
[20:40] FreeNAS as an iSCSI server and VMware ESXi using it howto video: http://www.youtube.com/watch?v=Dc20IT1msAk&feature=related
[20:55] Converting the month to a number.
[20:55] Here's what I plan to do.
[20:55] In the collection named ZX-spectrum-magazine...
[20:55] OK, then, ZXComputing_Apr_1986.pdf gets the love.
[20:55] root@teamarchive-0:/3/MAGS/ZXComputing# ./ingestor ZXComputing_Apr_1986.pdf
[20:55] I will add an item called ZX-spectrum-1986-04.
[20:55] I will say this dates to 1986-04.
[20:55] I will give it the title of ZX Spectrum Magazine (April 1986).
[20:55] Yes, I finally had to write a huge script with test cases, what-ifs and the rest.
[20:55] And a config area, don't forget the config area
[20:56] With the old classic, adding a string or no string for yes/no sets
[20:56] config area?
[20:56] # WHAT THE FUCK DO ALL THE COLUMNS MEAN
[20:56] COLLECTION=ZX-spectrum-magazine
[20:56] ITEMPREFIX=ZX-spectrum
[20:56] TITLEPREFIX="ZX Spectrum Magazine"
[20:56] YEARCOLUMN=3
[20:56] SEPARATOR=_
[20:56] MONTHCOLUMN=2
[20:56] WORDMONTH=1
[20:56] BYISSUE=
[20:56] ISSUECOLUMN=
[20:56] TEST=1
[20:56] Now I can just do stuff THERE, and set it for an entire range of magazine
[20:57] ah, you have a different file of that type for each collection
[20:57] The key is, I can now go through a pile of mags without any intervention.
[20:57] I can blast 108 issues Shitbox 2000 Magazine without touching them.
[20:57] I was already fast, now I'll open 10 parallel scripts and blast it.
[20:57] ls
[20:57] cat
[20:57] done
[20:57] ps -aux
[20:57] automation is the best
[20:58] so at what point do they hire your shell script?
[20:59] Don't hire the lemonade
[20:59] hire the guy who makes the lemonade
[21:00] The make him citrus manager
[21:07] * SketchCow burns your house down. With the lemons!
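The ingestor script itself is not shown in the log, so the following is only a sketch of how the quoted config could map a filename to the item name, date and title that the script printed. It assumes that the column settings are 1-based positions in the filename after splitting on SEPARATOR, and that WORDMONTH=1 means the month column holds a three-letter English abbreviation. The function name `plan_item` is hypothetical, and double-month names such as Aug-Sep are deliberately rejected because the log does not say how they are handled.

```python
from pathlib import Path

# Values copied from the config area quoted in the log.
CONFIG = {
    "COLLECTION": "ZX-spectrum-magazine",
    "ITEMPREFIX": "ZX-spectrum",
    "TITLEPREFIX": "ZX Spectrum Magazine",
    "SEPARATOR": "_",
    "MONTHCOLUMN": 2,  # assumed to be a 1-based column index
    "YEARCOLUMN": 3,   # assumed to be a 1-based column index
}

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]
# "jan" -> 1, "feb" -> 2, ... built from the full names above.
MONTH_NUMBERS = {name[:3].lower(): i for i, name in enumerate(MONTHS, start=1)}


def plan_item(filename, config=CONFIG):
    """Work out collection, item name, date and title for one magazine file."""
    columns = Path(filename).stem.split(config["SEPARATOR"])
    month_word = columns[config["MONTHCOLUMN"] - 1]
    year = columns[config["YEARCOLUMN"] - 1]
    month = MONTH_NUMBERS.get(month_word.lower())
    if month is None:
        raise ValueError(f"unrecognised month column: {month_word!r}")
    date = f"{year}-{month:02d}"
    return {
        "collection": config["COLLECTION"],
        "item": f"{config['ITEMPREFIX']}-{date}",
        "date": date,
        "title": f"{config['TITLEPREFIX']} ({MONTHS[month - 1]} {year})",
    }


plan = plan_item("ZXComputing_Apr_1986.pdf")
for key, value in plan.items():
    print(f"{key}: {value}")
```

For ZXComputing_Apr_1986.pdf this yields the same item name, date and title that the script announced at 20:55.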
[21:08] ZXComputing_Apr_1986.pdf ZXComputing_Aug-Sep_1982.pdf ZXComputing_Dec-Jan_1984.pdf ZXComputing_Feb-Mar_1985.pdf ZXComputing_Jun-Jul_1983.pdf ZXComputing_Nov_1986.pdf ZXComputing_Sep_1986.pdf
[21:08] IN THEORY, the script is going to deal with this.
[21:15] heh
[21:16] Here it goes
[21:16] I have it tracking time to see what it does.
[21:17] It's called "Ingestor: The Nerdiest War Robot"
[21:21] Ended Wed Aug 31 14:20:40 PDT 2011
[21:21] Started Wed Aug 31 14:15:20 PDT 2011
[21:21] So a little less than 5 minutes.
[21:21] It added 36 magazines.
[21:21] So I guess that works out.
[21:21] I watched it, but I didn't need to "do" anything.
[21:24] awesome
[21:30] OK, new one running.
[21:30] 76 issues.
[21:30] Adding a new one every 7 seconds.
[21:31] friendster.004300001-004399999.tar.bz2
[21:31] 367420145664 98% 929.69kB/s 2:00:41
[21:31] looks like that file may eventually finish
[22:20] hi SketchCow, I'm hoping to convince my company to donate the old IT materials they're planning to throw out to the Internet Archive, but I don't see any info on physical donations
[22:22] I can take it
[22:22] I can give you an address
[22:22] how much is this
[22:24] gotta go take care of something real quick- tell you more when I get back
[22:35] okay- back
[22:36] right now, a box or so
[22:39] SketchCow: I asked about microfiche more in respect to suggestions for image acquisition; I've got a few thousand film sheets of assembly listings that I'd love to have in digital form.
[23:23] OK
[23:23] Well, there's people and processes at archive.org.
[23:27] Is there an easy way to do an alpha shell expansion in bash?
[23:27] There has to be
[23:27] Something like "mkdir [a..z]"
[23:27] that makes "a" "b" "c" "d" etc
[23:32] python!
[23:32] (or perl)
[23:32] yeay
[23:32] yeah*
[23:32] That's what I ended up doing
[23:33] I was just surprised it's not a builtin
[23:37] Later forms like Bash or so on might have it
[23:37] Not sh
[23:58] I'm using zsh
[23:58] actually, bash for this script
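For the a-to-z directory question at 23:27: bash (3.0 and later) does have this built in as brace expansion, written `mkdir {a..z}` with curly braces. The square-bracket form `[a..z]` is a glob pattern and does not generate names, and plain POSIX sh has no brace expansion. The Python route mentioned in the log could look roughly like the sketch below; the helper name `make_alpha_dirs` and the use of a temporary directory are assumptions made for the example.

```python
import string
import tempfile
from pathlib import Path


def make_alpha_dirs(parent):
    """Create one directory per lowercase ASCII letter under `parent`."""
    parent = Path(parent)
    created = []
    for letter in string.ascii_lowercase:
        path = parent / letter
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demonstrate in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as scratch:
    names = [p.name for p in make_alpha_dirs(scratch)]
    print(len(names), names[0], names[-1])
```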