[00:13] What's all this about upcoming.org?
[00:26] Deleted in 10 days
[01:54] so the top archivers are on many ips to get around ip-based throttling?
[02:02] there's also a lot of delay inherent in making http requests etc that make one instance less efficient
[02:03] ah yes
[02:08] one day ill get a cool mask
[02:08] doh
[02:09] pwn
[02:09] mask pwn
[02:09] darn split screen irssi
[02:09] i often forget to see what channel im typing in
[02:09] doesn't matter
[02:17] i just say 'work'
[02:17] (3879793749374
[02:20] Feels good to have someone go "Do you have any PC Gamer CDs" and I aim him at 223
[02:20] yes bitch
[02:21] commander keen ftw
[02:31] http://scenesat.com/video/3
[02:33] what are we viewing?
[02:33] looks like an awesome viewing room
[02:38] some people talking
[02:39] is this notacon? thats going on this weekend i think
[02:39] apparently its PixelJam 2013 @ notacon yeah
[02:41] * DFJustin headbangs
[02:42] * WiK busts out some glowsticks
[02:43] i only make it out to two cons
[02:43] defcon/bsides and shmoo
[02:49] This is one of only 3 Demoparties that run in the US.
[02:49] And the competition begins in about 5 minutes.
[02:49] So this is worth enjoying while we destroy Upcoming/posterous
[02:49] is there an irc for this
[02:51] Yes
[02:51] #scenesat on EFnet
[02:52] ya, im diggin the tunes
[03:36] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[03:37] HIE THY GOOD SIR
[03:37] The secret word is "yahoosucks"
[03:38] Wow. Not surprised, though. Looks like Upcoming is soon to be Downgoing
[03:38] Yes sir.
[03:40] So, what goes on here in these parts?
[03:40] We plan and plot
[03:40] Discuss and bitch
[03:40] Occasionally, someone comes in and offers us, just a galactic amount of porn malware
[03:40] Seems like any old IRC channel to me.
[03:42] Is Jason going to introduce Project Archive PornMalware? Seems like a good idea. Future historians might want to study that, after all.
[03:44] ...
[03:47] Hmm, I wonder if we'd archive xtube if it went down
[03:47] I don't see why not
[03:48] :D
[03:48] XD
[03:48] Well, I mean
[03:49] a.o already has like tens of TBs of youporn, xtube, et al as part of the crawl data
[03:56] what about literotica?
[04:05] What do you mean mirror xtube
[04:05] Oh, you mean do an rsync from my USB drive here
[04:34] only until manwin goes under would anyone realize how many porn companies and tube sites are under their wings
[08:12] anyone heard of webcitation.org? They will shut down end of 2013 "unless we reach our fundraising goals to modernize and expand this service."
[08:12] "What is WebCite®?
[08:12] WebCite®, which used to be a member of the International Internet Preservation Consortium, is an on-demand archiving system for webreferences (cited webpages and websites, or other kinds of Internet-accessible digital objects), which can be used by authors, editors, and publishers of scholarly papers and books, to ensure that cited webmaterial will remain available to readers in the future. If cited webreferences in journal articles, books etc. are not ar
[08:13] unless you shut down idiots
[08:23] yey blackmail
[08:27] Damn. ”Item events-323250: Step 3 of 7 Yahoo rate limit (error 999). Waiting for 300 seconds...”
[08:28] qwebirc34: it'll keep trying, don't worry about it
[09:22] also getting a lot more 999 now
[09:24] C-Keen: Yeah, most of us have heard of WebCite and their situation
[09:30] joepie91, if you want: it's time for another update of the pypi seesaw package. 0.0.15 this time.
[09:33] The WebCite website says they'll stop accepting new submissions. That's slightly different than shutting down.
[09:48] On osx, is there a simple way to get a second (or [n:th]) appliance running? Seems to fire up fine but gets stuck on the fact that 8001 is already busy. An easy way to tell the httpd to run on another port might help?
[09:57] Check the virtualbox settings, there's a port forwarding option there.
[10:00] could also modify the port in /home/warrior/warrior-code2/warrior-runner.sh
[10:00] But then you still have to change the port forwarding.
[10:04] gotcha, thanks for the tip, I was here looking for the same question
[11:42] Hmm. Didn't work. Updated both the script and the settings in Virtualbox and no go. I'm sure I missed something obvious. Anybody else got it working on OSX?
[14:17] OK, back into the archiveteam.org spam user kill trenches
[14:17] daily penance for not handling the spam problem sooner
[14:36] alard: package is updated
[15:01] joepie91: Thanks.
[15:10] np :)
[16:15] Crap, I 508'd archiveteam.org
[16:15] Wait, nevermind.
[16:16] The machine seems to be hammered a little.
[16:18] Sorry..
[16:18] it's not you.
[16:18] oh.
[18:24] So I picked up a RPi this week to run some archiving jobs on, and was wondering if Warrior could realistically be ported to it
[18:48] doubtful
[18:48] you may be able to run the scripts though.
[18:49] Even if it was just "get a job from tracker, run, rsync back, repeat" that'd be fine
[18:49] Since ATM it looks like it requires far too much RAM
[18:52] shaqfu: yup, thats what you do when you run the scripts manually
[18:52] the ram usage depends on the project, smallest is urlteam I think.
[18:54] Smiley: Hm. Wonder if it'd be useful to have pared-down scripts start on boot, although losing the nice Apache interface isn't fun
[18:54] shaqfu: no no
[18:54] forget the warrior if you're running low power instance
[19:19] Smiley: How much RAM do the basic scripts use?
[19:20] shaqfu: dunno tbh, I don't run em
[19:22] Then don't say that people should forget it, if you don't have any numbers
[19:22] shaqfu: the 'warrior scripts' doesn't use much at all. Most memory will be used by each projects project-code
[19:23] like running wget on different target sites. Some tasks are just bigger and needs more memory. It's feasible to do a Raspberry Pi image - it's just that no one has done so.
[19:23] ersi: Is the Apache instance terribly RAM-consuming?
[19:24] It's not apache. It's python code. It's using the Tornado IO library
[19:24] ie. same as project code - if ran stand-alone.
[19:24] Gotcha
[19:25] Maybe we should continue talking in #warrior :-)
[19:27] ersi: I said forget running the warrior instance, that requires whatever minimum vBox wants, which is 64Mb iirc
[19:29] Gotcha. Well, the "warrior" is a term for distributed automagic worker. But yeah, forget about running the Open Appliance image on a rasppi automagically :)
[20:59] alard: Posterous. I see we're down to about 500,000 left. is that the real number? I'm sorry I keep re-asking this, I just like to be informed before I blather for more help
[21:06] SketchCow: No, the real number is higher. Still to add: 985,392 blogs. But more help won't help.
[21:06] Hmm.
[21:06] So we're just going to lose them
[21:07] I mean, unless I really embarrass them into extending a month
[21:07] Or we have to send more clients to the normal Posterous servers.
[21:08] But last time I checked 'our' servers were very busy (10-20 seconds per request).
[21:10] And these are all not banned, not spam blogs?
[21:12] Yes, it's the filtered list. They're still not very good, on average. As you can see on the tracker most of them are empty.
[21:13] Posterous: Not very good
[21:14] At the moment underscor, k, soult, nwh are using the normal Posterous entrance, the others go to our back door.
[21:17] At some point, as I see it, we have to call it and just zerg rush the servers again
[21:17] I'm giving them until tomorrow evening
[21:18] If vincent or whoever shows up and says they're keeping them up for us past April 30, let me know, of course.
[21:18] (The backdoors)
[21:19] Upcoming seems to be blasting along fine.
[21:19] time curl http://blog.posterous.com/: normal = 1s, ours = 10s,
[21:25] Last statistic for today: 113 / 127 / 142 / 141 / 108 / 118 items per minute on Upcoming.
[21:25] (Where items = 25 events.)
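The Upcoming figures quoted above can be turned into a rough event rate. This is only a back-of-the-envelope sketch in Python: it assumes the six numbers are comparable one-minute samples and uses the stated 25 events per item. The function name is illustrative and not part of any tracker code.

```python
from statistics import mean

# Items-per-minute samples quoted in the log for Upcoming.
ITEMS_PER_MINUTE = [113, 127, 142, 141, 108, 118]
EVENTS_PER_ITEM = 25  # "items = 25 events", as stated in the log


def events_per_minute(samples, events_per_item=EVENTS_PER_ITEM):
    """Return the mean event rate implied by items-per-minute samples."""
    return mean(samples) * events_per_item


if __name__ == "__main__":
    rate = events_per_minute(ITEMS_PER_MINUTE)
    print(f"mean items/min: {mean(ITEMS_PER_MINUTE):.1f}")
    print(f"approx events/min: {rate:.0f}")
```

On these samples the mean works out to roughly 125 items per minute, or a little over 3,100 events per minute.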
[21:26] alard / SketchCow : The posterous jobs seem to be taking ~ a million years each. Is there some way I can fix that? (I have a metric ton of throughput here and the warrior's using none of it.)
[21:26] Yeah, we're taking that thing apart
[21:27] Should be done by Wednesday, I reckon
[21:27] ussjoin: The problem, inherently, is we are flooding Posterous. We're killing them.
[21:27] Oh? Ah.
[21:27] We have a barely workable straightforward gentlemen's agreement with them to get maximum throughput
[21:27] But it's their side
[21:27] Little we can do.
[21:28] Gotcha. So just let the things run, I take it?
[21:28] Exciting night: Tonight I will attempt to digitize the first pile of floppies
[21:28] Yeah, there's not much else to do there.
[21:29] I stopped my posterous jobs on the theory that fewer people banging on it might improve throughput
[21:30] Are the formspring jobs de-messed up? I could switch back to those.
[21:51] the floppy dumping sounds interesting- what format will they be in, and will there be a way to see what's on them like with the ISOs?
[22:03] dashcloud: IA doesn't have a floppy image browser but there's another one online you can use together with IA, hang on
[22:05] er damn it's down, was at http://peekbot.jamtronix.com:6502/
[22:07] it makes sense to rip floppies to both an image and a zip or tar of the filesystem
[22:16] I will likely rip it to a .dsk image
[22:16] And include scans of the floppy, i.e. images
[22:16] And then maybe down the line, we write a script to do a grab and rip and dump and zip and upload.
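The "dump and zip" step mentioned above might look something like the sketch below. It is a hypothetical helper, not an existing Archive Team script. It assumes the disk has already been ripped to a .dsk file and the label scanned to image files, and it only bundles those files into a zip with a SHA-256 manifest. Ripping, extracting the floppy's filesystem, and uploading are left out, and all file names are made up for illustration.

```python
import hashlib
import zipfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bundle_floppy(image_path, scan_paths, out_zip):
    """Zip a disk image with its scans and a checksum manifest.

    Hypothetical helper: returns the manifest text it wrote.
    """
    files = [Path(image_path)] + [Path(p) for p in scan_paths]
    manifest_lines = [f"{sha256_of(f)}  {f.name}" for f in files]
    manifest = "\n".join(manifest_lines) + "\n"
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for f in files:
            # Store each file under its bare name, without directories.
            archive.write(f, arcname=f.name)
        archive.writestr("SHA256SUMS", manifest)
    return manifest
```

Keeping the raw image untouched inside the bundle fits the idea of a raw set first and curated versions later; the checksums make it possible to tell whether a later copy still matches the original dump.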
[22:17] You're dealing with a person who believes strongly the hardest part is getting it online
[22:17] Everything else is cheetos, red bull and daft punk
[22:18] There, shoved that in my twitter so you people can't steal it and talk at ted
[22:18] Although the people who speak at TED have never had cheetos
[22:19] And they hire servants to drink red bull
[22:19] put that in your twitter and smoke it
[22:19] And they hire daft punk to play at their kid's birthday party
[22:19] I love where Notch couldn't get someone a visa to play at his GDC party so he hires skrillex with no warning
[22:19] I wonder how much that is. $25k?
[22:21] SketchCow: try $1M+
[22:21] Are you using the same source I am
[22:22] that's the festival price
[22:22] $1M to hire a Corey Feldman look-a-like with a broken synthesizer?
[22:22] Festival is not "my GDC party"
[22:22] damn I incremented again
[22:23] "I am working with Insomniac Events for 2013 booking to bring Electric Daisy Carnival to New York. A general ballpark for a DJ like Armin or Tiesto is $100k-$300k for a 2 hour set depending on the date and show time. Skrillex was only paid $15k for EDC in 2010 and slightly more in 2011. You will be much better off booking a DJ like Laidback Luke, Chuckie, Markus Schulz, Avicii, or someone similar. You are looking at $8k-$15k for those DJ's which you c
[22:24] Sounds about right to me. Skrillex might be a Minecraft fan, though? :P
[22:24] skrillex is widely more known in 2013 than Armin or Tiesto ever were
[22:24] Some of these DJs are surprisingly cheap.
[22:24] that's true tho
[22:24] I used to work somewhere that had an event with a set from Hextatic. It was ace.
[22:25] I could see $100k but maybe just $50k for private party
[22:25] There's no money being made off him.
[22:25] Somehow this perhaps isn't for #archiveteam
[22:25] I ask because when I was doing some floppy dumps, 7zip could read some of them, and provide a listing
[22:25] Alright, enough skrillex, back to floppies
[22:25] (We had a choice between Hextatic & Hot Chip, for the same price. We were all music nerds, so Hextatic won out, even though Hot Chip would have been the way bigger draw. It ended up being basically a staff party.)
[22:25] dashcloud: I am ALL for us trying things with this collection after I stick them in /diskdrives
[22:26] sounds good
[22:26] So /diskdrives is like the raw set, then curated versions can come later.
[22:40] wow- you can still buy an automated floppy duplicator (in fact, probably the same model from back when floppies were big)
[22:50] hi
[22:50] hello
[22:54] Automated duplicator with a Kryoflux in it would do me good
[22:54] I don't trust the process
[22:55] yeah :(
[22:55] I don't trust myself to get all these disks done before I'm dead
[22:55] (and I plan on living a while)
[22:55] How many are we talking about?
[22:55] probably a small fraction of your load :)
[22:55] but still > 1000
[22:56] of course the larger barrier is that a great majority are probably just ordinary game/app images that are everywhere
[22:56] and there's really no need to rip the
[22:56] m
[22:56] But...determining that is a manual and tedious process
[22:57] I'm just slamming through things as fast as I can
[22:57] I just ran out of spending money, so i have initiative to clean up this fucking house and set
[22:57] The one buffer and issue switching to a non-profit. I'm making the amount I made when I was 25
[22:58] We just had flooding here...basement took 2-3" but all ccmp is safe
[22:58] SketchCow, after you copy a disk and scan it where does it go? Into the cargo container?
[22:58] Yeah, I am absolutely maxed out and then some on space. My thing now is churning through all 3-ring bound documents for scanning
[22:59] esp the ones where I don't mind trashing the original docs
[22:59] must reclaim shelf space
[22:59] Disks will be donated to a number of archives who want them.
[22:59] They will keep or dispose of them properly.
[22:59] I treat everything leaving here like it's going into a fire.
[23:00] (I do offer to donate most docs that are in the discard pile, fwiw)
[23:01] I have a massive pile of National Geographics that are going somewhere less than good if I can't find a home.
[23:01] That was a mistake, they're in the crate like a tumor.
[23:01] Finding them will be difficult but worth it.
[23:02] NatGeo is found at every library booksale, garage sale, thrift store, etc...
[23:02] They're great mags, tho, especially pre-80s
[23:02] both articles and ads
[23:02] Under world trade center, stuffed in the international space station, piling up in narnia
[23:02] I once found a registration card from the seventies, and got a subscription to NatGeo for $5 per year or something.
[23:02] SketchCow: are you set up for magazine scanning?
[23:03] Didn't national geographic do a multi dvd release of all their issues
[23:03] omf_: If so, that's quite cool
[23:03] hope it has ads also
[23:03] Don't you have an Internet Archive Scan-A-Ma-Jig?
[23:03] Yes
[23:03] I've got a growing pile of Datamation magazine, which has been overlooked in the whole magazine scanning scene
[23:04] and IMO is an important document of a completely separate world of computing
[23:04] (i.e. mainframe and scientific, vs. home/minicomputer/etc
[23:04] )
[23:04] Silent700, check it http://shop.nationalgeographic.com/ngs/category/dvds
[23:04] nice
[23:04] There's a magazine scanning scene?! I mean, preserving information's important, but wouldn't that just fall into archival?
[23:04] Saints alive, the latest segment of the DEFCON Documentary rendered
[23:05] Of course there's a magazine scanning scene. For years and years.
[23:05] I am not set up to scan mags, at least not non-destructively
[23:05] I may see if Bombjack is up for it, tho I believe he cuts them up
[23:05] his scans are top-notch, tho
[23:05] Silent, don't you have some type of scanner? Can't you just fold them over?
[23:05] or would that damage it?
[23:06] eh, if I had the next 10 years to do it daily :)
[23:06] Nathan's coming in going "Wait, what? There's wireless telephones?"
[23:06] so, if you're interested, and have the money, here's the duplicator that probably handled the vast majority of 3.5'' floppies: http://www.awp1.com/ldrprc.html (as mentioned in parts 3 & 4 of this story: http://goughlui.com/?p=3042)
[23:06] * Silent700 mulls a KickStarter...
[23:06] Hey!
[23:06] Tell me more of this horseless carriage
[23:07] Why, it smells worse than a horse!
[23:07] What is the point! Oats, they grow in the ground!
[23:08] it's destroying my upstream, but I don't need it right now - but I'm uploading dozens of new CD-ROMs.
[23:08] speaking of old computer magazines, has anyone successfully extracted the scans from Google Books' mags?
[23:08] PrntScrn
[23:09] Or, Print Screen. Depends on your keyboard
[23:09] They have a lot of the old trade mags. I often end up there when researching some obscure product
[23:09] but mine just says PrtSc! Does it work the same?
[23:09] No. You netbook users' computers will crumble under the heavyweight power of MS Paint.
[23:16] Anyway, I don't think you can. Copyright, you see.
[23:22] no need for print screen, the book images (like anything else in the dom) show up in firefox's Page Info view