#archiveteam 2013-03-23,Sat


Time Nickname Message
03:41 offby1 So, I just asked Jason about this; is there a warrior AMI for EC2?
03:42 offby1 I'm not really interested in killing my home bandwidth, but I'm willing to drop a bit of Amazon's on it :)
03:45 Rexxar offby1: ask in #BurnTheMessenger, they were talking about that earlier
06:48 duggan I've thrown together a spot-instance friendly AMI for Yahoo Messages: ami-2400984d
06:48 duggan just pop your username in for the userdata and Amazon takes care of the rest.
12:32 viseratop hey all--is there a public AMI for Yahoo Messages yet?
12:33 GLaDOS Yeah, there is.
12:33 GLaDOS http://scr.glados.me/1364040894.png
12:33 GLaDOS Wait no
12:33 GLaDOS ami-2400984d
12:34 GLaDOS Put your username (alphanumeric, no spaces) in the userdata field
12:34 viseratop GLaDOS: great, just heard the sirens, will fire 'er up.
12:34 viseratop perfect!
12:35 GLaDOS See that number on the tracker of "total items"?
12:35 GLaDOS That's still increasing
12:36 GLaDOS We're discovering items while scraping as well.
12:36 viseratop yikes, I figured as much
12:36 GLaDOS So, we're going to need all we can get.
12:36 viseratop cool, I'm definitely down to throw some $$$ on the fire
12:36 viseratop it's an insane cut-off date and an important public record, so ridiculous that they'd pull the plug so fast, but not surprising
12:37 viseratop (on top of rate limiting, grr)
12:37 * GLaDOS pushes viseratop into #BurnTheMessenger
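[Editor's sketch, not part of the log] The userdata rule GLaDOS gives above (alphanumeric, no spaces) can be checked before launching an instance. The helper names below are hypothetical, and the AMI's own startup script is not shown in the log, so this only covers validation and the base64 encoding the raw EC2 API expects for user data (the console and most SDKs do that encoding for you).

```python
import base64
import re


def is_valid_username(name: str) -> bool:
    """Return True if name is non-empty and contains only ASCII letters and digits."""
    return re.fullmatch(r"[A-Za-z0-9]+", name) is not None


def encode_user_data(name: str) -> str:
    """Base64-encode a validated username for use as raw EC2 user data.

    Hypothetical helper: the log only says to put the username in the
    userdata field; it does not describe any further format.
    """
    if not is_valid_username(name):
        raise ValueError("username must be alphanumeric with no spaces")
    return base64.b64encode(name.encode("ascii")).decode("ascii")


if __name__ == "__main__":
    for candidate in ("viseratop", "bad name", ""):
        print(repr(candidate), is_valid_username(candidate))
```

Rejecting bad names locally is cheaper than discovering the problem after a spot request has been fulfilled.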
13:18 brayden oh yay.. oh fantastic. Google getting rid of Google Reader.
13:26 sylvar Howdy, if Yahoo is rate-limiting all 6 of my instances, should I shut down and come back with just 1 or is it going to be more efficient to have some other number trying?
13:27 sylvar ...I'll ask in #BurnTheMessenger
14:11 BoomBox Hi, i was hoping to use Archive Team for Posterous
14:11 BoomBox and the web panel says to ask in here prior to running, so that's what i'm doing..
14:13 GLaDOS BoomBox: If you're fine with the possibility of being temporarily banned from it, go ahead.
14:13 GLaDOS Otherwise, stahp.
14:14 BoomBox GLaDOS, assuming you're not a bot attempting to test for the rest of eternity, how long would the temp ban last for?
14:15 GLaDOS BoomBox: The ban time varies. It used to be over a week. Shortened to about 4 hours when we started archiving.
14:15 GLaDOS So, let's say about a day possibly.
14:15 BoomBox aah
14:16 BoomBox completely fine with the possibility then
14:16 GLaDOS Well then, go ahead!
14:16 GLaDOS Grab the fucker.
14:16 BoomBox :D
14:38 alard Shall I remove that warning? (Or at least make it less threatening?) There's no such warning for Yahoo Messages.
14:39 alard Never mind, I've removed it.
14:42 alard http://warriorhq.archiveteam.org/
14:46 alard We still need one in Africa.
14:46 BoomBox alard, could potentially use "You have the risk of possibly being temporarily banned from it, feel free to venture. If you have any questions, ...."
14:48 alard BoomBox: Yes. Although you shouldn't get banned by Posterous at the moment. The warriors are contacting a special Posterous server that doesn't ban.
14:58 ersi Whoa, almost 300 warriors in the wild
15:02 duggan expand the leaderboards, add "teams" (the whole folding@home vibe), and there could be 100,000.
15:28 offby1 Yeah, I'd say that the dealmaker for me is the EC2 AMI
15:29 offby1 I tried to get the generic AMI that GLaDOS suggested last night working, but I could never connect to it to see if it was working, and it seems to be gone now.
15:29 offby1 But if there's a multi-project capable AMI out there, I'd be interested.
15:29 offby1 I think I could probably make one.
15:29 offby1 but… time and all that shit :)
15:29 GLaDOS There's an AMI in N.Virginia
15:29 offby1 Yep
15:29 offby1 but it's yahoo-only
15:30 offby1 I'm spot requesting it right now
15:30 GLaDOS Ah
15:30 GLaDOS There should be an AMI called i386-debian-squeeze-warrior or something
15:30 offby1 I tried that briefly last night
15:31 offby1 this morning, I could not find it again, even by ID
15:31 GLaDOS ..huh
15:31 offby1 At any rate, this seems like a great workload for EC2 :)
15:32 GLaDOS I've got 100 spot requests and 20 instances running the AMI
15:32 offby1 Weird.
15:32 GLaDOS STRAIGHT TO THE MOON
15:32 GLaDOS Also, straight to the #BurnTheMessenger before Cameron_D sees us
15:45 gevmage Newb question: can I run the Archive Warrior in a virtualbox on an already virtual machine? (I have a virtual web server on vps.net with space)
15:46 GLaDOS You can, I believe.
15:46 GLaDOS Just need to forward the ports
15:48 gevmage You mean ssh-forward the ports? I don't want to run the warrior on my local machine; I want to run it on the virtual server and store the data there.
15:49 gevmage (I'll ALSO run it locally, but I have less bandwidth and space than it does)
15:49 GLaDOS As in, have the virtual appliance forward the internal ports.
15:51 gevmage Hmm...ok, I'll try to set it up and figure it out when I get to that point.
16:29 tcv_ I'm trying to archive a WordPress website with the gravatar stuff installed. My wget commands are making the .HTML files balloon to a dozen or more megabytes each. Anyone have any suggestions on how to skip the dynamic gravatar stuff?
16:32 Schbirid i downloaded the warrior. virtualbox. in the browser i looked at the projects and first clicked on posterous and then yahoo. now the "available projects" page tells me posterous is running, yahoo is not. "current project" page however shows yahoo running
16:33 Schbirid being able to limit the upload rate would be nice
16:33 soultcer Schbirid: Because it will first finish the current tasks before switching to the new project
16:34 gevmage @GlaDOS, when you were talking about forwarding ports, were you talking about setting up port 8001 back to my local machine so that I can configure it?
16:35 GLaDOS gevmage: Yeah
16:35 GLaDOS Wait, what
16:35 GLaDOS I mean, allowing HTTP access into the VM over port 8001
16:35 GLaDOS What virtualization software are you using for the warrior, virtualbox?
16:35 gevmage Yes.
16:35 gevmage Yes, virtualbox.
16:36 gevmage Is that in "settings" before I launch the appliance?
16:36 GLaDOS Wait, are you using it on windows?
16:36 GLaDOS If so, it automatically does it
16:36 gevmage I don't do windows.
16:37 GLaDOS Phew
16:37 gevmage local machine is Ubuntu
16:37 gevmage outer virtual web server is Debian of some variety.
16:37 GLaDOS But yeah, I'm tired beyond the point of helping
16:37 * GLaDOS hurls soultcer at gevmage
16:38 soultcer If you imported the Warrior OVA it automatically sets up the port forwarding
16:40 gevmage Ok. So far my status is ova downloaded, unpacked. package-manipulation done so that virtualbox doesn't complain about not having a drive. When I ran it it seemed to stop at 20%.
16:40 gevmage But reading through the instructions, I have to get to port 8001 on the remote machine.
16:40 gevmage So I'll have to ssh forward that port, I guess.
16:42 gevmage Ah, and now I see it's complaining about not having enough RAM. I'll sort that out.
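[Editor's sketch, not part of the log] The SSH forward gevmage mentions would look roughly like the command built below. It assumes the imported OVA already forwards port 8001 from the VirtualBox guest to the remote host's localhost, as soultcer describes; the host name and user are placeholders, and the function name is hypothetical. `-L` sets up a local forward and `-N` tells ssh not to run a remote command.

```python
import shlex


def ssh_forward_command(remote_host, user, local_port=8001, remote_port=8001):
    """Build an ssh argument list that forwards local_port to remote_port.

    The forward targets "localhost" on the remote side, i.e. the VPS itself,
    where VirtualBox is assumed to expose the warrior's web interface.
    """
    forward_spec = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward_spec, f"{user}@{remote_host}"]


if __name__ == "__main__":
    # Placeholder host and user; substitute your own.
    command = ssh_forward_command("vps.example.net", "user")
    print(shlex.join(command))
```

With the tunnel running, the warrior's web interface would be reachable at http://localhost:8001/ on the local machine, provided the VM itself has actually booted.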
17:13 bowman__ hi all
17:13 bowman__ is there any way to run the software without virtualbox, e.g. on a vserver?
17:16 duggan heya, bowman__ - yes, if you just want to run the yahoo job you can use the instructions here (for linuxes)
17:16 duggan http://pastebin.com/dJYURk6m
17:16 bowman__ duggan: cool, I'll take a look
17:24 gevmage Ok, another newb question. Do I have to run virtualbox as root?
17:25 Schbirid not unless you want to use a low port number
17:26 gevmage Oh. Hm. Twice when I've tried to run it, I got: Message from syslogd@openclasses at Mar 23 17:22:38 ... kernel:[901697.913991] general protection fault: 0000 [#1] SMP
17:27 gevmage How long does it typically take to start up?
17:30 bowman__ on my box, it only takes a couple of seconds
17:58 gevmage Oh, well then it's borked.
17:58 gevmage I'm trying to run it virtual machine within virtual machine.
17:58 gevmage It seems to keep running (in top) but it stops making progress that I can see.
17:58 gevmage That's why I was trying to run the other instructions.
18:00 chazchaz Is it possible to use run-pipeline to run two different projects on the same box?
18:00 chazchaz I'm getting a socket.error: [Errno 98] Address already in use
18:00 chazchaz error
18:01 chazchaz http://paste.ee/p/UAYQk
18:02 soultcer chazchaz: run-pipeline starts a webserver, but you can either disable it with --disable-web-server or change the port with --port
18:03 chazchaz ahh. didn't know that. What does the web server show?
18:04 soultcer Logs from the process
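[Editor's sketch, not part of the log] Errno 98 (EADDRINUSE on Linux) means a second process tried to bind a port that another process already holds, which is what happens when two pipelines both start their web server on the same port. The helpers below are hypothetical and only illustrate checking for a free port before choosing a value for the `--port` option soultcer mentions; the starting port of 8001 is an assumption.

```python
import errno
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # some other problem, e.g. permission denied on low ports
        return True


def find_free_port(start=8001, attempts=50, host="127.0.0.1"):
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    raise RuntimeError("no free port found in the requested range")


if __name__ == "__main__":
    print("first free port at or above 8001:", find_free_port())
```

The check is inherently racy (another process can grab the port between the probe and the real bind), so it is a convenience rather than a guarantee.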
20:11 neurophyr http://www.fas.org/blog/secrecy/2013/03/ntrs_dark.html
20:11 neurophyr it's not just private entities that are unreliable stewards of digital history.
20:12 neurophyr um. or the people complaining about it. google cache of the above: http://v.gd/UQW3Bh
20:13 neurophyr "In other words, all NASA technical documents, no matter how voluminous and valuable they are, should cease to be publicly available in order to prevent the continued disclosure of any restricted documents, no matter how limited or insignificant they may be."
20:16 DFJustin https://archive.org/details/nasa_techdocs
20:18 bowman__ it's crazy how the Yahoo leaderboard is going nuts by now :)
20:19 neurophyr DFJustin: that's good.
20:37 wp494 I thought the thing was going to be a lost cause before ycombinator had that link to a blog posting
21:30 opsec The photos/info @ http://www.skyscrapercity.com might be worth archiving. The site isn't closing anytime soon but all the photos are hotlinked and more of them disappear each day. I've personally been keeping a mirror of just the Philippines sections with all photos, I can't store it all though (the rest of site)
21:30 opsec I suspect they are exactly the sort of photos that'll interest people years from now for history... and nicely categorized in forum threads
