[00:01] Archive Team: Reduce (encourage sites to not destroy data), Reuse (save data), Recycle (make floppy disk dresses).
[00:12] Don't Recycle (keep that data around, don't put other things on the bits)
[00:13] SketchCow: Batcave is down
[00:13] Well, not accessible
[00:13] Possibly related to the network fuckery going on in rwc
[00:26] I think fuckery is one of my favorite words
[00:29] Me too :D
[00:30] gholy fucknuggets
[00:30] http://www.proust.com/story/polarcubby
[00:30] "1065 entries in 126 chapters"
[00:41] the corresponding warc.gz is 48 megs and counting, which in comparison to some things like Splinder journals isn't that huge, but compared to the rest of Proust is farther to the right than Ron Paul
[00:43] er wait, no it isn't
[00:43] that's what I get for not waiting until sort -rn finished
[00:43] silly willy
[00:43] http://www.proust.com/story/jesse is the winner so far at 72 megs
[01:46] yipdw: good news, i finally got your script to run, and it works, now i just need to automate it
[02:02] bsmith094: xargs, write a script, many other ways
[02:02] also I don't recommend using that revision
[02:02] because it doesn't work
[02:03] i know ive got it, thanks so much
[02:03] re: broken images and not handling gzipped CSS
[02:03] feel free to fix it, though
[02:03] we can grep the raw html later
[02:03] no
[02:03] fix it before you start
[02:03] there's no rush
[02:03] it's easier to fix before fetch, and fanfiction.net isn't showing death throes
[02:04] wget -k should do it
[02:05] uh
[02:05] -k is not relevant
[02:05] in a WARC situation, anyway
[02:05] the problem is not unresolvable URLs; it's files that are actually missing and wget not decompressing gzipped CSS
[02:06] the latter is fanfiction.net's fault, really
[02:06] they're sending gzipped CSS to a client that doesn't state it can handle gzip transfer-encoding
[02:06] but whatever, the fetcher needs to handle that
[02:29] Moves magazine is coming down - apparently they are still selling it.
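A minimal sketch of the workaround discussed at [02:05]-[02:06], assuming the fetcher already holds the raw response body: check for the gzip magic bytes and decompress only when they are present, whatever the response headers claim. The name `maybe_gunzip` is hypothetical and is not part of wget or of the script mentioned above.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def maybe_gunzip(body: bytes) -> bytes:
    """Return body decompressed if it is valid gzip data, else unchanged."""
    if not body.startswith(GZIP_MAGIC):
        return body
    try:
        return gzip.decompress(body)
    except (OSError, EOFError, zlib.error):
        # Started with the magic bytes but is not a usable gzip stream;
        # hand back the raw bytes instead of failing the whole fetch.
        return body


if __name__ == "__main__":
    css = b"body { color: #333; }"
    print(maybe_gunzip(gzip.compress(css)) == css)  # compressed input
    print(maybe_gunzip(css) == css)                 # plain input
```

In a WARC workflow the response would presumably still be recorded exactly as received; a helper like this would only matter for the copy that gets parsed for further URLs.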
[02:39] bsmith094: actually, I want to approach the whole thing differently
[02:40] k, how?
[02:40] there is an assload of duplication that will be done with the current approach
[02:40] and it's an assload that should not occur because WARCs can be unioned very easily
[02:40] I want to retrieve all story IDs, all member profiles, and get those concurrently
[02:41] but making dinner and beating myself up come first, so bbl
[02:41] i was also thinking, put the reviews in a reviews folder in the same level as the warc
[02:41] You'll go blind!
[02:41] Oh, UP
[02:41] reviews are specific to a story, so they can go inside the story's WARC
[02:42] SketchCow: yeah, Convict Conditioning is awesome
[02:42] that works too
[02:42] the one-arm push-ups and pull-ups are very asshole though, as the difficulty curve for those increases so much
[03:54] SketchCow: have you heard anything about the state of batcave?
[03:55] I've heard it's down.
[03:55] well yeah, me too, but it interests me to know why
[04:48] chronomex: re: recycling - I was working under the assumption that the disks being used were corrupted beyond use, and multiple intact copies were available elsewhere. I imagine it's like finding badly scratched AOL CDs.
[05:30] hahaha overdue notice
[05:36] ok, so I got WARC ingestion via rsync working on wayback.at.ninjawedding.org
[05:36] let me know if you'd like a module to upload shit to
[05:36] the thing's running on a 64 GB EBS volume right now, so should be enough space to play with
[05:45] Coderjoe: is your redis instance for ff.net stuff still up? I need to reslave
[06:22] SketchCow: Oops
[06:22] That's cause I activated the service before the payment was complete
[06:22] And the "check for unpaid shit" cronjob runs at midnight
[06:33] yes
[06:33] or should be
[06:34] yes
[06:37] can you give me host/port info again?
[06:48] Coderjoe: ping
[06:48] hrm.. sec.
[06:48] thanks
[06:50] solitude.wegetsignal.org standard redis port
[06:51] cool
[06:51] SYNC sent
[06:51] any traffic?
[06:52] oh there we go
[06:52] holy fuck that is a lot of bytes
[06:52] [20719] 10 Jan 06:52:46 * DB saved on disk
[06:52] [26673] 10 Jan 06:52:26 * Background saving started by pid 20719
[06:52] [26673] 10 Jan 06:52:26 * Slave ask for synchronization
[06:52] [26673] 10 Jan 06:52:26 * Starting BGSAVE for SYNC
[06:52] [26673] 10 Jan 06:52:46 * Background saving terminated with success
[06:52] I forgot how much shit was in that db
[08:26] hey Bat Out Of Hell
[09:14] Gamn, archive.org's WBM has been flaky lately...
[09:18] urlte.am is now hosted on a new, doesn't-get-raped service, and provider.
[09:19] The WHOIS will reflect the new nameserver, and then update everything properly, over the next couple days.
[09:19] Next, I'll be working on archiveteam.org.
[09:19] My goal is just to blow out the DB and put it into the new item.
[09:19] We might have duped work for a tad.
[09:20] If you can't wait: http://199.48.254.80/~urlteam/
[09:23] Rawesome!
[09:25] Was/is everything hosted at DH? I mean things like textfiles and other file mirrors...
[09:25] What you really ought to do is get fiber to your house and host stuff there, maybe
[09:26] Not everything hosted at dreamhost, no. I have many things.
[09:26] And no, this house should not be a place to host things.
[09:26] Power, cost, it makes no sense
[09:30] SketchCow: could you use a mirror for load balancing?
[09:31] https://github.com/ArchiveTeam/fujoshi
[09:31] most offensive ArchiveTeam repo name ever
[09:32] D:
[09:32] in novi, mi? i just drove past there twice this weekend
[09:33] On that note, I bet most Justin Bieber fans are actually ugly females in the ages 35-55
[09:35] interesting website...
[09:35] vpsdatacenter.com
[09:35] aaaactually
[09:35] I think I'll pull that repo for now
[09:35] the hoard and stalk scripts aren't done at all
[09:36] that the ff.net project?
[09:36] yeah, I decided to rewrite it as a set of queues
[09:36] easiest way I saw to avoid duplicate retrieval of profile data
[09:37] [vpsdatacenter.com] Haha! And no way to sign up! I wonder if the joke is that nobody would buy VPS services from a company like that anyway
[09:38] i don't think they sell directly to end users
[09:38] or don't sell under that name to end users
[09:39] i got the info from ARIN
[09:39] wow, view source on that page
[09:39] it's awesome
[09:39]
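A rough sketch of the queue idea from [09:36], combined with the plan at [02:40] to fetch stories and profiles concurrently. Everything named here is an assumption for illustration: `crawl`, the `fetch` callback and the `story:`/`profile:` item names are hypothetical and not taken from the fujoshi repo, and an in-memory set stands in for whatever shared store the real scripts use.

```python
import queue
import threading


def crawl(seeds, fetch, workers=4):
    """Process every reachable item exactly once.

    fetch(item) is a caller-supplied function that retrieves one item and
    returns an iterable of further items it discovered.
    """
    todo = queue.Queue()
    seen = set()
    lock = threading.Lock()

    def enqueue(item):
        # Only the first sighting of an item is queued, so a profile that is
        # linked from many stories is still retrieved a single time.
        with lock:
            if item in seen:
                return
            seen.add(item)
        todo.put(item)

    def worker():
        while True:
            item = todo.get()
            try:
                for found in fetch(item):
                    enqueue(found)
            except Exception:
                pass  # a real crawler would log the failure and retry
            finally:
                todo.task_done()

    for seed in seeds:
        enqueue(seed)
    for _ in range(workers):
        threading.Thread(target=worker, daemon=True).start()
    todo.join()  # returns once every queued item has been processed
    return seen
```

Because newly found items are queued before `task_done()` is called for the item that produced them, `todo.join()` cannot return while work is still being discovered.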
[09:39] meta before html
[09:39] no
[09:40] Yep
[09:40] must have been a copy-from-Word job
[09:41] It's a good thing the page has a correct (although misplaced) content-encoding declaration, or else the Unicode bullet points would look bad
[09:41] nitro2k01: 35-55 are the twilight 'cougars'
[09:41] bieber fans actually are young girls
[09:42] citation: youtube
[09:42] No, the Bieber fans that show it openly are young girls
[09:42] That's my theory anyway
[09:42] (I hope you can hint a tiny bit of sarcasm in my comment)
[09:42] oh. well, the fans of twilight don't keep it a secret
[09:42] I was bitten by Robert Pattinson once
[09:43] er
[09:43] i'm too tired for subtlety :P
[09:43] I meant time for bed
[09:43] what a typo
[09:43] we all make robert pattinsons
[09:43] mistakes*
[09:43] Curse you, auto correct
[09:43] "I want to fuck you in the ass violently until you pass out"
[09:44] "Eh"
[09:44] "I mean hi mom"
[09:44] "Damn autocorrect"
[09:44] Robert Pattinson bites can be quite nasty
[09:46] who did the Special Robert Pattinson Effects, Robert Pattinson Costumes, and training a Robert Pattinson to mix concrete and sign complicated forms?
[10:47] http://i.imgur.com/RFtnt.png
[11:16] https://www.lytro.com/living-pictures/2328
[11:16] That still blows me away.
[11:47] --
[11:47] http://www.eatliver.com/img/2011/8132.jpg
[11:47] Official Archiveteam Helmet
[12:27] SketchCow: now all we need to do is order a tonne of bacon, and set some blueprints up.
[12:45] http://www.archive.org/details/nbii.gov-20111215-210958 (They grabbed it.)
[13:40] At 0:51 I was eating bread.
[13:40] At 0:52 I was eating toast.
[13:54] It might be healthy to change your room temperature.
[13:55] Did you accidentally put your slice of bread on a P4 heatsink?
[14:04] I'm always accidentally putting bread on heatsinks
[14:05] What's the difference between a P4 and a P4 mobile?
[14:05] A P4 mobile only consumes energy like a *small* city
[14:06] But 90W TDP is so much better than 120W!
[14:06] yeah I remember the early 2000's also
[14:06] if you want to be current, talk about AMD FX lineup
[14:06] or how to double tdp and lower performance from your previous generation
[14:07] http://7.mshcdn.com/wp-content/uploads/2011/10/facebook-failures.png
[14:11] Wow, that's a very colourful montage of failed things.
[14:15] please disregard it... put all your trust into facebook
[14:15] facebook has your best interests in mind
[14:21] http://i.imgur.com/dUGuY.jpg
[14:22] haha
[14:25] SketchCow: jamendo news "We are currently reprocessing the music to improve the audio quality in streams/downloads as well as the tagging."
[14:25] Huzzah
[14:25] :D
[14:25] that might mean terrible things since it is jamendo after all
[14:26] http://26.media.tumblr.com/tumblr_lpomggEKAJ1qz5tgbo1_500.jpg
[14:26] i dont care about nepal and neither should anyone not in nepal
[14:26] I'm hitting a big image set
[15:08] Wait, does Jamendo have lossless versions of everything stored or something?
[15:10] yes
[15:11] Wow.
[15:30] http://failblog.files.wordpress.com/2011/07/epic-fail-photos-oddly-specific-meanwhile-in-america.jpg
[16:10] http://lh4.ggpht.com/_hVOW2U7K4-M/S72QkekpK1I/AAAAAAABR8g/hQiUNTJq5zI/s720/ewer5y.jpg
[16:11] I would love to find one of those
[16:12] Hmm, I wonder if it was designed to be used while driving
[16:13] yeah
[16:13] it was.
[16:14] I think there were several models/brands of those made
[16:14] Now, these were not your standard "lift the arm and place it on a record" type players. Instead these were specially designed units that did their best to keep playing music even when bumps in life came along. But no doubt fiddling with a record in a moving car did little to keep it pristine.
[16:14] It might be good to note that the car record player did not play a standard 45 rpm record. While the 7 inch record looked like a 45, it was actually designed to spin around at 16 and two/thirds rpm, much slower than 45.
They also used a new format (ultra-microgroove), which allowed for more grooves per inch.
[16:18] ULTRA
[16:22] ahh, 16. I had turntables that could do it.
[17:35] http://www.youtube.com/watch?v=mNLKgHCiGCk
[17:38] SketchCow: the lighting looks awesome
[17:39] at least for that second interview with the blonde girl :)
[17:39] Yeah, her exposure worked out well.
[17:39] I've since found better settings for the camera.
[17:39] This is easily the worst and hardest environment I will ever shoot in.
[17:39] I've included good and bad exposures, to help with the camera.
[17:40] yeah lighting is often the most important issue some documentary makers overlook.
[17:40] This is a very good light near them.
[17:40] I just had exposure on auto, not manual
[17:40] I didn't realize that because it was letting me manipulate settings
[17:40] But it wasn't manipulating the good ones!
[17:41] The first two interviewees run an arcade, and I'm going to visit it this month!
[17:41] I have a Canon EOS 550D (think it's known as rebel 2 or something in the US) and with proper lighting/lens it produces amazing results.
[17:42] Yeah, this is a 5D MKII
[17:42] I suppose that is a pro-model? :)
[17:44] yeah, it is
[17:45] * yipdw <3 5D mkII
[17:45] blonde girl? /me watches
[17:45] although it has a really stupid quirk in which it will not trigger a PocketWizard on the hotshoe while in Live View
[17:45] how long per card do you get?
[17:45] but if you attach a Canon flash to the hotshoe, it's fine
[17:45] I don't know what that means
[17:46] I'm using a 64gb card.
[17:46] SketchCow: are you using that moire filter you linked a while back?
[17:47] Roughly an hour.
[17:47] At the moment? No
[17:48] the lighting on the bearded dude with the Star Wars t-shirt looks really nice too.
[17:50] SketchCow: what lens were you using for this test?
[17:50] 35mm
[17:50] L Series
[17:53] I have a 50mm f/1.8 II which is the only one I've used when shooting video.
It produces respectable video even in low light conditions but time and money keep me from experimenting further at the moment.
[17:53] Got one of those too.
[17:53] I have a lot of lenses.
[17:56] I got my camera mostly to document my daughter but I have been thinking of dusting off some old scripts and shooting 'em with friends Peter Jackson-80s-style ;)
[17:56] Best of luck with your equipment, interviews and whatnot
[17:57] Thank
[17:57] s
[17:59] SketchCow: BTW, I take it you have seen the comment on pouet.net regarding datastorm ;)
[18:03] Yes
[18:03] Yes, it's so close.
[18:03] Only 639 miles/1010 km
[18:04] 13 hours including a ferry ride
[18:04] I'll just jaunt over
[18:04] hehe
[18:04] You know what people care about?
[18:04] Themselves.
[18:05] very true
[18:07] that comment on pouet got you worked up huh?
[18:07] What? No.
[18:07] Dude, when I get worked up, people die
[18:07] Dreams die.
[18:07] Dreams people haven't HAD yet die
[18:07] data on magnetic media die because of the huge EMP you emit :)
[18:08] that might also explain his bread/toast situation from this morning
[18:08] That was from a youtube entry
[18:08] Dubstep
[18:08] 0:52 is when the dirty drop kicks in
[18:11] What is it about dubstep videos and witty comments? Perhaps people find them funny when they're high...
[18:23] that VIDEO HAs a very much VARYING AUDIO LEVel
[18:23] but got me excited about the docu
[19:03] SketchCow: Back to the test footage; I notice some of the interviews are slightly out of focus. Do you only use the camera's own screen or do you have an external (larger) screen?
[19:11] I use the camera's own screen
[19:11] that said, there's a zoom feature I did not use
[19:11] I got better at the focus after that
[19:12] Also, sometimes I'd set up and the person would move around so my focus stopped being right
[19:12] Like the girl who leans forward
[19:32] Yeah I can imagine that it is extra hard keeping focus in an environment like that.
Most of the shots were crystal clear though, looks promising indeed.
[19:33] and if you got most shots right in an environment like that, the sky is the limit.
[19:39] That's the idea
[19:39] Go "huh, this car seems to drive pretty well" OFF-ROAD INTO SWAMP
[19:41] is that car a Bugatti Veyron?
[19:41] also, proust data hitting batcave
[20:05] whee
[20:05] good to see so few of those "I'm leaving GoDaddy over SOPA" threats actually going through
[20:10] Coderjoe: It's really futurehosting.com
[20:11] VPS Datacenter, LLC is their whitelabel brand
[20:15] why is that good, yipdw?
[20:16] besides the sopa thing, godaddy is terrible
[20:16] so any reason that pushes people to ditch them is great.
[20:16] I yanked them all out
[20:18] dnova: because it confirms my observation that most people are unwilling to do as they say
[20:19] SketchCow: yeah, I noticed, and that's awesome
[20:19] yipdw: oh :(
[20:19] kudos for followthrough
[20:20] but I'm just following up on the redditors and major sites that said they'd pull out, and so far I'm finding that a lot...haven't
[20:20] that said, I understand that sites like StackExchange and Wikipedia are going to need more time to migrate
[20:20] (if they are)
[20:21] and I mean, it's just registrar switching; it's not like committing to five years of hard labor in the Congo
[20:21] wikipedia didn't bail?
[20:21] they haven't yet
[20:21] :(
[20:22] I'm looking for a status update
[20:24] well, nothing so far, but I'll give it another month