#archiveteam 2011-09-05,Mon

Time Nickname Message
00:00 🔗 Coderjoe is that part of derive now (with a backlog for existing items), or is it generated on request from a user?
00:02 🔗 underscor Both
00:02 🔗 underscor user requests take priority though
00:02 🔗 underscor Normally the workers can keep up fine
00:02 🔗 underscor the problem was that they all choked on a bad item
00:03 🔗 underscor (so the queue ballooned)
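
The scheduling underscor describes (user requests served ahead of the backlog) can be sketched as a small priority queue. This is only an illustration, not the actual archive.org derive code; the `DeriveQueue` class, the priority constants, and the item names are all hypothetical.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is served first.
USER_REQUEST = 0
BACKLOG = 1


class DeriveQueue:
    """Toy queue where user requests jump ahead of backlog items."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so items of equal priority stay in FIFO order.
        self._counter = itertools.count()

    def push(self, item, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), item))

    def pop(self):
        _priority, _seq, item = heapq.heappop(self._heap)
        return item

    def __len__(self):
        return len(self._heap)


queue = DeriveQueue()
queue.push("backlog-item-1", BACKLOG)
queue.push("backlog-item-2", BACKLOG)
queue.push("user-item-1", USER_REQUEST)

# Drain the queue; the user request comes out first despite arriving last.
order = [queue.pop() for _ in range(len(queue))]
```

A queue like this does nothing about the failure mode mentioned above: if every worker chokes on the same bad item, the queue still balloons regardless of ordering.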
00:11 🔗 SketchCow I alias rm to rm *
00:11 🔗 SketchCow I am that fucking hardcore
00:11 🔗 chronomex don't you mean rm -fr * ?
00:11 🔗 SketchCow No no
00:11 🔗 SketchCow Let's not be crazy
00:15 🔗 SketchCow Man, 26Mb/sec and it will STILL take 4 hours.
00:30 🔗 Wyatt Wow, I never really thought to look at different rm implementations.
00:30 🔗 Wyatt Times like these I love unix so, so very much; this is fascinating stuff.
01:42 🔗 SketchCow Blasting the vk collection into archive.org.
05:05 🔗 Coderjoe goddamn enforced lack of backups
05:06 🔗 Coderjoe uverse dvr crashed. upon coming back up (after a 5-10 minute wait), all recorded programs are gone
05:06 🔗 db48x ouch
05:06 🔗 underscor Gotta love uverse
05:07 🔗 underscor Especially those delicious 2wire routers
05:07 🔗 Coderjoe when it crashed it was just sitting there with a red X and two dots on the screen. really useful error message...
05:45 🔗 SketchCow The derives are happening again!
05:45 🔗 SketchCow Wooooooo
05:46 🔗 SketchCow Not that I won't fix that when I throw another 500 magazines into the hopper
05:46 🔗 SketchCow But that'll be tomorrow
05:46 🔗 SketchCow Also, Star Wars Forums, I need to find out if people have copies of that
05:46 🔗 SketchCow Because I think the disk may have died
05:47 🔗 chronomex alas, alack
05:47 🔗 db48x that was timely
05:47 🔗 SketchCow Yeah, bummer
05:47 🔗 SketchCow Well, depends who uploaded, and if they still have what they had
06:02 🔗 db48x whos hard drive died?
06:02 🔗 * db48x buys a vowel
06:07 🔗 db48x do we know if he couldn't afford a second hard drive, or if he just thought it couldn't happen to him?
07:05 🔗 Coderjoe man... this DigiNotar story just keeps getting better...
07:05 🔗 Coderjoe http://www.computerworld.com/s/article/9219727/Hackers_steal_SSL_certificates_for_CIA_MI6_Mossad
07:07 🔗 db48x yea
07:07 🔗 db48x although that headline is just silly
07:07 🔗 db48x they stole ssl certs with the domain name set to the public-facing webpages of the CIA, MI6 and Mossad, plus five or six hundred other groups
07:10 🔗 chronomex they also got one issued to *.*.com
07:10 🔗 chronomex if that doesn't scare you, I don't know what would
07:12 🔗 db48x it doesn't scare me
07:12 🔗 db48x no client is supposed to honor such a certificate
07:12 🔗 chronomex hmm, ok
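
db48x's point can be illustrated with a simplified hostname check. RFC 6125 advises clients not to match a wildcard that appears anywhere other than the left-most label, and a wildcard is generally taken to cover exactly one label. The function below is a sketch under those assumptions, not the logic of any particular browser or TLS library; `wildcard_matches` is a made-up name.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Simplified wildcard check: '*' is only accepted as the whole left-most label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")

    # Reject any pattern with a wildcard outside the left-most label (e.g. *.*.com).
    if any("*" in label for label in p_labels[1:]):
        return False

    # A wildcard stands in for exactly one label, so the label counts must agree.
    if len(p_labels) != len(h_labels):
        return False

    if p_labels[0] == "*":
        return bool(h_labels[0]) and p_labels[1:] == h_labels[1:]

    # No wildcard support beyond that; partial labels like "f*o" are compared literally.
    return p_labels == h_labels
```

Under these rules `wildcard_matches("*.*.com", "www.example.com")` is false, while `wildcard_matches("*.example.com", "www.example.com")` is true.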
07:15 🔗 Coderjoe how about the fact that diginotar was hacked as far back as 2009?
07:15 🔗 Coderjoe http://www.f-secure.com/weblog/archives/00002228.html
07:15 🔗 chronomex that, too, is ridiculous
07:18 🔗 ersi Haha
07:18 🔗 ersi bunch of fucktards
07:19 🔗 chronomex yeah, here's a list: *.*.com and *.*.org are on the list regardless of whether clients care about them http://isc.sans.org/diary/DigiNotar+breach+-+the+story+so+far/11500
07:47 🔗 db48x they were apparently rather lax
07:48 🔗 db48x not sure if all of that "evidence" of various hacks is all legit
07:48 🔗 db48x it'd be too easy to plant
07:49 🔗 db48x the real problem with a CA is that there's no way for them to know for sure how many certificates were issued
07:49 🔗 db48x all someone has to do is steal the private key for their root, and they never have to visit the system again
07:50 🔗 Coderjoe true
07:50 🔗 ersi revoke everything!
07:50 🔗 db48x honestly it looks more like someone wanted them ruined
07:50 🔗 chronomex godaddy next, please
07:51 🔗 db48x heh
07:51 🔗 Coderjoe and makes it harder to revoke, due to the likelihood of duplicate serial numbers
07:51 🔗 db48x impossible to revoke, since they won't know the serial numbers at all
07:51 🔗 Coderjoe that is, to silently revoke
07:52 🔗 ersi revoke Diginotars trust
07:52 🔗 ersi remove rather
07:52 🔗 db48x that's already happened
07:52 🔗 ersi Yeah, I know.
07:52 🔗 Coderjoe no, they won't, until the fake certs show up and are noticed
07:52 🔗 db48x but I think that was the attacker's goal all along
07:52 🔗 db48x to destroy the business rather than to gain fake keys
07:53 🔗 db48x an alternative is that they only wanted to destroy the business once the hack was discovered
07:54 🔗 Coderjoe apparently the group of companies was also involved in public keys for electronic signatures
07:54 🔗 ersi ouch
07:55 🔗 db48x heh
07:55 🔗 chronomex yes. another arm of the business owned the intermediate certificates that were eventually connected to the data in all dutch passports.
07:55 🔗 Coderjoe for the dutch government
07:55 🔗 chronomex s/past tense/present tense/g
07:55 🔗 db48x awesome
07:56 🔗 chronomex the dutch government is currently being very proactive about taking over all these issuing certificates.
07:56 🔗 db48x another suspect emerges
07:56 🔗 Coderjoe fake dutch passports for.. DUNDUNDUNNNN terrorists
12:15 🔗 db48x only about 280 GB of Friendster to upload
12:15 🔗 ersi Enjoy~
12:16 🔗 db48x I really miss having a symmetric connection
14:59 🔗 Aranje Coderjoe: You mean the CIA >_> :D
15:00 🔗 Aranje God knows they'd love a supply of truly legit foreign passports
16:02 🔗 tsp damn fanfiction.net to hell
16:04 🔗 SketchCow The hard drive that died was on flophouse.
16:04 🔗 SketchCow Flophouse is, after all, 35 drives
16:11 🔗 db48x tsp: for there robots.txt?
16:12 🔗 db48x SketchCow: ahh, I thought those machines had raid of one variety or another?
16:13 🔗 db48x their
16:14 🔗 tsp I have a problem I'm not quite sure how to solve. I have a lot of fanfiction, ~35gb worth and a bunch more waiting for me to download. I didn't realize this until a few days ago, but some of the stories have been munged with a script that apparently the owners of the website ran and I've been getting this garbage for about a year whenever I update one. If I still have a backup from last
16:14 🔗 tsp year and work off that, I'll be missing a large portion of content and won't be able to get new updates because of this munging. Should I just continue to archive stuff exactly as it comes down from the website and not worry about it?
16:19 🔗 SketchCow These machines did not have raid, hence the move to one that does.
16:25 🔗 tsp Damn, I don't have a backup from that long ago. I don't think I can do anything going forward either, because there's no efficient way to make snapshots of dynamic html
16:29 🔗 SketchCow But you have a lot, right.
16:29 🔗 tsp yeah, ~500k stories and another ~3M on S3 that I need to download and import
16:29 🔗 tsp but I got the millions in December 2010, after the script hit
16:32 🔗 SketchCow You take what you have, set aside as an archive
16:49 🔗 SketchCow I have three Atari Connection magazines, are you interested in buying them? The issues are Summer 82, Summer 84 and Summer 81. The issues are complete the summer 84 cover is torn and the summer 81 cover has fallen off but I have it, all three are in used condition. I am asking $25 for all three. I also have a Learning to Use Microsoft QuickBASIC by Microsoft from 1988 that I would sell for $15 and a Epson Equity GW-BASIC Reference Manual from
16:53 🔗 db48x tsp: how is it munged?
16:53 🔗 tsp db48x: swearing is replaced with *. Not all of it, and not in all stories. It also gets peoples names
16:54 🔗 tsp Whatever it is seems to have been turned off now, but it still affects some stories
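
One rough way to find which stored stories were hit by this kind of masking is to scan for runs of asterisks. This is a heuristic sketch only: the pattern and the `masked_tokens` helper are assumptions, not part of tsp's scripts, and it will also flag harmless text such as a `*****` scene break or `**bold**` markup.

```python
import re

# Two or more asterisks in a row, with any letters attached (e.g. "a**", "***").
MASK = re.compile(r"\w*\*{2,}\w*")


def masked_tokens(text):
    """Return every token that looks like it was censored with asterisks."""
    return MASK.findall(text)


def looks_munged(text):
    """True if the text contains at least one asterisk-masked token."""
    return bool(masked_tokens(text))
```

Running `looks_munged` over each stored chapter would only report candidates for review; it changes nothing in the archived content.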
16:54 🔗 db48x awesome, bowdlerization by a computer program is the worst
16:55 🔗 underscor :D
16:55 🔗 underscor clbuttic problem?
16:55 🔗 tsp I have a feeling they did it irreversibly, else later chapters from the same story would have the same thing applied to them.
16:56 🔗 db48x if you view the story on the page today, do the older chapters have the problem?
16:56 🔗 tsp I haven't checked, I don't keep history. When an update comes out, I just download it
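
Keeping history does not have to mean full snapshots: each downloaded version can be stored under a hash of its content, so a changed page becomes a new file and an unchanged one is not written again. The sketch below assumes one directory per story and a hypothetical `store_version` helper; it is not how tsp's scripts work.

```python
import hashlib
from pathlib import Path


def store_version(archive_dir, story_id, content: bytes) -> Path:
    """Save content under its SHA-256 digest; existing versions are never overwritten."""
    digest = hashlib.sha256(content).hexdigest()
    story_dir = Path(archive_dir) / str(story_id)
    story_dir.mkdir(parents=True, exist_ok=True)

    path = story_dir / f"{digest}.html"
    if not path.exists():
        # Only new or changed content costs any disk space.
        path.write_bytes(content)
    return path
```

With this layout a munged update and the original chapter would sit side by side, which is what makes a later comparison possible. The trade-off is even more small files on the filesystem.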
16:57 🔗 underscor next step, write an algorithm that reads the sentence, determines context, and chooses an appropriate swearword
16:57 🔗 underscor Or just replace everything with "fuck"
16:57 🔗 db48x lol, no
16:57 🔗 tsp The updates were so long ago I can't really find that out now. In some of them I can obviously replace a** with ass, but I don't want to do anything to the content now
16:57 🔗 db48x no point in making it worse
16:57 🔗 underscor I'm gonna kick your *** => I'm gonna kick your FUCK
16:58 🔗 underscor A girl dog is called a ***** => A girl dog is called a FUCK
16:58 🔗 underscor :D
16:58 🔗 underscor A bundle of sticks is a ******* => A bundle of sticks is a FUCK
16:58 🔗 underscor See
16:58 🔗 underscor It works for all contexts
17:00 🔗 tsp there's no pattern to it at all, just randomly munges stories, but I'm looking at it a year on and not when it was happening. I guess I'll just keep archiving though
17:01 🔗 underscor tsp: How long have you been archiving ff.net?!
17:01 🔗 tsp I didn't find anything in their TOS either about having the right to modify user data like that, but I don't speak legalese
17:01 🔗 db48x as annoying as it is, if it represents the state of the site at the time then it's cool
17:01 🔗 tsp underscor: dunno... 2007 or so
17:01 🔗 db48x just records for all time that they were idiots
17:02 🔗 tsp but not the complete thing until december 2010, and I still haven't got that here it's just on s3
17:03 🔗 underscor tsp: oh, cool
17:03 🔗 underscor So you have a complete copy?
17:03 🔗 tsp more or less, I didn't redownload stuff I already had
17:04 🔗 underscor oh okay
17:04 🔗 underscor oh man, I totally love interstitial ads
17:04 🔗 tsp I'm going to take my dbs folder as it is right now on my backup drive and clone it, along with the ~2800 updates that I just imported. Then continue archiving
17:05 🔗 tsp I saved them because of this munging
17:05 🔗 underscor Any company that purchases an interstitial instantly goes on my blacklist of places-I-won't-buy-from
17:05 🔗 tsp Are they that bad?
17:05 🔗 tsp my screen reader ignores most ads
17:05 🔗 underscor Hijacking the page you're on to redirect you to an ad you can't skip?
17:05 🔗 underscor Oh, I see
17:06 🔗 tsp if they redirect though, I'd notice
17:06 🔗 underscor I think it's some shady floating div thing
17:06 🔗 underscor because the URL doesn't change, but it still blocks content until it's done trying to sell you things
17:07 🔗 tsp right, if this drive dies, my ff-sep2 backup is going to be toast
17:07 🔗 db48x :(
17:08 🔗 tsp I don't have the space to put it on my main one
17:08 🔗 db48x may I offer you a place to upload?
17:08 🔗 tsp my bandwidth will go through the roof if I do that, we have crazy caps over here in canada
17:09 🔗 db48x :(
17:09 🔗 underscor tsp: Sneakernet! :D
17:09 🔗 db48x how big is it?
17:09 🔗 tsp 35gb
17:10 🔗 tsp when I download the 40gb on s3 that'll be even bigger, but that won't go into my sep2 backup of course
17:10 🔗 underscor Hmm, I wonder if I can find my 64GB flash drive
17:10 🔗 db48x right, you said that
17:10 🔗 underscor That wouldn't be too expensive to mail back and forth
17:10 🔗 underscor heh
17:10 🔗 db48x underscor: your thoughts mirror my own to a strange degree
17:11 🔗 underscor db48x: :D
17:11 🔗 underscor Great minds think alike! ;)
17:11 🔗 tsp my backup doesn't have all the recent updates, that's why I'm backing it up. I'll run the updates again, probably another gb or so by now
17:11 🔗 tsp but that's just the stuff I have, this is complicated
17:11 🔗 underscor You have scripts that do the work for you, or what?
17:12 🔗 tsp yep
17:12 🔗 tsp bunch of python script
17:12 🔗 tsp python scripts and edbrowse macros
17:14 🔗 db48x WD sells a nice portable usb hard drive that ships well
17:14 🔗 db48x 1TB for $110
17:15 🔗 underscor I usually just ship naked
17:16 🔗 db48x yea, you can save $20-30 that way, if you want to crack open a computer and plug in a drive
17:17 🔗 db48x for something that you're going to have for a few days and then ship, the usb interface is a plus
17:18 🔗 tsp rsyncing hundreds of thousands of files really takes a while
17:19 🔗 ersi Yeah.
17:19 🔗 underscor I just have like 30 of these laying around http://www.amazon.com/Drive-Adapter-Converter-Optical-External/dp/B001OORMVQ/ref=sr_1_2?ie=UTF8&qid=1315243134&sr=8-2
17:20 🔗 db48x underscor: :)
17:22 🔗 tsp I've also got tarballs of random fanfiction sites I've come across, if I can get them easily
17:22 🔗 db48x tsp: so, when will you have a raid setup of some kind? preferably zfs...
17:23 🔗 tsp What oses can do zfs these days?
17:26 🔗 db48x linux (to one degree or another), bsd, solaris
17:27 🔗 ersi I wouldn't count linux.
17:28 🔗 ersi I'd say FreeBSD and Solaris.
17:28 🔗 db48x yea, it's a bit early unless you want to debug the occasional crash
17:28 🔗 db48x or slow to glacial pace when you hit 90% full
17:30 🔗 db48x I think OpenIndiana is currently the way to go on the solaris side of things
17:37 🔗 tsp I want to try to get adultfanfiction.net as well, but that's going to be a bitch
17:43 🔗 db48x haven't seen that one. why will it be difficult?
17:44 🔗 SketchCow 800 items in the derive queue, 178 are me
17:44 🔗 SketchCow Nowhere near out of magazines to ingest.
17:44 🔗 underscor :D
17:44 🔗 underscor SketchCow: I <3 Filling Derive Queues
17:47 🔗 underscor http://passphra.se/ :D
18:00 🔗 SketchCow http://www.vintagecomputing.com/wp-content/images/retroscan/flight_simulator_sept11_large.jpg
18:00 🔗 db48x SketchCow: awesome
18:01 🔗 Aranje oh damn
18:01 🔗 db48x well, there's that footer at the bottom, but still cool
18:04 🔗 underscor Wow
18:06 🔗 chronomex haha
18:06 🔗 chronomex oh yeah, we're sneaking up on National Freedom Day or whatever, aren't we?
18:12 🔗 tsp ntfs is going to hate me for creating another 600000 files on it
18:25 🔗 SketchCow We had a HUGE debate about the footer
18:26 🔗 SketchCow Oh man that takes me back
18:26 🔗 SketchCow Cleaning girlfriend's apartment, no time, but ask later.
18:29 🔗 underscor SketchCow: I remember that story
18:30 🔗 chronomex I don't.
18:44 🔗 db48x SketchCow: at least it's not a watermark :)
18:48 🔗 db48x hrm
18:48 🔗 db48x my rsync has slowed to only 500 kB/s
18:57 🔗 db48x LOL
18:58 🔗 db48x http://blog.gerv.net/2011/09/build-tool-name-shortage/
19:04 🔗 chronomex hahahah
19:13 🔗 ersi 20:37 < ZrX-oMs> http://www.happyplace.com/3701/guy-photoshops-justin-bieber-face-into-coworkers-entire-stock-photo-library
19:13 🔗 ersi 20:46 < jelly-hme> dude didn't have backups? eh
19:14 🔗 db48x lol
19:15 🔗 chronomex hahahahaha
19:15 🔗 ersi I thought, since you linked something funny.. so should I
19:29 🔗 * db48x yawns
19:29 🔗 db48x bedtime
19:29 🔗 db48x after a fashion
19:35 🔗 SketchCow Hey.
19:35 🔗 SketchCow Short form
19:35 🔗 SketchCow Ben would watermark
19:35 🔗 SketchCow I criticized him
19:35 🔗 SketchCow He got angry
19:35 🔗 SketchCow I turned it up
19:35 🔗 SketchCow He got MORE angry
19:35 🔗 SketchCow I turned it far up enough to temper steel
19:36 🔗 SketchCow Ben relented, realized his flaws
19:36 🔗 SketchCow Changed it up
19:36 🔗 SketchCow Big fan of me now
19:36 🔗 SketchCow I'm a big fan of him
19:36 🔗 SketchCow hugz
19:36 🔗 SketchCow There's some great weblog entries
19:36 🔗 SketchCow So he doesn't watermark.
19:36 🔗 SketchCow he adds that border to random sides, image is untouched.
19:36 🔗 SketchCow If you want it, cut out the border
19:36 🔗 SketchCow Otherwise, it stays
19:36 🔗 SketchCow see
19:36 🔗 SketchCow I can deal
19:38 🔗 chronomex that's the right way
22:13 🔗 SketchCow http://batcave.textfiles.com/defcon/
22:13 🔗 SketchCow Will be on youtube and vimeo in an hour.
