#archiveteam 2011-07-13,Wed


Time Nickname Message
01:48 🔗 TheSIMM Moo.
03:29 🔗 nrr` SketchCow: I have a 1.2M five-and-some-change floppy drive sitting wrapped up in a box. Want it?
04:07 🔗 SketchCow Sure.
06:04 🔗 Coderjoe I'm planning on borrowing my employer's old, no-longer-used 3.5" autoloader to image a bunch of floppies, if I can figure out the control commands (just need to snoop on the serial data the software sends). I'm thinking of imaging the floppy, ejecting it, and then having a camera take a picture of the label, for when I go back through them
06:05 🔗 Coderjoe I also have a 5.25" floppy I need to install and start going through old floppies
06:06 🔗 Coderjoe my parents had a few 8" floppy disks, but I suspect they are in a landfill by now
06:09 🔗 SketchCow LOVING MY JOB
06:09 🔗 SketchCow Kind of hungry, though.
06:10 🔗 SketchCow I went low-carb for July.
06:11 🔗 Coderjoe I might be lucky. a large number of the 5.35" disks are DD rather than HD
06:14 🔗 Coderjoe er, 5.25
06:16 🔗 SketchCow I am putting the scare into people.
06:16 🔗 SketchCow It needs to be done.
06:22 🔗 Tatsujin http://abc80databasen.blinkenshell.org/bilder/ABCdisk.JPG here's my super rig, i'm dumping abc80 ("swedish trash80") disks which i've gotten from our "MIT". If you think it's bad in the US then you should check Sweden, here i'm like the goddamn last of the mohicans when it comes to this sort of stuff ;)
06:23 🔗 Tatsujin i can imagine there's an ongoing information genocide in various small countries that had their own computer systems.
06:34 🔗 SketchCow Yes
06:35 🔗 Coderjoe I wonder if I can get my 520ST working again
07:45 🔗 SketchCow I'll have something neat to show soon.
07:45 🔗 SketchCow Just something I'm doing for work, not archiveteamy
07:45 🔗 SketchCow But preservationy
07:46 🔗 SketchCow We've been given a new archiveteam machine, too
07:46 🔗 SketchCow So that'll be happening
08:07 🔗 Coderjoe_ choopa appears to hate life tonight
11:47 🔗 ex-parrot so... who's got an up to date copy of astronautix.com?
11:47 🔗 ex-parrot I've got a wget -r rip from Jun 2010 if anyone wants it, but I'd love it if someone has a fresher copy
11:48 🔗 ex-parrot archive.org seems to have it from jan 2010...
11:59 🔗 db48x ex-parrot: I have a copy
11:59 🔗 ex-parrot that's cool. glad some other folks have copies now that it seems to be gone :(
12:00 🔗 db48x denial of service?
12:00 🔗 ex-parrot nope, he's pulled it
12:01 🔗 db48x yea, but the page says there was a DOS, so he pulled it
12:01 🔗 ex-parrot I got the impression it was pulled in a "not coming back" kind of way
12:01 🔗 ex-parrot but fingers crossed I guess
12:03 🔗 * ex-parrot requires sleep
12:06 🔗 db48x I'm essentially out of disk space
14:55 🔗 undersco2 So, what projects do we still have going on right now?
14:56 🔗 SketchCow There's that list you gave me, sir
14:57 🔗 SketchCow The WGET/WARC thing has really amazed archive.org
14:57 🔗 SketchCow I'm about to start pulling friendster over to the new machine
14:57 🔗 SketchCow I need to name the new machine
14:58 🔗 SketchCow The new machine has 27 terabytes
14:58 🔗 SketchCow Which sounds like a fraction of the other, but this is a pooled drive, also better maintained/mirrored
14:58 🔗 SketchCow We're using a smaller amount.
14:58 🔗 SketchCow And the other machines are available for video and similar emergencies.
14:59 🔗 undersco2 Aha
14:59 🔗 SketchCow Myspace is now bought out, so I'm not 100% worried about it
14:59 🔗 undersco2 That's good
14:59 🔗 undersco2 Call it deuterium
14:59 🔗 undersco2 :D
14:59 🔗 undersco2 (Heavy hydrogen)
15:00 🔗 undersco2 Me and two buddies here just ordered some from united nuclear, so we can make sinking ice cubes
15:00 🔗 undersco2 Unfortunately, it causes sterility in high concentrations
15:01 🔗 undersco2 SketchCow: I know there's the list, but what is the next priority?
15:02 🔗 SketchCow Wiki
15:02 🔗 SketchCow I'd like the wiki to be better and cleaned
15:02 🔗 undersco2 I'm working on finalizing the starwars forums stuff, but that's mostly autonomous
15:02 🔗 SketchCow I've been working on away from keyboard.
15:02 🔗 undersco2 Since it's sorting like 35k threads and 115 mil profiles
15:02 🔗 undersco2 Can I get admin on the wiki?
15:02 🔗 SketchCow http://www.archive.org/details/awayfromkeyboard
15:03 🔗 undersco2 That collection looks like a really neat idea
15:04 🔗 SketchCow Needs some refinement, but it will be.
15:04 🔗 SketchCow This is related to Len Sassaman
15:06 🔗 undersco2 I see
15:13 🔗 undersco2 Ugh, class is so boring today >:|
15:18 🔗 undersco2 Damn wifi >:|
15:57 🔗 SketchCow Educate yourself
15:57 🔗 SketchCow Class shouldn't be boring
16:15 🔗 SketchCow Fuck, time got away from me today.
16:37 🔗 underscor SketchCow: It is boring, it's just people presenting stuff
16:37 🔗 underscor s/is/was/
16:37 🔗 underscor Now we get to go bowling!
18:24 🔗 SketchCow Bowling!
19:07 🔗 ndurner soultcer: what's the matter with your ggroups discovery instance(s)?
19:22 🔗 SketchCow I'm about to do a massive purge of shitty users on the archiveteam wiki.
19:26 🔗 SketchCow http://www.archiveteam.org/index.php?title=Special:ListUsers&limit=500
19:26 🔗 SketchCow Watch me purge purge purge, baby
19:27 🔗 Coderjoe_ wow. lots of spammers
19:27 🔗 Coderjoe_ "breast enhancement gum"?
19:28 🔗 swebb1 ndurner: your tracker seems to be working quicker now.
19:28 🔗 swebb1 I've downloaded 44GB in the last few days.
19:29 🔗 swebb1 of google-groups archives.
19:29 🔗 ndurner Good!
19:31 🔗 Coderjoe_ ndurner: out of curiosity, have you made any changes to make things quicker?
19:31 🔗 Coderjoe_ or did things just speed up at google?
19:31 🔗 ndurner not yet.
19:31 🔗 Coderjoe_ hmm
19:32 🔗 ndurner There seems to be an issue on soultcer's side
19:32 🔗 ndurner which means we have more resources available for processing downloads
19:41 🔗 soultcer ndurner: Checking right now
19:42 🔗 soultcer The instances seem to be running fine on a total of 7 vServers
19:44 🔗 ndurner hrm, ok, thanks
19:44 🔗 ndurner We're currently processing 99 dirs/hr. That used to be 10x as much.
19:44 🔗 swebb1 Would it be of any interest to make a generic tracker for these kind of things that anyone could just start using for projects?
19:44 🔗 ndurner yes
19:45 🔗 ndurner I have been thinking of that
19:45 🔗 emijrp tomorrow i will have 50/5 mbit connection at home, 10x my current upload rate
19:45 🔗 swebb1 Hmm. Ok. I'll keep that in mind.
19:45 🔗 emijrp ~600kb/s
19:45 🔗 ndurner One thing that could help is Apache Zookeeper
19:46 🔗 emijrp after some test, i will try to upload Jamendo to Internet Archive
19:46 🔗 ndurner disadvantage: Java
19:47 🔗 SketchCow Can someone help me with the user purge?
19:47 🔗 SketchCow I am going after the obvious ones, of course.
19:47 🔗 SketchCow people called 20393 fuf dslfhlkfjdf buy cialis 20202
19:47 🔗 SketchCow But I think there are other people, with no contributions, who should be purged.
19:50 🔗 emijrp I HAVE GOOD NEWS FOR YOU.
19:50 🔗 emijrp What do you want SketchCow? block them?
19:51 🔗 SketchCow I am merging and deleting.
19:53 🔗 SketchCow I have purged all the users of spammishness, I believe, up to "7"
19:53 🔗 SketchCow In terms of first letter.
19:53 🔗 SketchCow I'm using the 500 view to see what's there.
19:53 🔗 SketchCow It goes down to T.
19:53 🔗 SketchCow I think I can get it under 500, once spam is gone.
19:53 🔗 SketchCow There was a rapefest of spam users in sept. of 2009
19:53 🔗 SketchCow So that's helping.
19:53 🔗 soultcer It appears that I don't have permissions to delete/merge users
19:54 🔗 soultcer Also, why am I excess flooding?
19:54 🔗 SketchCow No, just msg me usernames.
19:55 🔗 soultcer SketchCow: http://archiveteam.org/index.php?title=Special:BlockList also contains users that have been blocked, but not merged and deleted
19:58 🔗 Coderjoe_ I could possibly hack something up to pull the userlist and then spit out a list of people with a 0 editcount
19:59 🔗 Coderjoe_ using the mediawiki api
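A minimal sketch of the kind of script Coderjoe describes, not the one he actually wrote. It assumes the wiki exposes the standard MediaWiki `api.php` endpoint (the exact URL is a guess) and uses the `list=allusers` query with `auprop=editcount`. The function names are hypothetical, and paging beyond the first 500 users is left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint location; the log never states where api.php lives.
API_URL = "http://www.archiveteam.org/api.php"


def allusers_url(api_url, limit=500):
    """Build a MediaWiki API query that lists users with their edit counts."""
    params = {
        "action": "query",
        "list": "allusers",
        "auprop": "editcount",
        "aulimit": limit,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def zero_edit_users(response):
    """Return the names of users whose reported editcount is exactly 0."""
    users = response.get("query", {}).get("allusers", [])
    # Users without an editcount field are skipped rather than assumed idle.
    return [u["name"] for u in users if u.get("editcount") == 0]


def fetch_zero_edit_users(api_url=API_URL):
    """Fetch one page of users from the wiki and filter it (needs network)."""
    with urlopen(allusers_url(api_url)) as resp:
        return zero_edit_users(json.load(resp))
```

The filter is kept separate from the fetch so the candidate list can be reviewed by hand before anyone is merged or deleted.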
19:59 🔗 SketchCow OK, all the users fit on one page.
20:04 🔗 SketchCow Userlist coming under control.
20:07 🔗 emijrp he, i'm developing a tool for statistical analysis of mediawikis
20:07 🔗 emijrp look at this graph of AT wiki http://img232.imageshack.us/img232/6559/usereditsnetwork2.jpg
20:07 🔗 emijrp nodes = users, edges = how many pages were edited by both users
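The graph definition above (nodes are users, edge weight is the number of pages both users edited) can be sketched as follows. This is an illustration under assumed inputs, not emijrp's tool: the `(user, page)` pair format and the function name `coedit_edges` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def coedit_edges(edits):
    """Map each user pair to the number of pages both users edited.

    `edits` is an iterable of (user, page) pairs; repeated edits by the
    same user to the same page count once.
    """
    editors_by_page = defaultdict(set)
    for user, page in edits:
        editors_by_page[page].add(user)

    weights = defaultdict(int)
    for editors in editors_by_page.values():
        # Sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(editors), 2):
            weights[(a, b)] += 1
    return dict(weights)
```

Users who never shared a page with anyone get no edges, so they would show up as isolated nodes only if the node list is built separately.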
20:11 🔗 emijrp activity http://img703.imageshack.us/img703/5333/archiveteamactivity.jpg
20:12 🔗 SketchCow http://www.archiveteam.org/index.php?title=Special:ListUsers&limit=500
20:12 🔗 SketchCow Much nicer.
20:12 🔗 bbot_ which day is day 0?
20:13 🔗 emijrp good question, looking at the code
20:13 🔗 emijrp i guess sunday
20:14 🔗 bbot_ hmm
20:14 🔗 emijrp yep
20:14 🔗 emijrp %w http://www.somacon.com/p370.php
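A quick check of the `%w` answer, assuming the tool formats dates with a strftime-style `%w` as the link suggests: that code numbers weekdays from 0 for Sunday through 6 for Saturday. The helper name is hypothetical.

```python
from datetime import date


def weekday_index(d):
    """Weekday as strftime's %w reports it: 0 is Sunday, 6 is Saturday."""
    return int(d.strftime("%w"))


# The date of this log, a Wednesday.
print(weekday_index(date(2011, 7, 13)))  # prints 3
```

Note that Python's own `date.weekday()` counts from Monday as 0, so the two conventions should not be mixed in one graph.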
20:17 🔗 emijrp if you think about other interesting graphs, tell me
20:19 🔗 SketchCow Coderjoe gave me a list of zero contribution users, killing now
20:19 🔗 SketchCow If only this worked at office buildings
20:22 🔗 bbot_ ha ha vat a bloodbath
20:24 🔗 SketchCow OK! Totally cleaned.
20:24 🔗 SketchCow http://www.archiveteam.org/index.php?title=Special:ListUsers&limit=500
20:25 🔗 SketchCow Probably a few assy ones stuck in there but we no longer look totally owned
20:27 🔗 emijrp congrats
20:27 🔗 SketchCow Great job, thanks Coderjoe, that helped a ton.
20:27 🔗 SketchCow If you guys still stumble on some assery, let me know.
20:27 🔗 SketchCow Pages or other things.
20:28 🔗 emijrp i think recaptcha scares spambots
20:29 🔗 emijrp and disturb legit users : P
20:29 🔗 emijrp i don't see it, being an admin, but i remember having to enter a captcha each time while trying to post links
20:30 🔗 Coderjoe_ i've had some weird recaptchas before
20:31 🔗 Coderjoe like http://img411.imageshack.us/img411/9711/recaptcawhat.png and http://img839.imageshack.us/img839/1474/recaptchawha.jpg
20:33 🔗 emijrp he
20:33 🔗 emijrp look this http://img844.imageshack.us/img844/6449/smostresilientbittorren.png
20:34 🔗 emijrp close to this one http://www.gully.org/~mackys/lj/captcha-reimann.png
20:35 🔗 SketchCow Oh jesus, my floppy post caught fire
20:35 🔗 SketchCow and then the FIRE CAUGHT FIRE
20:35 🔗 SketchCow It's on Digg
20:35 🔗 SketchCow jwz has said something
20:35 🔗 SketchCow It's crazy
20:50 🔗 emijrp no description, 199mb .doc, IA has weird stuff http://www.archive.org/details/wetyuaw964839
20:52 🔗 emijrp another one http://www.archive.org/details/uaiwtybiway396793
20:52 🔗 emijrp i think they are not books, but chunks of a large file
20:52 🔗 emijrp movies/warez?
20:52 🔗 emijrp look at the download counter
20:53 🔗 emijrp 7000+
20:53 🔗 emijrp yep http://taifon.net/video/saymove/5f86728f48060d39/
20:53 🔗 emijrp fuckers
20:53 🔗 soultcer The second file you linked is an mpeg file renamed to .doc, I assume the other one is an mpeg file too
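One way to spot this kind of renamed file is to look at the leading bytes instead of the extension. The sketch below is an assumption about how such a check could work, not what soultcer did. It only knows two signatures: the OLE2 container used by legacy .doc files and the MPEG program stream pack header. Other MPEG variants (transport streams, elementary streams) would come back as "unknown".

```python
# Signature of the OLE2 compound file container used by legacy .doc files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Pack start code at the beginning of an MPEG program stream.
MPEG_PS_MAGIC = bytes.fromhex("000001BA")


def sniff(header):
    """Classify a file by its leading bytes; returns a short label."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(MPEG_PS_MAGIC):
        return "mpeg-ps"
    return "unknown"


def sniff_file(path):
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as f:
        return sniff(f.read(8))
```

A `.doc` item whose header comes back as "mpeg-ps" would be a candidate for the list sent to archive.org.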
20:57 🔗 emijrp another video as doc, 55,000+ downloads http://www.archive.org/details/account-text0023es
20:58 🔗 emijrp there is no report link
20:58 🔗 bbot_ abuse@archive.org?
20:59 🔗 SketchCow Yeah, this is a known problem and archive.org works hard to....
20:59 🔗 * SketchCow keeps watching the anime
20:59 🔗 SketchCow WILL SHE CONFESS HER LOVE TO HIM????
20:59 🔗 SketchCow WILL HE KNOW BEFORE THE SAKURA FESTIVAL??????
20:59 🔗 * SketchCow hugs his otaku pillow
21:00 🔗 soultcer I bet there is no Dragonball anime on IA archive servers. I am pretty sure one fight would take enough storage to fill a whole internet archive rack.
21:02 🔗 SketchCow I just let archive.org know about the animes.
21:02 🔗 SketchCow It's an ongoing problem.
21:02 🔗 emijrp did you hear about Captain Tsubasa matches?
21:03 🔗 emijrp SETI@home is a project to recode Captain Tsubasa episodes to Xvid.
21:04 🔗 emijrp SketchCow: request a report link
21:04 🔗 soultcer hehe
21:05 🔗 SketchCow info@archive.org
21:05 🔗 SketchCow It would be best if you just compiled up a list instead of 3,000 letters
21:06 🔗 emijrp Man, look all that wionywioyuwowiwrionuwprnpy random on http://www.archive.org/search.php?query=%28collection%3Atexts%20OR%20mediatype%3Atexts%29%20AND%20-mediatype%3Acollection&sort=-week
21:08 🔗 emijrp http://www.google.es/#sclient=psy&hl=es&safe=off&source=hp&q=%2Bdoc+%2B199.8M+site:http%3A%2F%2Fwww.archive.org%2Fdetails%2F&aq=f&aqi=&aql=&oq=&pbx=1&bav=on.2,or.r_gc.r_pw.&fp=4036e66a30b4edf1&biw=1320&bih=600
21:10 🔗 emijrp Internet Archive may offer a full dump of their metadata, to scan all this shit.
21:15 🔗 emijrp OK, e-mail sent.
21:54 🔗 chronomex go jason go http://www.archiveteam.org/index.php?title=Special:RecentChanges
21:54 🔗 chronomex oh wait it was scrolled up
21:58 🔗 DFJustin not sure why they bother when you can just straight up upload anime and nobody catches it http://www.archive.org/details/DeadmanWonderland
21:58 🔗 chronomex for those who may think I'm dead, I'm actually working on a project involving scanning and sharing about a half-million pages of ring-bound documents.
21:59 🔗 chronomex It's a good fraction of these: http://en.wikipedia.org/wiki/Bell_System_Practices
22:00 🔗 Coderjoe ooh
22:00 🔗 Coderjoe DFJustin: hahah
22:01 🔗 chronomex figuring out how to separate and metadataify documents scanned from the same hopper has been interesting. we're choosing to go with separator pages with fill-in-the-dots "this next document is doc.nr:" spaces, and bingo-marker the shit out of it
22:03 🔗 Coderjoe how are you identifying those pages so the processing software sees them, in the (rather odd) case that some document has something that looks similar
22:04 🔗 SketchCow I fucking love the BSPs
22:04 🔗 chronomex going to be some sort of registration/identification marks (maybe a qr code?), and also a thick black rectangle enclosing all the metadata marks
22:04 🔗 SketchCow I had a strike manual stolen from a CO
22:04 🔗 SketchCow had to return it to the stealer, he was caught
22:04 🔗 SketchCow But I was transcribing
22:05 🔗 chronomex SketchCow: nice. current hosts of these documents have graciously agreed to share these scans on archive.org.
22:05 🔗 chronomex we have four pallets worth.
22:05 🔗 SketchCow The strike manual is the best, it tells you how to barricade the CO and how much food to buy
22:05 🔗 SketchCow !
22:05 🔗 SketchCow Let me know if you need me to facilitate the collection
22:05 🔗 chronomex hahahaha, do you know the document number? I can jump it up in the queue :)
22:05 🔗 chronomex I totally would appreciate that, I'm estimating this will be about 1T of imagery.
22:06 🔗 SketchCow http://pdfs.telephonearchive.com/bsps/
22:06 🔗 SketchCow Then yes, I am your guy.
22:07 🔗 SketchCow We'll make it happen.
22:07 🔗 SketchCow Do it in e-mail.
22:07 🔗 SketchCow jscott@archive.org
22:07 🔗 chronomex we actually intend to scan every page of ring-bound paper in the telephone museum, including operation and mtce manuals for the panel switch
22:07 🔗 chronomex ok
22:08 🔗 chronomex grand, don't expect anything to be done immediately. we did get a scanner this week though.
22:08 🔗 SketchCow Do you want to wait until the new DIY book scanner is finished, and I can send that in.
22:09 🔗 chronomex we're removing these from the ring binders and putting them through a sheetfeeder; they're all looseleaf.
22:09 🔗 chronomex is that from Dan?
22:10 🔗 chronomex DIY Bookscanner Dan's friend Andy <http://afiler.com/> is working with me on this
22:10 🔗 SketchCow Yeah
22:10 🔗 Coderjoe I kinda wish I hadn't thrown out my boxes of old computer shoppers something like 9 years ago. these were the old telephone-book-thick ones
22:10 🔗 SketchCow Yeah, those computer shoppers need scanning.
22:10 🔗 SketchCow Oh, don't worry, we'll get it all!
22:10 🔗 chronomex :D
22:10 🔗 SketchCow I need more metadata warriors, I need them constantly.
22:11 🔗 SketchCow The current batch is doing well, always worried about burnout.
22:11 🔗 SketchCow But come on, arcade manuals!
22:11 🔗 chronomex BSP project will generate a fuckload of nearly metadataless scans; I'm considering outsourcing to mechanical turk
22:12 🔗 SketchCow There's an argument for this.
22:13 🔗 SketchCow Cost is an issue.
22:13 🔗 chronomex I'm also working on scanning and OCRing the code for the #3 ESS: https://plus.google.com/118060174030033503719/posts/Hi7J7hfpCsv
22:13 🔗 chronomex Yeah. I'm willing to pay for some, and the museum does have some budget.
22:13 🔗 SketchCow I'd love to discuss this, but I have to go. Birthday party. On a boat. In NYC.
22:13 🔗 chronomex farewell; have fun!
22:14 🔗 SketchCow And I'm 1.5 hours north, and 45 minutes until party starts.
22:14 🔗 chronomex go!
22:14 🔗 SketchCow But yeah, bring me in on this, you'll get a collection, etc.
22:14 🔗 chronomex rad
22:20 🔗 NovaKing SketchCow
22:20 🔗 NovaKing did you want zoink.it archive?
22:31 🔗 Coderjoe woah
22:31 🔗 Coderjoe http://www.kryoflux.com/
22:55 🔗 DoubleJ Well, it's a good thing I bought a 3-pack of motherboards off eBay last time -- looks like that storm fried the old box's moboard.
22:55 🔗 DoubleJ 'Cause what I wanted to do this weekend was drive out to MicroCenter for thermal paste.
22:55 🔗 DoubleJ Stupid lightning.
22:56 🔗 Coderjoe At least you have a microcenter nearby. If I wanted to go to microcenter, I would have to drive 4-8 hours round-trip depending on which store and traffic conditions.
22:57 🔗 DoubleJ True 'nuff. Only a half-hour out of my way, but still an annoyance.
22:57 🔗 Coderjoe there are ratshacks and small computer stores nearby, though
22:58 🔗 DoubleJ Hm. Would a Radio Shack carry that? It's not a cellphone so I'm dubious.
22:58 🔗 DoubleJ "Web only". Jerks.
22:59 🔗 Coderjoe one of the local stores still had a small collection of components, last time I was in
22:59 🔗 Coderjoe nowhere near as extensive as when I was a kid, though
23:00 🔗 DoubleJ Oh no, not at all. I can't remember the last time I went into a Radio Shack and knew less than the person "helping" me.
23:00 🔗 DoubleJ I'll probably check on Saturday anyway, since there's one pretty much on my way to anywhere around here I'd want to go.
23:01 🔗 Coderjoe yeah... I say their slogan (is it still the current one) as "You've got questions? So do we."
23:01 🔗 DoubleJ I usually used "You've got questions, we've got cell phones."
23:02 🔗 chronomex haha
23:02 🔗 DoubleJ Though in fairness, back around 2003 or so when I needed a piezo buzzer for my old Civic (it wouldn't beep at you when you left your lights on) they did have the exact part I needed.
23:02 🔗 DFJustin they did give me a box of blank 5 1/4" floppies once because it wasn't even in inventory
23:03 🔗 DoubleJ Heh. I could imagine the current crop of them being confronted with 5.25s. "What kind of messed-up frisbee is this?"
23:04 🔗 DoubleJ Cool about the disks though. I sure wish I'd gotten a deal like that back when 3.5s were a buck a pop.
23:18 🔗 DFJustin it was such a bummer when aol switched to cd, no more free disks
23:19 🔗 DoubleJ I used a few that way too. Then started using the DVD-type cases they mailed the CDs in for a while.
23:19 🔗 DoubleJ Then they finally stopped sending them.
23:52 🔗 dashcloud hi folks, are GPT partition tables only for 3 TB or larger drives, or is there some benefit I would get over the standard partition table for formatting an external drive?
23:58 🔗 dashcloud wow- I've never seen wikis merge their spam users into one user before- usually they just get deleted & perma-banned
23:58 🔗 chronomex we're archivists.
