#archiveteam 2011-09-09,Fri

Time Nickname Message
03:01 🔗 josephhol Anyone seen a howto for backing up your life?
03:03 🔗 josephhol especially keeping everything sane so your next-of-kin can actually make sense of it?
03:15 🔗 perfinion there are services that will email next of kin when you die?
03:28 🔗 josephhol perfinion: sure. I've got a dead man's switch. more trying to figure out how to build a personal archive that my grandkids can use.
03:29 🔗 perfinion oh
03:29 🔗 josephhol but I found this, so I feel bad for not searching first: http://www.archive.org/details/personalarchiveconf
03:29 🔗 perfinion i'd write a long letter and then GPG it and give someone the passphrase and key
03:30 🔗 josephhol oh, do you guys have any use for a coder with mad system automation skills?
03:30 🔗 perfinion always
03:31 🔗 perfinion ive been busy recently so not sure what the current projects are
03:31 🔗 josephhol got a bug tracker I can bang my head against?
03:31 🔗 perfinion but we always need more ppl
03:31 🔗 perfinion no, no bug tracker
03:31 🔗 perfinion its mainly done in different channels for each project
03:33 🔗 josephhol well the only relevant experience I've got is my wikileaks mirror and years of grooming my backups
03:34 🔗 josephhol but I've got a weird interest in usenet and mailing lists, if there are any relevant projects
04:01 🔗 chronomex excellent
04:01 🔗 chronomex I'd like to have a copy of google groups in mbox format.
04:01 🔗 chronomex been a dream for a little while
04:01 🔗 chronomex just throwing that out
04:01 🔗 Coderjoe which "google groups"?
04:02 🔗 Coderjoe there are 3 things under that name
04:02 🔗 chronomex right.
04:02 🔗 chronomex 1) rescue the usenet archives gap from google
04:02 🔗 chronomex (the bits between where the early dump ends, and what you can get on retention right now.)
04:03 🔗 chronomex 2) archive mailing list histories
04:03 🔗 chronomex 3) what's the 3rd thing?
04:03 🔗 Coderjoe what providers has retention been checked on?
04:03 🔗 Coderjoe (and which groups)
04:03 🔗 chronomex I haven't looked very much at all
04:03 🔗 closure yeah, the gap is about 1991-2005 from what people tell me about what's available on giganews
04:03 🔗 chronomex right
04:03 🔗 chronomex sounds about right
04:04 🔗 Coderjoe iirc, there was some forums thing under the groups name as well
04:04 🔗 chronomex hmm.
04:04 🔗 chronomex google goops
04:04 🔗 closure oh good, PDA2011, was looking forward to watching those talks
04:05 🔗 Coderjoe pick a group and I'll check retention on a few providers
04:05 🔗 closure wish the IA made it easier to get all the urls of every item in a collection
04:05 🔗 chronomex Coderjoe: I usually use comp.dcom.telecom
04:05 🔗 chronomex Coderjoe: that's been continuously active since forever, so it's a good canary
04:12 🔗 Coderjoe oldest I'm currently seeing (though I am not sure if this program is showing me everything) is
04:12 🔗 Coderjoe Date: Mon, 9 Jul 2007 00:04:59 EDT
04:13 🔗 chronomex what program are you using?
04:14 🔗 Coderjoe the wrong one for the job, since it is primarily a binary downloader
04:14 🔗 chronomex ayup
04:14 🔗 Coderjoe i'm about to connect in and talk to the servers manually
04:17 🔗 Coderjoe astraweb: Date: Sun, 21 Sep 2008 10:16:24 -0400 (EDT)
04:18 🔗 Coderjoe newshosting: Date: Tue, 8 Sep 2009 13:26:28 -0400 (EDT)
04:20 🔗 Coderjoe giganews: Date: Mon, 23 Jun 2003 11:17:25 -0400
04:21 🔗 Coderjoe easynews: Date: Tue, 8 Sep 2009 13:26:28 -0400 (EDT)
04:22 🔗 chronomex giganews is the winner in text retention I see
04:22 🔗 chronomex (so far)
04:22 🔗 Coderjoe at least that is the oldest one given to me at the group command response
04:22 🔗 Coderjoe and I'm done
04:22 🔗 chronomex aye
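The retention comparison above comes down to parsing each provider's oldest Date header and sorting. A minimal sketch, assuming the four headers quoted in the log are the only input; the names `oldest_seen` and `rank_by_retention` are made up for illustration:

```python
from email.utils import parsedate_to_datetime

# Oldest Date header reported per provider, copied from the log above.
oldest_seen = {
    "astraweb": "Sun, 21 Sep 2008 10:16:24 -0400 (EDT)",
    "newshosting": "Tue, 8 Sep 2009 13:26:28 -0400 (EDT)",
    "giganews": "Mon, 23 Jun 2003 11:17:25 -0400",
    "easynews": "Tue, 8 Sep 2009 13:26:28 -0400 (EDT)",
}


def rank_by_retention(headers):
    """Return provider names ordered from oldest to newest first article."""
    # parsedate_to_datetime understands RFC 2822 dates and ignores the
    # trailing "(EDT)" comment, yielding timezone-aware datetimes.
    parsed = {name: parsedate_to_datetime(value) for name, value in headers.items()}
    return sorted(parsed, key=parsed.get)


if __name__ == "__main__":
    for name in rank_by_retention(oldest_seen):
        print(name, oldest_seen[name])
```

This only ranks what each server chose to report; as the discussion that follows notes, a reported date may not reflect what the server actually holds.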
04:23 🔗 Coderjoe some providers might be doing a fake lower retention due to the tier the account is on
04:25 🔗 chronomex yeah, I've seen that
04:25 🔗 chronomex or heard of it at least
04:25 🔗 Coderjoe the funny thing is that usually, if you ask for an article by articleID, it still gives it to you, even if it is older than your date limit
04:26 🔗 chronomex hmm
04:26 🔗 Coderjoe so if you have really old headers or nzbs or such, you can still get the older posts
04:26 🔗 chronomex or replies
04:30 🔗 Coderjoe yeah... I think that newshosting group response is fake..
04:31 🔗 Coderjoe because that 2007 message I had earlier has newshosting as the last path element
04:31 🔗 Coderjoe i'll grab the article ID from the giganews one and see if I can get it from others
04:34 🔗 Coderjoe Message-ID: <telecom22.533.1@telecom-digest.org>
04:36 🔗 Coderjoe astraweb: no such article
04:42 🔗 Coderjoe easynews: no such article
04:43 🔗 Coderjoe newshosting: no such article
04:44 🔗 Coderjoe and I double-checked at giganews to make sure I didn't screw up the command
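Talking to a news server by hand, as described above, means sending NNTP commands and reading the numeric status lines. A rough sketch of the two pieces involved, based on RFC 3977 (a 211 reply to GROUP carries count, low and high article numbers; ARTICLE with a Message-ID answers 220 when found and 430 when not). It does no networking, and the helper names are hypothetical:

```python
def parse_group_response(line):
    """Split a '211 count low high group' reply into its fields."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError("not a successful GROUP response: %r" % line)
    return {
        "count": int(parts[1]),
        "low": int(parts[2]),    # lowest article number the server reports
        "high": int(parts[3]),   # highest article number the server reports
        "group": parts[4],
    }


def article_command(message_id):
    """Build the command that requests an article by its Message-ID."""
    if not (message_id.startswith("<") and message_id.endswith(">")):
        raise ValueError("Message-ID must be wrapped in angle brackets")
    return "ARTICLE %s\r\n" % message_id


def article_found(status_line):
    """Interpret the first line of the server's answer to ARTICLE."""
    code = status_line[:3]
    if code == "220":
        return True
    if code == "430":   # no article with that Message-ID
        return False
    raise ValueError("unexpected status line: %r" % status_line)
```

Fetching the article numbered `low` and reading its Date header is one way to arrive at figures like those quoted earlier; asking for a known Message-ID sidesteps whatever range the GROUP reply advertises.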
04:46 🔗 Coderjoe i wonder if any schools still run usenet servers, and what retention they might have
04:50 🔗 chronomex my school quit two years ago. they had kind of shit retention.
05:01 🔗 SketchCow http://blog.longnow.org/2011/09/08/the-archive-team/
05:01 🔗 SketchCow That is a HELL of an endorsement
05:01 🔗 Coderjoe oh
05:02 🔗 Coderjoe I have a bunch of shit I pulled down from divx' stage6 site before they went down
05:02 🔗 Coderjoe my stats from it are here: http://wegetsignal.org/stage6.php
05:03 🔗 chronomex whoa, long now?!?
05:04 🔗 Coderjoe (I don't think I had heard of archiveteam at that point)
05:05 🔗 Coderjoe I had scripts all written up and a centralized database and all that, and had 3 different systems on different networks doing the downloading work
05:05 🔗 Coderjoe tunneling mysql connections across ssh to talk to the db
05:05 🔗 chronomex nice
05:08 🔗 Coderjoe one tool just pulled video IDs from search result listing pages (and other similar listing pages)
05:08 🔗 Coderjoe another tool scraped metadata
05:09 🔗 Coderjoe and a third actually fetched the video file
05:34 🔗 SketchCow I just had someone *EXPLODE* at me over e-mail
05:35 🔗 SketchCow They took the absolute, complete, total misreading of the short thing I said in return to an excellent work they're doing. I mean, the absolute worst.
05:35 🔗 SketchCow And then went from there on a massive rant tear, up to and including telling me to step aside for my total disrespect and insult to their abilities, goals, and issues.
05:36 🔗 chronomex SketchCow: scripts, collection.
05:36 🔗 SketchCow I'm going to connect you with my boss and move it forward, OK
05:36 🔗 chronomex k
05:50 🔗 Wyatt Damn, that's the worst sort of misunderstanding.
07:06 🔗 SketchCow Yeah, I had to punt and send the mail to my superior as a do-over
07:06 🔗 SketchCow I withdrew
07:06 🔗 SketchCow There's no going back
07:15 🔗 ersi Uh, haha - wget is taking up 1.5GB memory
07:16 🔗 ersi SketchCow: Ouch.
07:17 🔗 SketchCow Even I know when I am beeting
07:17 🔗 SketchCow And beeting
07:17 🔗 SketchCow beaten
07:17 🔗 SketchCow When someone takes "Don't worry about this" to mean "You are incapable of understanding this", there's nowhere to go
07:17 🔗 chronomex wwwwwww
07:18 🔗 SketchCow It's like someone screaming at you for offering to take out the garbage, because you just implied they're incapable of it
07:18 🔗 chronomex some people are too fragile
07:18 🔗 ersi That's the sign of the "Abort mission! Abort! Abort!"
07:18 🔗 SketchCow He wants assurance that grabbing the magnetic image off a cray disk will be legally protected.
07:18 🔗 SketchCow Amount IA will discuss this: 0
07:19 🔗 SketchCow We're just not qualified and we don't have anyone qualified to
07:19 🔗 ersi I'd suggest him to call a whambulance
07:19 🔗 SketchCow But he wants to know he has some sort of gold medal saying he can do whatever
07:19 🔗 SketchCow He's worried cray will sue
07:20 🔗 SketchCow Coward
07:20 🔗 chronomex I bet cray is kind of quietly happy about this
07:20 🔗 Coderjoe ... over something on some really outdated hard drive?
07:20 🔗 SketchCow Yes
07:20 🔗 ersi It's free fuckin' PR
07:20 🔗 SketchCow Anyway, I punted, sent it to my superior, I am done with it
07:20 🔗 SketchCow I like to help, but not with divas
07:20 🔗 ersi All the geeks and nerds go "Jizzpants!" hearing about it
07:21 🔗 ersi beside the few who say "Cray who?"
07:21 🔗 Coderjoe it would be awesome if someone with hardware and filesystem knowledge would come forward and help
07:21 🔗 ersi I think I should have ran wget on another machine than my work machine >_>
07:22 🔗 chronomex that's what --continue is for
07:23 🔗 Coderjoe not sure --continue works well with the options to modify links in downloaded files
07:23 🔗 ersi I'll just let it run, but it's gonna be a bit laggy :D
07:23 🔗 Coderjoe that and the list of links to visit are the only things I can think of that would cause wget to eat so much ram
07:24 🔗 SketchCow Anyway, I was trying to give this nancyboy an out.
07:24 🔗 Coderjoe tar: memory exhausted
07:24 🔗 Coderjoe on a system with 4G of ram and very little else running
07:24 🔗 ersi Coderjoe: Party
07:24 🔗 ersi No swap?
07:25 🔗 Coderjoe i didn't have swap at that time. I just added 4G of swap and am trying again
07:25 🔗 ersi ah
07:25 🔗 Coderjoe 4G of ram? who needs swap? :D
07:25 🔗 ersi I got a little swap, even though I got 4GB memory
07:26 🔗 ersi I'll never let the installer choose again though, last machine I installed it auto set 12GB as swap
07:26 🔗 ersi fuckin retarded
07:26 🔗 Coderjoe i ran into another case where a broken autotools config caused automake to forkbomb and consume vast amounts of ram
07:56 🔗 Coderjoe mmm
07:57 🔗 Coderjoe expected raw tar file size: 1452086937600 bytes
08:25 🔗 Coderjoe and tar is up to 7.5G without having output anything yet
08:26 🔗 chronomex that's kind of a big tarfile
08:26 🔗 chronomex how are you making it?
08:26 🔗 Coderjoe how do you mean, how am I making it?
08:27 🔗 chronomex paste your tar commandline?
08:27 🔗 chronomex tar shouldn't be using that much ram
08:27 🔗 Coderjoe tar cf - --numeric-owner --no-recursion --totals -T 14346.LRWBU
08:27 🔗 Coderjoe and the -T file is 4G
08:27 🔗 chronomex ah
08:27 🔗 chronomex nevermind
09:16 🔗 Coderjoe sweet
09:16 🔗 Coderjoe 4GB isn't enough swap
09:16 🔗 ersi Heh
09:35 🔗 inv josephhol: shamir's secret sharing algorithm - check it out
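For anyone unfamiliar with the scheme inv mentions: Shamir's secret sharing hides a secret as the constant term of a random polynomial and hands out points on it, so that any `threshold` points recover the secret while fewer reveal nothing. A toy sketch, assuming the secret is an integer smaller than the chosen prime; the prime, the function names and the parameters are illustrative choices, and this is not audited code for real keys:

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; all arithmetic is done modulo this


def split_secret(secret, threshold, shares):
    """Return `shares` points; any `threshold` of them recover `secret`."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in range [0, PRIME)")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    points = []
    for x in range(1, shares + 1):
        y = 0
        for coefficient in reversed(coefficients):  # Horner evaluation
            y = (y * x + coefficient) % PRIME
        points.append((x, y))
    return points


def recover_secret(points):
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                numerator = (numerator * -xj) % PRIME
                denominator = (denominator * (xi - xj)) % PRIME
        # Fermat's little theorem gives the modular inverse for a prime modulus.
        inverse = pow(denominator, PRIME - 2, PRIME)
        secret = (secret + yi * numerator * inverse) % PRIME
    return secret
```

In the next-of-kin setting discussed earlier, the secret could be a passphrase encoded as a number, with shares given to different relatives so that no single person can open the archive alone.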
09:40 🔗 josephwdy Good morning, how is everyone doing this fine day ?
12:37 🔗 ersi Umm.. I'm up at 1.8GB RAM on wget
12:37 🔗 ersi I fear coming back to a dead machine on monday :D
12:47 🔗 SpaceCore ersi: whee!
12:48 🔗 kin37ik hullo
12:54 🔗 ersi Or maybe my disks will be full of instructables >_>
12:55 🔗 * kin37ik goes back to poking fortunecity
12:59 🔗 ersi Hm, lots of small files - I'm "only" at 7.3GB instructables so far
12:59 🔗 kin37ik have any estimate as to how big instructable sis?
13:00 🔗 kin37ik instructables is*
13:00 🔗 ersi I have no idea what so ever
13:03 🔗 kin37ik hmm
13:04 🔗 ersi We'll see.
13:14 🔗 Schbirid http://www.archiveteam.org/index.php?title=Projects#Dead_Projects "EmuWiki.com Complete Emulators Collection v0.2 [All platforms]" is at underground-gamer.com
13:17 🔗 kin37ik is it a package or are they using the actual wiki but updated? if it's a package itll be worth my time grabbing it
13:19 🔗 Schbirid it's a 13gb torrent
13:20 🔗 kin37ik link?
13:20 🔗 Schbirid http://www.underground-gamer.com/details.php?id=40311
13:21 🔗 Schbirid you would need an account though
13:21 🔗 kin37ik i cant sign up
13:21 🔗 kin37ik max user acc's
13:21 🔗 kin37ik ill keep that link bookmarked though
13:22 🔗 Schbirid http://pastebin.com/1Rkn4Ev4
13:22 🔗 Schbirid nah, if you want i will grab and give you a http link
13:24 🔗 kin37ik i just need the torrent file really
13:24 🔗 kin37ik unless you got an invite to UG?
13:25 🔗 Cameron_D magnet link me
13:26 🔗 Schbirid the torrent would run under my account
13:26 🔗 Schbirid which is a no-no;)
13:27 🔗 Cameron_D But the magnet link is an ID to the torrent and can't be identified with you
13:28 🔗 Schbirid it's an account based torrent tracker, this would not work
13:29 🔗 kin37ik demonoid.me is account based, and still
13:29 🔗 kin37ik anyone can just grab whatever and it wont log under that persons account
13:29 🔗 Cameron_D magnet link = DHT
13:30 🔗 Schbirid i cant believe you guys do not know how these work :P
13:31 🔗 Cameron_D throw me the magnet link and we shall see
13:31 🔗 ersi kin37ik: Have you never used a real private tracker?
13:31 🔗 ersi Aw man
13:31 🔗 kin37ik uhm
13:31 🔗 kin37ik dont know lol
13:32 🔗 ersi Heard about ratios?
13:32 🔗 kin37ik yep
13:32 🔗 ersi Private trackers are strict about only ONE user using each account, and keeping ratios good
13:32 🔗 kin37ik mmmm S:
13:33 🔗 Cowering I think I have 10T credits on UG :)
13:33 🔗 ersi So you either get an account and download it from the members, or have a member download it for you - putting it somewhere :P
13:33 🔗 kin37ik well i cant create an account till the accounts are pruned, according to the site
13:33 🔗 ersi If you get an invite from a current member, that's another way in usually
13:34 🔗 Schbirid and those torrents are non public, no dht
13:34 🔗 Cowering blame it on the retro mafia
13:35 🔗 Cowering anyone know what 'internal error DISC0272' on an HP Touchpad is? Seems I can't archive EVERY app before HP kills it off
13:35 🔗 kin37ik google it?
13:36 🔗 Cowering no hits
13:36 🔗 Schbirid that unit is bad, send it to me
13:36 🔗 kin37ik what? cant be right
13:37 🔗 Cowering i'm well over 1000 games downloaded, and many things fail at that level
13:37 🔗 Cowering quickoffice takes 4 minutes to load, since it wants to index every little .txt file in every app folder.. which is just wrong
13:38 🔗 Cowering can't seem to find where HP stored the first two issues of their online 'Pivot' mag either.. so those might be gone forever
13:38 🔗 kin37ik then do a system search for them?
13:38 🔗 Cowering sept issue has 3 free app promos hidden on page 26 in case anyone cares :)
13:39 🔗 Cowering i'm stupid, somehow i can't get TP to see my genned SSH keys, so no shell, thus no 'find'
13:39 🔗 Cowering these older apps will be useful later.. all the newer 'updates' have phone home crap in them to enable adware
13:41 🔗 Cowering only had 2 hard failures so far.. the KQED radio app and some game called 'J@cker'
13:42 🔗 kin37ik guess no-clobber doesnt like me today...
13:45 🔗 kin37ik and im still trying to piece together fortunecity's directory structure so i can poke it a bit more efficiently
13:50 🔗 alard kin37ik: I've got a list of 400,000 fortunecity urls if you want them.
13:50 🔗 kin37ik alard: cheers, send them my way (:
13:50 🔗 alard (I've been playing with Google a little bit.)
13:51 🔗 kin37ik alard: awsome, send them my way, pastebin or something?
13:52 🔗 kin37ik id better move wget to my secondary drive lol
13:52 🔗 alard I'll have a look. First I've got to get them out of Redis.
13:52 🔗 kin37ik no worries
13:52 🔗 Schbirid are there google scrapers for result pages?
13:53 🔗 alard I've written my own scrape that asks Redis for a word (from a set of dictionary words), searches on Google and extracts the urls, adds them to another set on Redis.
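The loop alard describes (take a word from one set, search for it, pull out the URLs, add them to a second set) can be sketched as follows. alard's actual code is not shown in the log, so everything here is an assumption: plain Python sets stand in for the Redis sets (where SPOP and SADD would play the same roles), `search` is a caller-supplied function returning the result page as text, and the host filter is a crude substring check:

```python
import re

# Rough pattern for absolute URLs inside HTML or plain text.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def harvest_urls(words, found_urls, search, host="fortunecity.com"):
    """Drain `words`, searching each one and collecting URLs on `host`."""
    while words:
        word = words.pop()        # like SPOP: take an arbitrary word out of the set
        page = search(word)       # hypothetical: returns the result page as a string
        for url in URL_PATTERN.findall(page):
            if host in url:
                found_urls.add(url)   # like SADD: duplicates collapse automatically
    return found_urls
```

Keeping both collections as sets means a word is never searched twice and a URL found under several words is stored once.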
13:54 🔗 kin37ik awsome
13:56 🔗 kin37ik geez i wish my nose would stop running like a tap and making my face burn when i want to sneeze
14:07 🔗 kin37ik uh oh
14:09 🔗 Schbirid nice
14:09 🔗 kin37ik woah
14:10 🔗 kin37ik that is alot of urls
14:10 🔗 kin37ik i have alot of poking to do this week
14:15 🔗 kin37ik ahaaa, so they did keep those sites
14:15 🔗 kin37ik alard: you sir, are a legend!
14:46 🔗 blue_ that blackout was strange times :/ hope i didn't miss anything
14:48 🔗 kin37ik is there certain guidelines you have to follow when adding to a page on the archive team wiki?
15:03 🔗 kin37ik well, this is an interesting find
15:04 🔗 alard kin37ik: Not that I know of. You just add a page and if it looks like spam it will be blocked later.
15:04 🔗 kin37ik alard: eh?
15:04 🔗 alard Perhaps it's useful to copy the project panel from another project's page, if it is about a project. E.g. http://www.archiveteam.org/index.php?title=MobileMe
15:04 🔗 kin37ik ahh
15:04 🔗 alard The wiki.
15:05 🔗 kin37ik ahh right
15:05 🔗 kin37ik head was in a different place
15:09 🔗 kin37ik was thinking a bit too much on these url's i think
15:11 🔗 alard Ah, I see.
15:12 🔗 kin37ik yeah, i did a little bit of googling with some urls
15:12 🔗 kin37ik turns out that pages from the original website back from 96 and 97 still exist
15:12 🔗 kin37ik though half of them return 404's i presume from people either buying a domain or just wiping out the contents
15:40 🔗 kin37ik goddamn
15:53 🔗 kin37ik enough poking for tonight, time for bed, laters
17:21 🔗 SketchCow G'morning
17:31 🔗 sep332 hey sketchcow, get my email? just checking
17:31 🔗 SketchCow Yes
17:31 🔗 SketchCow Shoving through things today
17:31 🔗 sep332 ok
17:34 🔗 SketchCow Today I set up an MRTG graph. I haven't done that in years.
17:35 🔗 SketchCow Probably since 2001.
17:35 🔗 SketchCow http://batcave.textfiles.com/ocrcount/
18:06 🔗 Schbirid SketchCow: watched your defcon talk earlier, you keep being an inspiration!
18:07 🔗 SketchCow I like talking!
18:08 🔗 Schbirid I like watching you talk!
18:09 🔗 SketchCow I have a presentation on the 30th at Derbycon
18:15 🔗 db48x huh
18:15 🔗 db48x emacs actually crashed
18:44 🔗 josephwdy SketchCow: ?
18:45 🔗 josephwdy I have been playing with Khan Academy for a little bit, do you have any ideas on how to teach history ?
18:45 🔗 Schbirid lesson 1: you will be taught the history of the winners
18:46 🔗 SketchCow lesson 2: The winners are assholes
18:46 🔗 SketchCow This is quite a question.
19:14 🔗 josephwdy Well have you ever thought about it ?
19:16 🔗 SketchCow I teach people a lot and certainly create ways to teach people. Khan Academy is just another platform, one for video, that partially takes video from other sources and repackages it.
19:19 🔗 josephwdy What do you mean by "takes video from other sources", do you mean when he does a series based off practice test stuff(gmat, sat, etc) ?
19:24 🔗 SketchCow http://www.khanacademy.org/video/salman-khan-talk-at-ted-2011--from-ted-com?playlist=Khan%20Academy-Related%20Talks%20and%20Interviews
19:24 🔗 SketchCow I mean like he takes TED videos, and puts them up.
19:24 🔗 Schbirid "Given that the mean length of a year is 365.2425 days, Office 365 only needs to maintain 99.93% uptime to stay true to its name."
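The arithmetic behind the quoted joke is easy to check: being up 365 days out of a mean year of 365.2425 days. A small sketch, with variable names chosen only for illustration:

```python
MEAN_YEAR_DAYS = 365.2425  # mean year length quoted in the message above

# Fraction of the year covered if the service is up for exactly 365 days.
uptime_percent = 100 * 365 / MEAN_YEAR_DAYS

# The leftover part of the year, expressed in hours.
downtime_hours = (MEAN_YEAR_DAYS - 365) * 24

print(f"{uptime_percent:.2f}% uptime, {downtime_hours:.2f} hours down")
# prints "99.93% uptime, 5.82 hours down"
```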
19:25 🔗 emijrp Is that Bill Gates?
19:25 🔗 SketchCow Likely.
19:25 🔗 josephwdy yeah, he talks with Sal at the end of his talk.
19:26 🔗 josephwdy I do find it strange he has playlist of all his talks and media related stuff on the khan academy site.
20:56 🔗 sep332 what's with the topic? someone mess with media mail?
21:00 🔗 Aranje <3 media mail
21:00 🔗 Aranje I guess here is the perfect place to ask
21:00 🔗 Aranje Is there a good way to diff two directories?
21:00 🔗 Aranje two very large directories, that is
21:01 🔗 Aranje I know there is at least 65% commonality between them and I want to deduplicate
21:01 🔗 sep332 maybe just rsync?
21:01 🔗 Aranje by hand if I must, by automation if I can
21:01 🔗 Aranje eh?
21:01 🔗 chronomex rsync from one into the other, delete original
21:02 🔗 Aranje mmm
21:02 🔗 Aranje but
21:02 🔗 Aranje hmm
21:02 🔗 Aranje I'm not sure that accomplishes what I want
21:02 🔗 chronomex I'm not sure what you want
21:02 🔗 Aranje ideally, I'd be moving things that weren't duplicated into its own folder
21:03 🔗 chronomex you could probably hash some script out using md5sum and symlinks
21:03 🔗 Aranje so I'd end up with: 1 folder with duplicated content and 1 folder with content only present in one of the previous two folders
21:05 🔗 sep332 copy them together with a tool that auto-renames duplicates, like "name-2"
21:05 🔗 sep332 then search for the "*-2" and move them somewhere?
21:05 🔗 DFJustin rsync --dry-run maybe
21:06 🔗 chronomex ^
21:06 🔗 Aranje I guess I'll have to figure out how to use rsync then :>
21:06 🔗 DFJustin protip: back up data before learning how to use rsync
21:07 🔗 Aranje haha alright
21:08 🔗 Aranje I've got a backup of the smaller folder, but the larger of the two has only a single copy
21:09 🔗 sep332 of course, the bigger dataset is easier to lose :)
21:09 🔗 Aranje yup
21:14 🔗 Aranje -c will be useful
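The split Aranje is after (files whose content exists in both trees versus files present in only one) can also be computed without moving anything, in the spirit of the dry-run advice above. A sketch that compares by content hash, the same idea as rsync's `-c` option; it swaps in SHA-256 where chronomex suggested md5sum, and the function names are made up for illustration:

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Map each content digest to the list of paths under `root` that have it."""
    index = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            index.setdefault(file_digest(path), []).append(path)
    return index


def compare_trees(root_a, root_b):
    """Return (duplicated, unique) path lists; nothing is moved or deleted."""
    index_a, index_b = index_tree(root_a), index_tree(root_b)
    in_both = index_a.keys() & index_b.keys()
    duplicated = sorted(p for d in in_both for p in index_a[d] + index_b[d])
    unique = sorted(
        p
        for index in (index_a, index_b)
        for d, paths in index.items()
        if d not in in_both
        for p in paths
    )
    return duplicated, unique
```

Because it matches on content rather than file name, a file that was renamed in one tree still counts as duplicated. Reviewing the two lists before acting on them keeps the single-copy folder safe.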
21:35 🔗 Aranje ohhh yeahhhhhh, rsync is my friday night
21:37 🔗 db48x yea, rsync rocks
21:42 🔗 db48x it's surprising how clever the algorithm is
21:42 🔗 db48x I was reading about it a few weeks ago
22:20 🔗 human39 hey all. I'm working on my mirroring script. Wonder if anybody has some feedback on it. https://github.com/human39/scruffy/blob/master/scruffy.pl (feel free to fork, muck and push!)
22:51 🔗 Coderjoe human39: http://twitter.com/geekmire/status/18108572789379072 http://twitter.com/geekmire/status/18216874642767872 http://twitter.com/geekmire/status/80495403572789248
