[00:29] episode 4x07 is up of hak5 [01:04] how's this look? http://pastebin.com/BwEgbDT1 [01:13] how about some line breaks [01:16] I was told one giant line of text is what was wanted, so that's how I re-formatted it [01:17] if you've got an example or a sample I can look at, that would be appreciated [01:17] Almost got it. [01:17] I don't literally mean HEADER: [01:18] I want it like this, no line reaks [01:18] But HEADER isn't needed, page numbers not needed. [01:27] how's this: http://pastebin.com/w85muBcN ? [01:33] Big K Magazine, April 1984. Contents. Games Programs: ROCKET for VIC 20, BOMB RUN for ORIC, DEMON DRIVER for COMMODORE 64, DOWN FALL for BBC Model 8, ESCAPE for SPECTRUM. SOFTWARE REVIEWS: Charlie Nicholas reviews for us. HARDWARE: Wonderful Widgets, Brilliant Bodges- A Cheapo Epro, Goad Your Code the 6502 Way, Squaring Up- Atari v. Acorn 91. FEATURES: Do you Sincerely Want to be Rich? ... [01:33] See what I did there? [01:33] okay- got it [01:34] I guess Games Programs should be GAMES PROGRAMS: [01:38] What about the free-standing bits at the end? [01:42] Don't go crazy [01:42] okay then- http://pastebin.com/FDBHL2MX [01:43] Yes. [01:43] proceed [01:44] move on to the next one? or did you want more stuff captured from the contents page? [01:45] I mean go on, do it all [01:45] The style is good [01:45] Do this issue, and move on to next issue, this will be good. [01:45] okay [02:16] did you also want the boilerplate at the bottom of the page? [02:23] No [02:26] okay- think I got the first one done then: http://pastebin.com/79nPpTLk [02:32] I showed your post about the pirate radio archive to a guy I know online who's into it, and he pointed me to http://radio-airchecks.nl/ , which he says has 500 GB at least of pirate radio recordings [02:42] just watch a clip of glenn beck on black tom explosion [02:42] http://en.wikipedia.org/wiki/Black_Tom_explosion [04:56] Well, even though it's taking a billion years, I am uploading those 80 Microcomputers. 
[05:20] Just blew in 61 issues of Commodore Format magazine, which was dedicated to the Commodore 64. [05:20] http://www.archive.org/details/commodore-format-magazine [05:20] Should be ready for perusal in an hour or so. [05:50] just thought I say hi and let you know I haven't forgotten you. One day... in some way... you will be repaid. Dishonesty is too nice a word for you... Have a happy holiday season... NOT. [05:50] My e-mail is awesome [05:50] This is a guy whose hard drive I still have [05:50] Some of you have noticed my slow turnaround [05:50] He turned abusive [05:51] Not surprisingly, I think you'll understand, his turnaround became slower [05:51] So we're now in Zeno's paradox [05:54] SketchCow: i read about your distriwiki [05:54] couldn't that be done using a git/mercurial like vcs [05:57] It could be done a ton of ways. [05:57] It'd be a module. [05:57] i think my linux source dvd will be of some use [05:58] i have a full distro that will be able to rebuild itself [05:58] websites, source tarballs (recompressed to .tar.lzma to save space), and repos of projects [06:01] i also use dokuwiki cause slitaz doc website does [06:02] dokuwiki puts all docs in plain text :-D [06:02] the history is a problem when the users of changed history don't exist though [06:04] Not true [06:04] It means there has to be a shared update process [06:04] And it means you will have race conditions [06:05] And those are all problems that need fixing. [06:05] ok [06:06] but all changes should be able to be reversed anytime? [06:06] i only think git or mercurial cause you can just branch off the master/default branch [06:26] not often you see SketchCow talking with himself [06:29] So.. this is the fifth day of my instructables.com mirroring [07:07] It's a big one. [07:07] indeed [07:19] It's growing slowly.. I'm up at 28GB now [07:20] that's 5-6GB/day [07:21] someone here had downloaded 40GB ("once upon a time")..
I'm atleast 1-3 days away from that [07:48] http://vimeo.com/28976327 [07:50] Ooh [07:52] Hm, I'd do either 6502 or Tape - but Arcade would be interesting as well, even though that's atleast a little covered [07:53] SketchCow: interesting video. I thought the tense music was odd though [07:53] I laughed at the tagline for the Tape documentary though, which is good :) [07:53] It added atmosphere [07:54] This whole thing is completely off-kilter. [07:55] yea, the atmosphere felt wrong somehow [07:56] i liked it [07:56] felt, human. [07:58] It is intentionally wrong. [07:58] You won't forget it soon, will you. [08:01] Ha, strategic [08:02] The whole thing is strategic. [08:02] It appeals to a certain kind of person. [08:02] A person who would give me hundreds of dollars and not see a thing for years. [08:03] That's not reddit people. [08:03] :) [08:03] It also gets weirder the more times you play it. [08:04] heh [08:06] Doesn't it. [08:07] It does. [08:09] You know those guys who make something filmy and then run around showing you their stuff and watching you and quizzing you on what you think? [08:09] I ain't one of those guys. [08:10] But I will say, I accounted for the liking two out of three. [08:10] You can invest with premiums in two [08:11] or you can invest in all three for slightly less than normal all three. [08:11] ah, I was mislead by the "beta" designation [08:12] Well, I like reactions. [08:12] But I don't seek it out. [08:13] Either people will invest, and I'll hit my goal, or they won't. [08:13] And then I merely have to archive forever [08:13] heh [08:14] Beta is merely my worry of it not rendering., [08:14] ah [08:15] so which of the three would you prefer to do? 
[08:16] All [08:16] otherwise he would have promoted only one :) (I think) [08:17] ha ha [08:17] Uploaded it to kickstarter page (preview) [08:17] I just love it [08:17] What a weird video [08:23] * db48x2 yawns [09:07] hrm [09:07] I can't find my book of stamps [10:09] I had it here somewhere not even a year ago [10:41] finally, my cap shall refresh at midnight! [10:42] heh [10:44] gah! i hate this stupid shake thing in windows7 [10:45] shake thing? [10:45] yeah, where you grab the window panel [10:45] you shake it left and right like twice and it minimizes all the windows except the one your dragging [10:47] oh, right [10:47] why do you hate it? [10:47] because i have 3 monitors, so, when i drag something across, it assumes im doing the shake thing [10:47] and minimizes everything when i like ot have it up where it is [10:48] huh, I don't have that problem with my three monitors [10:48] you using eyefinity though? [10:48] hmm, not at the moment [10:48] ahh, se eim using eyefinity [10:48] see im* [10:50] lol [10:50] I turned on eyefinity and it's got my monitors arranged vertically [10:52] you should be able to drga the monitors around on the plotter thing to put them right lol [10:52] no, I had to disable eyefinity and set it up again [10:52] O.o [10:52] then it let me choose between 1x3 and 3x1 [10:53] ahh yeah [10:53] ive done 5 monitor setups with eyefinity lol [10:54] ok, now I've got it set up "right" [10:54] awsome! 
[10:54] dragging windows around doesn't trigger the shake gesture though [10:54] even across monitor boundaries [10:54] for mine it does oddly enough [10:54] im not sure how i can turn it off [10:54] weird [10:55] though im using an XFX card and sometimes i wonder how shoddy the drivers are [10:56] already got a problem with my display port because of the cards bios and they refused to give me an updated bios [10:57] fun [10:57] well, I have to go back to my old settings [10:57] eyefinity is ok for games, but terrible for a normal windows desktop [10:57] and also my monitors are not all the same resolution [10:59] there, back to normal [11:00] well, except that all my windows are on the wrong displays :) [11:00] Arg, OCD overload [11:00] Windows are not allowed to move [11:00] :) [11:01] they should all be maximized [11:01] or at least almost all of them should be maximized [11:02] main screen has my main program maximised and my 2nd screen has IRC and whatever other chat windows laid out in a way where I can see most of them [11:27] SketchCow: nice shortened url in your pitch vid :P [11:28] the page doesn't appear to be up yet. are you proposing all three of those or letting people choose one? [11:29] because, fuck, I want all three [11:42] random question: how do you store your archives? warc? arc? tgz of directory? 
[11:43] I store them in a gigantic .derp [11:43] all files appended after each other [11:44] do you have a .herp file with the offsets [11:48] lol [11:48] No, that's not derpy or herpy at all [11:49] :( [11:49] just I notice from http://www.archiveteam.org/index.php?title=Wget_with_WARC_output you're using the old version of warctools [11:50] (which is well, unpleasant) [11:50] I just wget, without WARC [11:50] ah ok [11:50] it is just I am the person who is writing the new one [11:50] also, you're free to uphax the code for warc support [11:51] alard wrote that warc support a little while ago [11:51] well the problem is that we use python now instead of C, so it isn't as easily hacked into wget [11:52] but this is why I was asking about warc files [11:52] we who what? [11:52] oh [11:52] and are you saying WARC changes frequently? [11:52] I work for the company that wrote warc-tools (the c lib on google code) [11:53] I'll take a python WARC library [11:53] We no longer use or maintain it, and we're currently using a python library instead [11:53] ah-ha [11:53] sorry, yeah I should have owned up to that earlier [11:55] Cameron_D: I could probably knock up a wget like script that uses it [11:55] Cameron_D: we use it in production but heh, my attention has been on the bits that use the library rather than the library itself [11:55] Hi all. [11:55] but I have time alloted to deal with support issues for it [11:55] http://code.hanzoarchives.com/warc-tools/overview [11:55] o/ alard [11:55] tef: yes, wget-warc uses the old c version (which seems to work pretty well). [11:59] meh, as long as it doesn't produce unusable WARC archives [11:59] not so far [11:59] (heh) [11:59] but it turns out lots of warcs are a bit special [12:00] oh? 
[12:00] I found one with unix line separators rather than crlf, and gzipped as a whole rather than each record gzipped [12:00] and there are a bunch of pre 1.0 ones floating around [12:00] that's always fun [12:01] tef, can't look through the code at the moment, but is there a usage example of sorts? [12:01] Cameron_D: there are some scripts in the repo for opening/reading warcs and arc2warc conversion [12:02] apologies for the lack of documentation. we're a small company and we're a little rushed off our feet at the moment [12:03] it's ok [12:03] i'll see if I can merge in a python wget example to it [12:03] I have some code knocking around for that that doesn't use wget https://github.com/tef/codesamples/tree/master/pyget [12:04] Mmmmh, 8.4GB memory usage from wget [12:04] doesn't use warctools either [12:04] tef, thanks, I'll take a look [12:04] Cameron_D: if you have any questions about warctools email me directly at thomas.figg@hanzoarchives.com [12:05] hmm, i got a question about Wget actually [12:05] I have *some* time allotted to deal with support/features [12:05] about wget, or wget and warc? [12:05] just wget on its own [12:05] what i want to know is [12:05] say if im poking a url, for example www.example.com/millenium/0001/ and it has a number hierarchy for directories right [12:06] Fortunecity? :) [12:06] is there a script i can use to incrementally increase the number to a specified limit and stop when it hits the number or? [12:06] yeah [12:06] bash [12:06] bash would be your friend, yes [12:07] www.example.com/millenium/{0001..9999}/ should do the trick. though it'll be longer than the maximum command line length [12:07] SketchCow: that last shot in the kickstarter vid is kinda awkward [12:07] for i in `seq ... ...`; do ....; done [12:07] db48x2: cheers mate (: [12:09] yw [12:09] ersi: btw, which features of wget are the most useful to you?
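(Editorial aside on the numbered-directory question above.) Brace expansion or a `seq` loop does the job in bash, but as noted the expanded list can exceed the command-line length limit. One way around that, sketched here in Python, is to write the URLs to a file and hand that file to wget with `-i`. The base URL is the example from the chat; the function names, the zero-padding width and the output filename are assumptions made for illustration.

```python
def numbered_urls(base, start, stop, width=4):
    """Yield base/0001/, base/0002/, ... up to and including `stop`."""
    root = base.rstrip("/")
    for n in range(start, stop + 1):
        # Zero-pad to `width` digits so 1 becomes 0001.
        yield f"{root}/{n:0{width}d}/"


def write_url_list(path, urls):
    """Write one URL per line and return how many were written."""
    count = 0
    with open(path, "w") as fh:
        for url in urls:
            fh.write(url + "\n")
            count += 1
    return count
```

Calling `write_url_list("urls.txt", numbered_urls("http://www.example.com/millenium", 1, 9999))` and then running `wget -i urls.txt` keeps the list out of the shell's argument vector entirely.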
[12:10] a compress option i think is need for wget-warc [12:10] (my boss is happy for me to make a simplified wget example for the new warctools) [12:10] only cause right now it only saves to as gzip/tar.gz [12:11] godane: there is one [12:11] per record compression ? [12:11] tef: the regular mirror switch, convert links to local ones and keep original (-kK) http/ftp support [12:11] cool [12:11] godane: --no-warc-compression [12:11] --content-disposition is crucial for me [12:12] --random-wait -EkKp --protocol-directories -np --follow-ftp [12:12] was talking about about changing the it to bz2 or .lzma [12:13] gzip is pretty de-facto for warcs [12:13] ok [12:13] oh, and --user-agent, but that's easy to do [12:14] thought it would be nice to add lzma so you can save more space [12:14] this is really helpful [12:15] ersi: most stuff does re-writing after creating warc files [12:15] the idea being the warc record being an exact snapshot of the wire traffic [12:15] (near enough) [12:15] tef: honestly I think it'd be easier to integrate the python library into wget [12:15] tef: that reminds me [12:15] hmm [12:16] tef: alard was working on a way to feed a warc into wget and have wget output the set of mirrored directories [12:16] ah I see [12:16] warc unpacker [12:17] yea [12:17] it could do the -k (and -K) stuff [12:17] well [12:17] in a warc record [12:17] you can have request/response [12:18] as well as conversion records [12:18] yea [12:18] so the -K stuff would be writing those in some fashion [12:18] I had wanted to put conversion records into the warc [12:18] it's a bit tricky, so I haven't done it yet [12:18] (we strip transfer-chunked and content-encoding) [12:21] it seems most of the wget options you guys use are about unpacking/rewriting the content i.e -EkK [12:21] and a few for navigation i.e -p -np --follow-ftp [12:23] so there is less need to clone wget if wget can read from warcs via some method (i.e a proxy) [12:23] and generally wget's link traversing 
(mirroring) [12:26] i'm not sure what that means specifically - do you mean keeping an existing archive up to date? [12:27] or do you mean the scope of links that are checked [12:27] the scope of links fetched [12:27] ah [12:27] yea, wget does a lot of parsing of html and css [12:27] it seems to do the job well of going deeper [12:28] and when it's getting page recreciuits it's awesome [12:28] oops, what the hell happened there [12:28] page requisits(sp?) [12:28] to some extent I think it would be better trying to play to the strengths of being written in python - hackable, rather than aping wget entirely [12:28] i.e. scriptable for those more awkward things rather than having to resort to bash :-) [12:29] Nothing wrong with blunt tools, even though I like python [12:29] ;D [12:29] yeah but you have a very good blunt tool [12:30] ersi: thanks again for taking the time to explain this stuff [12:30] fwiw both I and my boss have a soft spot for the work archive team does so we'd like to help out where we can, esp re warctools [12:34] anyway, i'll shut up now and i'll talk again when i've got something to show for it [12:34] cheers [12:41] heh [12:41] you can talk whenever you like [12:45] tef: no prob of course :) [12:45] db48x2: I hate that thing where someone comes in with a driveby idea [12:46] :) [12:46] well, it's different from getting feedback when one is more likely to do something about it [12:46] and most of us here have a soft spot for IA, so WARC is close to our hearts [12:46] even if they... care about robots.txts [12:47] (I see why though, sucks getting blocked or not taken seriously) [12:47] Booya! One more bug/defect reported~ then I'll look extra productive [13:42] http://nationalmap.gov/historical/ [14:30] Schbirid: sweet [14:36] Bluh, ..instructables.. /keyword-iphone/keyword-easy/index.html [14:40] Schbirid: are you able to download any maps from that?
[14:41] the links in the 'Download GeoPDF' column all just point back to the search results [14:42] the map search works though [14:43] Morning. [14:44] hello SketchCow [14:45] sorry didnt try [14:45] downloading maps is very slow [14:45] 25kBps :) [14:46] it was announced today so surely a lot of traffic [14:46] yea [16:05] Hey, so I wrote the 80 micro archive that has some "offline" and asked for them [16:05] Less than 24 hours later, here they come. [16:14] http://www.abandonware-magazines.org/index.php [16:16] does anyone have a link to a pdf on an https server? [16:16] I have to test specifically that combination [16:16] That's quite a link, DFJustin [16:31] underscor: today would be a good day to fix your olduse.net shell box. (on Boing Boing) [16:36] underscor: oh, it works again, NM [18:08] http://kck.st/jasonscott [18:15] cool, will brute force spread :) [18:17] big jump from $10 to $100 [18:19] Yes. [18:24] Will these each be shorter/less comprehensive than BBS/Getlamp? [18:28] No. [18:29] wow [18:42] fucking scanner is misbehaving [18:42] today is an angry technology day [18:42] actually it's an angry chronomex day [18:43] SketchCow: radio silence, btw [18:45] grr technology [18:47] ALexis got very sick [18:47] She's in bed since Friday [18:48] oh dear [18:48] send her my regards [18:49] So you're not being ignored. [18:52] ok [18:54] I hope "RISE OF THE METADATA WARRIOR" will be recorded; the title is awesome [18:56] Who is ALexis ? [18:57] SketchCow's boss at IA [18:58] - Create a public, museum-like archive of 3D Porch in about 30 days. This museum will probably just have 500 or so photos. 
[18:58] damn [18:58] hello everybody [18:58] i dunno if it's significant enough for you guys but I'm just throwing it out there [18:58] it's a site where you can upload 3D photos [18:58] it's not very big/popular and I don't think it's been around very long [18:58] so uh, there's this site called 3D Porch which might be shutting down http://3dporch.com/ [18:58] 50,000 items, 7 files per [18:58] :| [18:58] I wonder how much it costs [18:59] Someone please grab it [19:00] gonna try to get in touch with him [19:00] grab first if it's that small [19:04] man some people do NOT know how to use 3D cameras [19:26] wish I could do higher than "project backer" [19:27] I'm surprised tape is more popular than arcade at the moment [19:28] i am glad :) [19:28] I'm on that one [19:30] It's a good and interesting way to get opinion. :) [19:31] Ok, off to broadway [19:31] seeing a musical for my birthday [19:31] happy birthday! have fun [19:40] I'd like Tape over Arcade as well [19:44] well, buy it now! [19:44] receive it in 4 years or so :) [19:46] There are 46425 photos on 3dporch, I think (the photos on the 'popular' lists). I have a list of the photo ids, downloading them now, unless someone else is doing that too. [19:47] each one is 7 files [19:47] + metadata if you care [19:47] 7? I have 6. [19:48] .jps [19:48] .left.jpg [19:48] .mpo [19:48] .redcyan.jpg [19:48] .right.jpg [19:48] .wiggle.gif [19:48] missing .sbs [19:48] oh nm [19:48] those just rearrange left and right [19:48] you're right [19:49] I am missing the wiggle.thumb [19:49] what kind of speed are you getting [19:49] 4MB/s [19:49] nice. should be quick work [19:49] It's amazon, so as fast as I can. [19:51] how did you get all of the IDs? [19:52] I grabbed the 'popular' pages and extracted the IDs. [19:53] So I only have the popular page stuff, but maybe that's all there is? [19:53] Everything is popular? 
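(Editorial aside on the per-photo file list above.) The suffixes named in the chat can be written down as a small table and used to check a download for gaps. This is only a sketch: it assumes each filename is the photo ID followed by the suffix, which the chat implies but does not state, and it leaves out the seventh file (the "wiggle.thumb") because its exact name is not given. `expected_files` and `missing_files` are hypothetical helpers.

```python
# Suffixes exactly as listed in the chat. The seventh file ("wiggle.thumb")
# is omitted because its exact suffix is not given there.
SUFFIXES = (
    ".jps",
    ".left.jpg",
    ".mpo",
    ".redcyan.jpg",
    ".right.jpg",
    ".wiggle.gif",
)


def expected_files(photo_id):
    """Filenames expected for one photo ID (assumed to be ID + suffix)."""
    return [photo_id + suffix for suffix in SUFFIXES]


def missing_files(photo_id, have):
    """Expected filenames that are not present in the iterable `have`."""
    present = set(have)
    return [name for name in expected_files(photo_id) if name not in present]
```

Run over a directory listing per ID, `missing_files` gives a quick list of what still needs fetching.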
[19:56] you can go to each type of camera [19:56] and cross check [19:56] hmm [19:56] looks like IDs are just [0-9a-z]{4} [19:56] caps also but yes [19:57] hahah [19:57] http://3dporch.com/b6gp [19:58] i have not yet encountered an uppercase letter in the ids [19:58] alard: I really wish I knew how to do all that :| [19:59] Coderjoe: go to nintendo 3ds there are a bunch on the first page [19:59] ah [19:59] i think my favorite is this one http://3dporch.com/4gro [20:00] [0-9A-Za-z]{4} is enough for 14.8M IDs [20:01] not seeing much in the way of metadata [20:01] hey, author replied [20:01] er, owner [20:01] I am really honored that you would pick my site to archive. [20:01] Right now all the images are hosted on S3, which doesn't provide a convenient gunzip tool. The total data is about 50GB. [20:01]
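(Editorial aside on the ID arithmetic above.) With 62 possible characters (digits, upper case, lower case) in each of 4 positions there are 62^4 = 14,776,336 candidate IDs, which is the 14.8M quoted in the chat; lowercase-and-digits only would be 36^4 = 1,679,616. The sketch below checks those numbers and pulls IDs out of page text. The link shape is taken from the URLs pasted in the chat; the function names are hypothetical, and nothing here is confirmed as the method actually used.

```python
import re
import string

# 10 digits + 26 upper + 26 lower = 62 characters.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
ID_LENGTH = 4

# Matches links shaped like http://3dporch.com/b6gp (exactly four characters).
ID_LINK = re.compile(r"3dporch\.com/([0-9A-Za-z]{4})\b")


def keyspace(alphabet_size, length=ID_LENGTH):
    """Number of distinct IDs of `length` characters over the alphabet."""
    return alphabet_size ** length


def extract_ids(text):
    """Unique IDs found in `text`, in order of first appearance."""
    return list(dict.fromkeys(ID_LINK.findall(text)))
```

At roughly 46,000 known photos the keyspace is almost entirely empty, so harvesting IDs from listing pages is far cheaper than probing all 14.8M candidates.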