#archiveteam 2012-12-03,Mon

Time Nickname Message
00:05 🔗 SketchCow poor city of heroes
00:24 🔗 balrog_ http://www.glitch.com/closing
00:25 🔗 balrog_ shutting down in 7 days...
04:55 🔗 tuankiet Why do I see two lines, one black and one blue, in the items/hour box?
09:40 🔗 Nemo_bis SketchCow: the most downloaded TGM is the one with empty ISO :p https://archive.org/details/cdrom-gamesmachinedvd-volume-14
09:43 🔗 chronomex paging dr godane
09:50 🔗 SketchCow That is a mess, huh.
10:01 🔗 godane hey all
10:01 🔗 godane going to bed
10:02 🔗 chronomex nite
10:06 🔗 godane i'm uploading some nintendo 64 promo tapes
10:06 🔗 godane nite
12:21 🔗 SketchCow Let me see about shoving more crap off of FOS into the world.
12:23 🔗 ersi Data wants to be free!
12:32 🔗 SketchCow 53 CD-ROMs about to dead drop.
12:32 🔗 SketchCow I have a program called bchunk that converts a .bin/.cue to an .iso
12:32 🔗 SketchCow About to run it on all these poor things.
12:39 🔗 SketchCow And after that, assuming it works, I'd like us to go through the cdrom collection and find all bin/cue and I can write something that grabs the bin/cue, makes an iso, and uploads the iso.
12:55 🔗 norbert79 SketchCow: Are you sure converting them to ISO instead of MDF is a good choice? ISO is for single-track images only, what if they have more tracks?
12:57 🔗 SketchCow Then it makes more.
12:58 🔗 SketchCow Regardless, the reason to convert them to .iso (AS WELL, I include the original .bin/cue) is that the archive.org viewer works with iso.
12:58 🔗 SketchCow I have both the original AND the convert.
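
A rough sketch of the batch step described above (find each .bin/.cue pair, run bchunk on it, keep the originals) might look like the following. It assumes bchunk is on PATH and is called in its usual `bchunk image.bin image.cue basename` form. The function names and directory layout are invented for illustration, the match on `*.cue` is case-sensitive, and the upload step is left out entirely.

```python
import subprocess
from pathlib import Path


def find_bin_cue_pairs(root):
    """Yield (bin_path, cue_path) for every .cue with a same-named .bin beside it."""
    for cue in sorted(Path(root).rglob("*.cue")):
        bin_path = cue.with_suffix(".bin")
        if bin_path.exists():
            yield bin_path, cue


def bchunk_command(bin_path, cue_path, wav_audio=False):
    """Build the bchunk argument list for one image.

    bchunk writes one output file per track, named <basename>01, <basename>02, ...
    """
    cmd = ["bchunk"]
    if wav_audio:
        cmd.append("-w")  # write audio tracks as .wav instead of .cdr
    basename = str(bin_path.with_suffix(""))
    return cmd + [str(bin_path), str(cue_path), basename]


def convert_all(root):
    """Run bchunk on every pair under root; the original .bin/.cue stay in place."""
    for bin_path, cue_path in find_bin_cue_pairs(root):
        subprocess.run(bchunk_command(bin_path, cue_path), check=True)
```

A multi-track image produces several output files from one pair, which matches the "Then it makes more" answer above.
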
13:01 🔗 SketchCow http://archive.org/details/gambler_cdrom_01 (First one!)
13:02 🔗 SketchCow http://ia601500.us.archive.org/isoview.php?iso=/20/items/gambler_cdrom_01/GAMBLER_01_1996_0901.iso therefore works.
13:02 🔗 SketchCow Now, please allow me to do this 53 more times.
13:09 🔗 SketchCow http://archive.org/details/cdrom_gambler
13:09 🔗 SketchCow Aaaand, they're appearing!
13:09 🔗 SketchCow Not bad, only takes a few minutes.
13:26 🔗 void__ SketchCow: the code of that isoview.php is free? is public somewhere?
13:29 🔗 SketchCow I assume so, no idea where though.
13:29 🔗 SketchCow It's just using tar.
13:31 🔗 void__ ok
13:48 🔗 SketchCow Hey, want to hear a pet peeve?
13:49 🔗 SketchCow When someone criticizes my shit, and puts a :) at the end.
13:49 🔗 SketchCow Want. to. dragon. punch.
14:07 🔗 underscor void__: It's basically just formatting the output of 7z l
14:07 🔗 underscor The code isn't very exciting, and is nearly all just archive-specific header spewing, etc
14:09 🔗 void__ underscor: 7z called with a system() or equivalent?
14:09 🔗 underscor Yes
14:09 🔗 void__ ok, thank you
14:09 🔗 underscor We wrap system calls in a bunch of layers of abstraction, but that's what ends up happening at the very end
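
As a loose illustration of "format the output of 7z l behind a system call", here is a minimal sketch. It is not the isoview.php code, which is not shown in this log. It assumes a `7z` binary on PATH and uses the `-slt` switch (my choice, not something stated above) because its "Key = Value" blocks are easier to parse than the default table. The function names are hypothetical.

```python
import subprocess


def parse_7z_slt(text):
    """Turn `7z l -slt` output into a list of dicts, one per archive entry."""
    # Entries follow the "----------" separator; each entry is a block of
    # "Key = Value" lines, and blocks are separated by a blank line.
    _, sep, body = text.partition("\n----------\n")
    if not sep:
        return []
    entries = []
    for block in body.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, eq, value = line.partition(" = ")
            if eq:
                fields[key.strip()] = value.strip()
        if fields:
            entries.append(fields)
    return entries


def list_archive(path):
    """Shell out to 7z and return the parsed listing."""
    result = subprocess.run(
        ["7z", "l", "-slt", str(path)],
        capture_output=True, text=True, check=True,
    )
    return parse_7z_slt(result.stdout)
```

Keeping the parser separate from the subprocess call means the formatting logic can be exercised on captured text without 7z installed.
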
15:13 🔗 DFJustin SketchCow: is https://archive.org/details/gambler_cdrom_32b supposed to have 10 cds on it
15:18 🔗 SketchCow They're tracks.
15:18 🔗 SketchCow This is what BCHUNK does.
15:21 🔗 DFJustin oic
15:26 🔗 DFJustin underscor: any chance of using lsar/unar instead of 7z? there are currently a lot of archives that confuse it for one reason or another
15:34 🔗 DFJustin and they understand .bin without needing all this tomfoolery
15:48 🔗 SketchCow I asked about this. There was a Reason.
15:48 🔗 SketchCow It was mostly related to not shaking up the poor cluster of machines to do this new thing.
15:48 🔗 SketchCow Until, I guess, a more powerful set happens AND has an opportunity to update.
16:02 🔗 underscor Yeah, basically what SketchCow said. Adding things to the workers won't happen until we roll out a new set
17:02 🔗 schbiridi is there some linux tool/script to nicely archive websites via their rss feeds?
18:11 🔗 obscure__ Hey sketch, you around?
18:46 🔗 DFJustin BCHUNK apparently supports .wav instead of .cdr output, might be more ia-friendly
18:58 🔗 schbiridi it does, i use that all the time
19:27 🔗 schbiridi just saw jason's tweet, http://mdf2iso.berlios.de is nice too
19:27 🔗 schbiridi i also have mdfextract installed but do not recall using it
21:29 🔗 riordan Know I'm probably late to the party on this, but is there any desire to archive The Daily?
21:31 🔗 chronomex the daily what?
21:32 🔗 riordan Newscorp's "Revolutionary" ipad-only newspaper
21:32 🔗 riordan Articles were posted to a web CMS but could only be accessed when sent from a subscriber
21:32 🔗 chronomex sure, why not
21:33 🔗 riordan Andy Baio started indexing it back in 2011 when it launched
21:33 🔗 chronomex The Daily is also the name of my alma mater's newspaper, which happens to be the oldest news periodical in the state
21:34 🔗 riordan hah
21:36 🔗 riordan http://waxy.org/2011/02/the_daily_indexed/
21:37 🔗 riordan Basically Steve Jobs made Murdoch fall in love with the iPad, convinced him it'd save the business model of newspapers. It didn't.
21:38 🔗 riordan But Murdoch hired some of the smartest writers and locked their content up so almost nobody was able to read it
21:38 🔗 balrog_ yup
21:39 🔗 riordan It'd be a shitty process; probably involve people with iPads subscribing or using the week free trial, grabbing as much of the backlog as possible, and sharing as many article links per issue as they could to an email address (or set of email accounts) that we could use to get the links to the stories and scrape them
21:40 🔗 riordan http://www.theatlantic.com/technology/archive/2012/12/3-theses-about-the-dailys-demise/265842/
21:40 🔗 ersi Grab anything you can
21:40 🔗 ersi If you can
21:40 🔗 balrog_ "It's also worth noting that Google's slowly indexing all the articles too, and search engines aren't blocked in their robots.txt file."
21:41 🔗 riordan balrog_: not anymore
21:41 🔗 riordan Their robots.txt now blocks robots
21:41 🔗 balrog_ riordan: they're still showing up in google
21:41 🔗 riordan rly?
21:42 🔗 balrog_ also, http://www.thedaily.com/robots.txt
21:42 🔗 balrog_ maybe they *once* had excluded ia_archiver?
21:42 🔗 riordan oh damn
21:42 🔗 riordan looks like they already pulled the app out
21:42 🔗 balrog_ that or they did "If you cannot put a robots.txt file up, read our exclusion policy. If you think it applies to you, send a request to us at info@archive.org."
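
The question of which crawlers a robots.txt lets in can be checked mechanically. The sketch below uses Python's standard `urllib.robotparser`. The sample rules are invented for illustration and are NOT the real contents of thedaily.com/robots.txt, and `allowed_agents` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def allowed_agents(robots_txt, url, agents):
    """Return {agent: bool}: may each user agent fetch url under these rules?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Illustrative rules only: block ia_archiver, allow everyone else.
SAMPLE = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""

if __name__ == "__main__":
    print(allowed_agents(
        SAMPLE,
        "http://www.thedaily.com/page/2012/",
        ["ia_archiver", "Googlebot"],
    ))
```

With rules shaped like the sample, pages can stay visible in Google while being excluded from the Wayback Machine, which is one way both observations above could be true at once.
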
21:42 🔗 riordan so it's all moot
21:42 🔗 balrog_ they're still google indexed at least
21:43 🔗 balrog_ site:thedaily.com inurl:page/2012 and site:thedaily.com inurl:page/2011
21:44 🔗 riordan For some reason I'm not getting anything from the google index for site:thedaily.com
21:45 🔗 balrog_ I get 42700 results
21:45 🔗 riordan hmmm
21:45 🔗 balrog_ with just the query "site:thedaily.com"
21:45 🔗 balrog_ (no quotes)
21:46 🔗 riordan hmmm - well, something's up with what google's serving to me then
21:46 🔗 balrog_ try in another browser
21:46 🔗 riordan got it
21:46 🔗 balrog_ ok
21:49 🔗 riordan and they've got their recent material in a sitemap
21:49 🔗 riordan http://www.thedaily.com/sitemap-news.xml.gz
21:49 🔗 balrog_ only the past month or so
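
A gzipped sitemap like that one is easy to turn into a URL list for a scraper. This is a minimal sketch, assuming the file follows the standard sitemaps.org schema. The function names are hypothetical, and only `<loc>` elements in the main sitemap namespace are collected.

```python
import gzip
import xml.etree.ElementTree as ET
from urllib.request import urlopen

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(gz_bytes):
    """Return every <loc> URL from a gzip-compressed sitemap."""
    root = ET.fromstring(gzip.decompress(gz_bytes))
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def fetch_sitemap_urls(url):
    """Download a .xml.gz sitemap and list its URLs (network access required)."""
    with urlopen(url) as resp:
        return sitemap_urls(resp.read())
```

As noted above, the sitemap only covers roughly the past month, so a list built this way would need to be combined with other sources for the back catalogue.
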
21:51 🔗 riordan Unlike most newspapers, which get sucked up into lexisnexis, this thing's probably going nowhere
21:52 🔗 balrog_ yeah I'm afraid so
21:53 🔗 riordan I've got a friend who knows their editor in chief - I'll see if I can pass a message along to ask if there's a plan of succession for content?
21:53 🔗 balrog_ it would be nice if the old content could at least be archived
21:53 🔗 riordan precisely
21:54 🔗 chronomex I like how the website is just images of text
21:54 🔗 chronomex that's A+ CMS right there
21:55 🔗 ersi AAA+
21:56 🔗 riordan a+++ business model. Would fail again
