[00:11] anyone here archive DeviantArt ?
[01:09] idea: archive team runs a url shortener
[01:09] that redirects you to goatse
[01:09] hahaha
[01:10] but only after an hour
[01:11] looking for some AOL info, I found this page: http://bryanmarsh.tripod.com/bryanmarshswebpage/id16.html
[02:20] xmc: actually i was thinking 24 hours
[02:21] no, that makes it too useful
[02:24] i guess
[02:24] Super-legit
[02:25] goaturl.com is available
[02:30] so is gÅt.ly
[02:33] daw... you can't have international characters in .ly domains :-(
[02:39] You can barely have international characters IN .ly
[06:06] https://vpsboard.com/topic/2879-36tb-raw-backup-server-for-cheap/ cc midas :P
[08:38] midas: did you end up finding a scanner
[08:41] well I came into this channel when searching for warc tools. Is this channel also for wayback related questions? :p
[08:41] or discussion..
[08:41] just want to clarify before I throw my questions in the room haha
[09:04] http://www.icanbarelydraw.com/comic/comics/2013-05-27-entire-wikipedia.png
[09:05] ratem: that depends
[09:05] people here don't _run_ the wayback, but they might have an answer to your question regardless
[09:05] :p
[09:29] wow
[09:30] this CD is damaged os badly
[09:30] lol
[09:30] that at first it was recognized as a blank CD
[09:46] ah okay. thx joepie91 ;)
[10:15] Taking a timeout of 2 minutes to protect the hardware.
[10:15] The drive is spinning for more than 30 minutes.
[10:16] ... did not expect that
[10:18] ..what happened?
[10:20] GLaDOS: rubyripper + cdparanoia
[10:20] apparently they're paranoid enough to make my drive/disc rest
[10:20] heh
[10:20] to avoid read errors
[10:20] hah
[10:21] working on ripping a pile of CDs from various sources atm
[10:21] I want 100% accurate (securely ripped) FLAC
[10:21] and I'm scanning/cropping covers also
[10:21] rubyripper + cdparanoia + scantools + gthumb = <3
[10:21] though I really do miss my LiDE scanner, the CDs are too reflective in this scanner :/
[10:22] most of these CDs have absolutely terrible music
[10:22] :P
[10:22] but hey, it's for a good cause
[10:24] * joepie91 is literally prying CDs off the glass plate..
[10:25] Some track(s) could NOT be corrected within the maximum amount of trials
[10:25] Track 20 could NOT be corrected completely
[10:25] :(
[12:03] okay, so I've done a little test/comparison
[12:03] got transparent document sleeve kind of things
[12:03] compared normal CD scan with scan of a CD in a document sleeve
[12:03] to see if it would be feasible to have a "CD holder" without loss of scan quality
[12:04] (to speed up scanning)
[12:04] currently uploading comparison...
[12:05] http://cryto.net/~joepie91/scan_comparison.png
[12:05] there
[12:05] right: normal scan
[12:05] left: scan in 'sleeve'
[12:05] if anything, it seems to even out the lighting more :P
[13:08] my new CD scanning sleeve: http://cryto.net/~joepie91/0019.png :D
[13:15] Interesting method :)
[13:31] norbert79: it speeds up things quite a bit, and lets me realign them before scanning
[13:31] without them floating all over the glass plate
[13:33] I just put my CD's into a scanner and then do the correction of rotations and small issues manually in GIMP
[13:38] norbert79; not feasible at this scale :P
[13:39] takes far too much time
[13:39] * joepie91 stares at a shelf full of CDs
[13:47] I wonder if I shall preserve that copy of the CD ROM I won on a Pepsi game containing one song from the Spice Girls...
[13:47] from what... 1999? 2000?
[13:47] or 2003?
[13:47] Can't recall
[13:48] it literally had only one song
[13:50] norbert79: gThumb is awesome for processing CD/cover scans, btw
[13:50] norbert79: yes!
[13:50] and I think I remember that CD
[13:53] norbert79: http://owely.com/3EUlI4
[13:53] I was soo happy and so disappointed afterwards
[13:53] that's the first batch :)
[13:53] lol
[13:53] don't ask me where those CDs come from, half of them come from my grandmother, the other half come from piles *somewhere*
[13:53] :P
[13:54] hah
[13:54] You have quite an impressive collection of old CD-ROMs considering your age
[13:54] Mine are mostly bought
[13:55] or won
[13:56] nice rack
[13:56] I mean CD-ROM collection
[13:57] i have about another 44 maximum pc cds/dvds to scan at some point
[13:58] also all but of 2006 lki (best computer games) is uploaded now
[13:58] lki 50 i think was remade into a iso
[13:58] rutracker has tons of just folder dumps of the discs
[13:59] MY CD's are mostly mixed tracks, so MDF is the only probably format for me
[13:59] My
[14:01] my premium playlist dump of abcnews is almost done
[14:10] this disc was so damaged that it literally froze my PC trying to mount it
[14:10] norbert79: lol
[14:10] norbert79: I have a lot of self-burnt stuff
[14:10] but also a good amount of budget bin games/movies
[14:10] including stuff that basically no online downloads exist of
[14:10] and that's unlikely to ever be published again
[14:10] :)
[14:11] but yeah
[14:11] best open format for imaging CDs/DVDs, possibly multi-session, possibly copy-protected?
[14:14] aw yeah
[14:14] court decided that Dutch ISPs don't have to block TPB!
[14:27] I am amazed at how long these CDs lasted while still being readable
[14:37] mount: wrong fs type, bad option, bad superblock on /dev/sr0, missing codepage or helper program, or other error.
In some cases useful info is found in syslog - try dmesg | tail or so
[14:37] * joepie91 puts on "to recover" stack
[14:39] hey look: https://archive.org/details/AndreRieu-StraussGala
[14:47] So anything special I need to add when uploading a cd?
[14:50] joepie91: i found one, but it's in use
[14:51] so not really able to send it
[14:59] https://scontent-b.xx.fbcdn.net/hphotos-prn1/t1/1549500_599386523475056_613022073_n.jpg
[14:59] it says: 20 hour free internet access!
[15:04] midas: :(
[15:04] Dud1: audio CD or CD-ROM?
[15:04] midas : reminds me of compuserv
[15:04] :)
[15:04] cd
[15:04] compuserve *
[15:05] Dud1: that... was not one of the possible options
[15:05] :P
[15:05] Sorry CD-ROM
[15:08] Dud1: for a single-session CD-ROM, an ISO image should be sufficient
[15:08] I'm not sure wrt multi-session CD-ROM, actually asked a similar question a bit earlier
[15:09] I meant in terms of uploading, is there anything specific I add?
[15:16] Dud1: not aside from what I said above, though you'll want to add a cover / disc scan if you have it
[15:16] :)
[15:16] unrelated, http://njw.me.uk/getxbook/
[15:16] getxbook is a collection of tools to download books from Google Books' "Book Preview" (getgbook), Amazon's "Look Inside the Book" (getabook) and Barnes & Noble's "Book Viewer" (getbnbook).
[15:25] http://www.youtube.com/watch?v=ZDXuPQ9ML9E
[15:26] its was something on Glenn beck show yesterday
[16:12] only 23k urls with my abcnews grab
[16:14] also plus side is i found more data on the old CBS Evening News podcast: https://web.archive.org/web/20070601000000*/http://www.cbsnews.com/stories/2007/01/29/podcast_eveningnews/main2407226.shtml
[16:14] this at least will give me links to parts that are missing mostly
[16:15] some links though don't have a file path
[16:15] august 31 and september 8 are example of that
[17:12] man, this weather in the Midwest US sucks
[17:12] why must it be so goddamn cold
[17:13] yet another fanfiction saved by ia
[17:14] it wasn't in the 2012 ff.net dump
[17:14] but i found it on another site
[17:15] in the archive
[17:15] because it is now erased from the web
[17:15] \o/
[17:16] look like 2008-2012 were a hard period on fanfic
[17:17] ~80-90% of the most interesting stories were deleted
[18:21] It was
[18:30] I wonder if it was an organized thing, or just a bunch of people got to a certain age
[20:18] I am starting to get why people are saying linux is better than windows. One file run on windows returns nothing, but the same file on linux returns the proper result...
[20:20] One file run?
[20:25] *I run a python one file to grab some information from a page.
[20:27] Python programs should work the same on Linux and Windows unless someone did something wrong
[20:28] if you paste it somewhere I might be able to point out the problem
[20:30] http://pastebin.com/th9B6T57
[20:32] that should definitely work the same
[20:32] how did you run it on Windows?
[20:32] python printtable.py
[20:33] maybe you had an older/newer version of bs4 on Windows
[20:34] I have 4.3.2 on windows and the same on linux
[20:34] hm, can you send me information.html?
[20:36] FTR, Google's data export thing-a-ma-bob is bullshit.
[20:36] Trying to export a single mail label that has 10 messages in it: "We're preparing your archive - Data collected: 1.3 GB"
[20:37] And a couple of hours ago they send me a message that it was ready to download... which it's not.
[20:37] So angry. >:-(
[20:38] I use Thunderbird to get a copy of my mail
[20:38] there's also offlineimap for advanced use cases
[20:39] ivan: http://pastebin.com/zuBiP6ND <-There
[20:41] Dud1: https://ludios.org/tmp/bs4win.png
[20:48] my 64-bit Python 2.7.2 works as well
[20:49] Would have been very helpful to realise this a few days ago :/
[20:50] did you figure it out?
[20:51] are you sure information.html in whatever directory you're running in is what you think it is?
[20:52] As in figure out it's something wrong with my laptop. I'll use linux so.
[20:52] Yup I'm sure.
[20:53] If it wasn't there would be errno 2 appearing
[22:09] i'm uploading the abcnews.go.com video metadata
[22:14] its being uploaded here: https://archive.org/details/abcnews.go.com-video-metadata-20140126
[22:15] SketchCo1: I will be uploading abcnews meta sitemap also
[22:16] this should allow us a good way to make a full grab of abcnews.go.com stories
[22:26] SketchCow: video metadata is fully uploaded now
[23:02] uploaded: https://archive.org/details/abcnews.go.com-meta-sitemap-20140126
[23:22] joepie91: for my CD uploads, I started with just ISO, but someone said to do bin/cue as well, and upload both sets
[23:22] bin/cue is required for anything that's not single-track data
[23:26] iso is the only type browsable at IA currently, but yes bin/cue is required in many cases
[23:54] hello all, not sure I'm welcome here but just wanted to mention this list of FTP sites I'm looking at / archiving / have archived: http://futuramerlin.com/d/s/wwk/dokuwiki/doku.php?id=archival:ftp#ftp_sites_in_need_of_archival_being_archived