#archiveteam-bs 2014-01-28,Tue

Time Nickname Message
00:11 🔗 nico_32 anyone here archive DeviantArt ?
01:09 🔗 RedType_ idea: archive team runs a url shortener
01:09 🔗 RedType_ that redirects you to goatse
01:09 🔗 DFJustin hahaha
01:10 🔗 xmc but only after an hour
01:11 🔗 dashcloud looking for some AOL info, I found this page: http://bryanmarsh.tripod.com/bryanmarshswebpage/id16.html
02:20 🔗 RedType_ xmc: actually i was thinking 24 hours
02:21 🔗 xmc no, that makes it too useful
02:24 🔗 RedType_ i guess
02:24 🔗 SketchCow Super-legit
02:25 🔗 RedType_ goaturl.com is available
02:30 🔗 SadDM so is gōt.ly
02:33 🔗 SadDM daw... you can't have international characters in .ly domains :-(
02:39 🔗 SketchCow You can barely have international characters IN .ly
06:06 🔗 joepie91 https://vpsboard.com/topic/2879-36tb-raw-backup-server-for-cheap/ cc midas :P
08:38 🔗 joepie91 midas: did you end up finding a scanner
08:41 🔗 ratem well I came into this channel when searching for warc tools. Is this channel also for wayback related questions? :p
08:41 🔗 ratem or discussion..
08:41 🔗 ratem just want to clarify before I throw my questions in the room haha
09:04 🔗 joepie91 http://www.icanbarelydraw.com/comic/comics/2013-05-27-entire-wikipedia.png
09:05 🔗 joepie91 ratem: that depends
09:05 🔗 joepie91 people here don't _run_ the wayback, but they might have an answer to your question regardless
09:05 🔗 joepie91 :p
09:29 🔗 joepie91 wow
09:30 🔗 joepie91 this CD is damaged so badly
09:30 🔗 joepie91 lol
09:30 🔗 joepie91 that at first it was recognized as a blank CD
09:46 🔗 ratem ah okay. thx joepie91 ;)
10:15 🔗 joepie91 Taking a timeout of 2 minutes to protect the hardware.
10:15 🔗 joepie91 The drive is spinning for more than 30 minutes.
10:16 🔗 joepie91 ... did not expect that
10:18 🔗 GLaDOS ..what happened?
10:20 🔗 joepie91 GLaDOS: rubyripper + cdparanoia
10:20 🔗 joepie91 apparently they're paranoid enough to make my drive/disc rest
10:20 🔗 joepie91 heh
10:20 🔗 joepie91 to avoid read errors
10:20 🔗 GLaDOS hah
10:21 🔗 joepie91 working on ripping a pile of CDs from various sources atm
10:21 🔗 joepie91 I want 100% accurate (securely ripped) FLAC
10:21 🔗 joepie91 and I'm scanning/cropping covers also
10:21 🔗 joepie91 rubyripper + cdparanoia + scantools + gthumb = <3
10:21 🔗 joepie91 though I really do miss my LiDE scanner, the CDs are too reflective in this scanner :/
10:22 🔗 joepie91 most of these CDs have absolutely terrible music
10:22 🔗 joepie91 :P
10:22 🔗 joepie91 but hey, it's for a good cause
10:24 🔗 * joepie91 is literally prying CDs off the glass plate..
10:25 🔗 joepie91 Some track(s) could NOT be corrected within the maximum amount of trials
10:25 🔗 joepie91 Track 20 could NOT be corrected completely
10:25 🔗 joepie91 :(
12:03 🔗 joepie91 okay, so I've done a little test/comparison
12:03 🔗 joepie91 got transparent document sleeve kind of things
12:03 🔗 joepie91 compared normal CD scan with scan of a CD in a document sleeve
12:03 🔗 joepie91 to see if it would be feasible to have a "CD holder" without loss of scan quality
12:04 🔗 joepie91 (to speed up scanning)
12:04 🔗 joepie91 currently uploading comparison...
12:05 🔗 joepie91 http://cryto.net/~joepie91/scan_comparison.png
12:05 🔗 joepie91 there
12:05 🔗 joepie91 right: normal scan
12:05 🔗 joepie91 left: scan in 'sleeve'
12:05 🔗 joepie91 if anything, it seems to even out the lighting more :P
13:08 🔗 joepie91 my new CD scanning sleeve: http://cryto.net/~joepie91/0019.png :D
13:15 🔗 norbert79 Interesting method :)
13:31 🔗 joepie91 norbert79: it speeds up things quite a bit, and lets me realign them before scanning
13:31 🔗 joepie91 without them floating all over the glass plate
13:33 🔗 norbert79 I just put my CD's into a scanner and then do the correction of rotations and small issues manually in GIMP
13:38 🔗 joepie91 norbert79; not feasible at this scale :P
13:39 🔗 joepie91 takes far too much time
13:39 🔗 * joepie91 stares at a shelf full of CDs
13:47 🔗 norbert79 I wonder if I shall preserve that copy of the CD ROM I won on a Pepsi game containing one song from the Spice Girls...
13:47 🔗 norbert79 from what... 1999? 2000?
13:47 🔗 norbert79 or 2003?
13:47 🔗 norbert79 Can't recall
13:48 🔗 norbert79 it literally had only one song
13:50 🔗 joepie91 norbert79: gThumb is awesome for processing CD/cover scans, btw
13:50 🔗 joepie91 norbert79: yes!
13:50 🔗 joepie91 and I think I remember that CD
13:53 🔗 joepie91 norbert79: http://owely.com/3EUlI4
13:53 🔗 norbert79 I was soo happy and so disappointed afterwards
13:53 🔗 joepie91 that's the first batch :)
13:53 🔗 joepie91 lol
13:53 🔗 joepie91 don't ask me where those CDs come from, half of them come from my grandmother, the other half come from piles *somewhere*
13:53 🔗 joepie91 :P
13:54 🔗 norbert79 hah
13:54 🔗 norbert79 You have quite an impressive collection of old CD-ROMs considering your age
13:54 🔗 norbert79 Mine are mostly bought
13:55 🔗 norbert79 or won
13:56 🔗 norbert79 nice rack
13:56 🔗 norbert79 I mean CD-ROM collection
13:57 🔗 godane i have about another 44 maximum pc cds/dvds to scan at some point
13:58 🔗 godane also all but of 2006 lki (best computer games) is uploaded now
13:58 🔗 godane lki 50 i think was remade into an iso
13:58 🔗 godane rutracker has tons of just folder dumps of the discs
13:59 🔗 norbert79 MY CD's are mostly mixed tracks, so MDF is the only probably format for me
13:59 🔗 norbert79 My
14:01 🔗 godane my premium playlist dump of abcnews is almost done
14:10 🔗 joepie91 this disc was so damaged that it literally froze my PC trying to mount it
14:10 🔗 joepie91 norbert79: lol
14:10 🔗 joepie91 norbert79: I have a lot of self-burnt stuff
14:10 🔗 joepie91 but also a good amount of budget bin games/movies
14:10 🔗 joepie91 including stuff that basically no online downloads exist of
14:10 🔗 joepie91 and that's unlikely to ever be published again
14:10 🔗 joepie91 :)
14:11 🔗 joepie91 but yeah
14:11 🔗 joepie91 best open format for imaging CDs/DVDs, possibly multi-session, possibly copy-protected?
14:14 🔗 joepie91 aw yeah
14:14 🔗 joepie91 court decided that Dutch ISPs don't have to block TPB!
14:27 🔗 joepie91 I am amazed at how long these CDs lasted while still being readable
14:37 🔗 joepie91 mount: wrong fs type, bad option, bad superblock on /dev/sr0, missing codepage or helper program, or other error. In some cases useful info is found in syslog - try dmesg | tail or so
14:37 🔗 * joepie91 puts on "to recover" stack
14:39 🔗 joepie91 hey look: https://archive.org/details/AndreRieu-StraussGala
14:47 🔗 Dud1 So anything special I need to add when uploading a cd?
14:50 🔗 midas joepie91: i found one, but it's in use
14:51 🔗 midas so not really able to send it
14:59 🔗 midas https://scontent-b.xx.fbcdn.net/hphotos-prn1/t1/1549500_599386523475056_613022073_n.jpg
14:59 🔗 midas it says: 20 hour free internet access!
15:04 🔗 joepie91 midas: :(
15:04 🔗 joepie91 Dud1: audio CD or CD-ROM?
15:04 🔗 joepie91 midas : reminds me of compuserv
15:04 🔗 joepie91 :)
15:04 🔗 Dud1 cd
15:04 🔗 joepie91 compuserve *
15:05 🔗 joepie91 Dud1: that... was not one of the possible options
15:05 🔗 joepie91 :P
15:05 🔗 Dud1 Sorry CD-ROM
15:08 🔗 joepie91 Dud1: for a single-session CD-ROM, an ISO image should be sufficient
15:08 🔗 joepie91 I'm not sure wrt multi-session CD-ROM, actually asked a similar question a bit earlier
15:09 🔗 Dud1 I meant in terms of uploading, is there anything specific I add?
15:16 🔗 joepie91 Dud1: not aside from what I said above, though you'll want to add a cover / disc scan if you have it
15:16 🔗 joepie91 :)
15:16 🔗 joepie91 unrelated, http://njw.me.uk/getxbook/
15:16 🔗 joepie91 getxbook is a collection of tools to download books from Google Books' "Book Preview" (getgbook), Amazon's "Look Inside the Book" (getabook) and Barnes & Noble's "Book Viewer" (getbnbook).
15:25 🔗 godane http://www.youtube.com/watch?v=ZDXuPQ9ML9E
15:26 🔗 godane it was something on the Glenn Beck show yesterday
16:12 🔗 godane only 23k urls with my abcnews grab
16:14 🔗 godane also plus side is i found more data on the old CBS Evening News podcast: https://web.archive.org/web/20070601000000*/http://www.cbsnews.com/stories/2007/01/29/podcast_eveningnews/main2407226.shtml
16:14 🔗 godane this at least will give me links to parts that are missing mostly
16:15 🔗 godane some links though don't have a file path
16:15 🔗 godane august 31 and september 8 are examples of that
17:12 🔗 yipdw man, this weather in the Midwest US sucks
17:12 🔗 yipdw why must it be so goddamn cold
17:13 🔗 nico yet another fanfiction saved by ia
17:14 🔗 nico it wasn't in the 2012 ff.net dump
17:14 🔗 nico but i found it on another site
17:15 🔗 nico in the archive
17:15 🔗 nico because it is now erased from the web
17:15 🔗 Smiley \o/
17:16 🔗 nico looks like 2008-2012 was a hard period for fanfic
17:17 🔗 nico ~80-90% of the most interesting stories were deleted
18:21 🔗 SketchCow It was
18:30 🔗 xmc I wonder if it was an organized thing, or just a bunch of people got to a certain age
20:18 🔗 Dud1 I am starting to get why people are saying linux is better than windows. One file run on windows returns nothing, but the same file on linux returns the proper result...
20:20 🔗 ivan` One file run?
20:25 🔗 Dud1 *I run a python one file to grab some information from a page.
20:27 🔗 ivan` Python programs should work the same on Linux and Windows unless someone did something wrong
20:28 🔗 ivan` if you paste it somewhere I might be able to point out the problem
20:30 🔗 Dud1 http://pastebin.com/th9B6T57
20:32 🔗 ivan` that should definitely work the same
20:32 🔗 ivan` how did you run it on Windows?
20:32 🔗 Dud1 python printtable.py
20:33 🔗 ivan` maybe you had an older/newer version of bs4 on Windows
20:34 🔗 Dud1 I have 4.3.2 on windows and the same on linux
20:34 🔗 ivan` hm, can you send me information.html?
20:36 🔗 SadDM FTR, Google's data export thing-a-ma-bob is bullshit.
20:36 🔗 SadDM Trying to export a single mail label that has 10 messages in it: "We're preparing your archive - Data collected: 1.3 GB"
20:37 🔗 SadDM And a couple of hours ago they sent me a message that it was ready to download... which it's not.
20:37 🔗 SadDM So angry. >:-(
20:38 🔗 ivan` I use Thunderbird to get a copy of my mail
20:38 🔗 ivan` there's also offlineimap for advanced use cases
20:39 🔗 Dud1 ivan: http://pastebin.com/zuBiP6ND <-There
20:41 🔗 ivan` Dud1: https://ludios.org/tmp/bs4win.png
20:48 🔗 ivan` my 64-bit Python 2.7.2 works as well
20:49 🔗 Dud1 Would have been very helpful to realise this a few days ago :/
20:50 🔗 ivan` did you figure it out?
20:51 🔗 ivan` are you sure information.html in whatever directory you're running in is what you think it is?
20:52 🔗 Dud1 As in figure out it's something wrong with my laptop. I'll use linux so.
20:52 🔗 Dud1 Yup I'm sure.
20:53 🔗 Dud1 If it wasn't there would be errno 2 appearing
22:09 🔗 godane i'm uploading the abcnews.go.com video metadata
22:14 🔗 godane its being uploaded here: https://archive.org/details/abcnews.go.com-video-metadata-20140126
22:15 🔗 godane SketchCo1: I will be uploading abcnews meta sitemap also
22:16 🔗 godane this should allow us a good way to make a full grab of abcnews.go.com stories
22:26 🔗 godane SketchCow: video metadata is fully uploaded now
23:02 🔗 godane uploaded: https://archive.org/details/abcnews.go.com-meta-sitemap-20140126
23:22 🔗 dashcloud joepie91: for my CD uploads, I started with just ISO, but someone said to do bin/cue as well, and upload both sets
23:22 🔗 dashcloud bin/cue is required for anything that's not single-track data
23:26 🔗 DFJustin iso is the only type browsable at IA currently, but yes bin/cue is required in many cases
23:54 🔗 kyan hello all, not sure I'm welcome here but just wanted to mention this list of FTP sites I'm looking at / archiving / have archived: http://futuramerlin.com/d/s/wwk/dokuwiki/doku.php?id=archival:ftp#ftp_sites_in_need_of_archival_being_archived
