[00:02] AOL Music has been shot in the head. http://www.archiveteam.org/index.php?title=Aol_music
[00:28] aol & yahoo
[00:28] destroyers of data
[00:48] the government tells itself to take down a video: http://boingboing.net/2013/04/26/us-government-sends-itself-a-t.html
[01:06] okay, question:
[01:06] I have been keeping an archive at http://ahdjs.archive.cryto.net/ of livesets that have been broadcast on afterhoursdjs (an internet radio station), of which afaik no public downloads exist
[01:07] I'm frequently ripping the stream to obtain more shows, and I have a pile of older recordings (2003-2007) on a few older HDDs here
[01:07] would this be suitable for inclusion in IA?
[01:07] yes
[01:07] do it
[01:07] and next question: what's the fastest way to do this, without having to manually mess around with metadata, while still including plenty of metadata? all filenames are prefixed with the date of broadcast, if that helps
[01:08] using the ias3 upload script
[01:08] (some also with broadcast time)
[01:08] you fill in a few fields on a csv and the script handles the rest
[01:08] right, guess I'll have to play with the CSV stuff again then :P
[01:08] I uploaded 100gb today with it so far
[01:08] at least I'm a bit more familiar with how IA works, after a few manual submissions
[01:08] let me give you a paste of my csv
[01:09] http://paste.archivingyoursh.it/tejuwujemi.avrasm - top line is the header and the next two lines create an item and add 2 files
[01:10] You only need to edit the first 4 fields and leave the rest to the defaults
[01:10] including webcrawl?
[01:10] and archiveteam?
[01:10] those are tags
[01:10] oh, right
[01:10] I see
[01:11] thanks
[01:11] does the IAS3 script have some kind of 'emulation mode' where it just gives a visual output of what it would upload, without actually uploading it yet?
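The ias3 upload script discussed above is driven by a metadata CSV: a header row, then one row per item/file. As a minimal sketch of auto-generating such a file from date-prefixed filenames, here is a Python example. The column names, the filename scheme, and the identifier format are all assumptions for illustration; check the paste and the ias3upload README for the real header. Note that Python's `csv` module quotes fields containing commas automatically.

```python
import csv
import io

# Hypothetical header: real ias3upload CSVs use their own column names,
# so check the script's README before relying on these.
HEADER = ["item", "file", "title", "collection", "mediatype", "description"]

def row_for_show(filename):
    """Derive an identifier and title from a date-prefixed filename
    like '2010-01-01_betamaxdj.mp3' (this naming scheme is an assumption)."""
    stem = filename.rsplit(".", 1)[0]
    date, _, dj = stem.partition("_")
    identifier = f"ahdjs-{dj}-{date}"           # hypothetical ID format
    title = f"{dj} ({date})"
    # The description contains a comma on purpose: csv quotes it for us.
    return [identifier, filename, title, "test_collection", "audio",
            "Liveset, ripped from the stream"]

buf = io.StringIO()
writer = csv.writer(buf)  # handles quoting of embedded commas automatically
writer.writerow(HEADER)
writer.writerow(row_for_show("2010-01-01_betamaxdj.mp3"))
print(buf.getvalue())
```

Writing the CSV with a proper library instead of string concatenation sidesteps the quoting question entirely.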
[01:11] so I can test around with it a bit
[01:12] (dry run, basically)
[01:12] there is a test collection you can upload to on IA
[01:12] they erase it once a day
[01:12] alright, how would I upload to that from that CSV? :P
[01:12] and start with like 2-3 files instead of all of them
[01:12] just change collection field to 'test'?
[01:12] I don't remember
[01:13] (also, not very concerned about bandwidth etc., the VPS that archive runs on is provided to me free of charge without traffic restrictions :)
[01:13] in my experience it's substantially less than once per day
[01:14] it's https://archive.org/details/test_collection
[01:14] that uses up global IDs too so don't use the item name you want to use for the final thing
[01:15] DFJustin: should I use that entire URL as collection identifier, or just the 'test_collection' part?
[01:15] and, alright :P
[01:15] just test_collection
[01:16] okay
[01:16] thanks!
[01:16] I'll probably be sifting through my old HDDs in the next few weeks
[01:16] and upload all the random archivable crap that I find
[01:16] heh
[01:16] I still have files from literally the first time I used a PC
[01:17] * joepie91 has always been hoarding data
[01:25] uh.. omf_, how do I add a comma in the description without making the CSV parser go derp?
[01:25] does it support "" syntax?
[01:26] no idea. I keep it bare bones and edit it later online
[01:28] here are the best docs I know of https://github.com/kngenie/ias3upload
[01:41] i found 4122 lost techtv videos
[01:42] :-D
[01:42] Smiley: did you guys figure out why i'm pulling claims from the tracker and not being able to submit?
[01:48] 44 minutes to upload 20gb to the IA just now. No more back-offs or timeouts.
[01:49] i can't wait to absolutely destroy aol's servers
[01:51] omf_, writing a script to go through the files automatically and generate a CSV, do you think this is a good "item title" format? http://sprunge.us/aCKV
[01:52] you cannot have : or / in an item name
[01:52] also cc others :P
[01:52] oh?
[01:52] wait, you mean the URL?
[01:52] item names must be at least 4 characters long and less than 100
[01:52] I'm talking about the title
[01:53] the visible title on the page :P
[01:53] I never put the http:// in title mysql
[01:54] * joepie91 is confused
[01:54] mysql
[01:54] myself
[01:54] fuck
[01:54] spelling
[01:54] here is an example: http://archive.org/details/frontaalnaakt.nl
[01:55] I'm still not sure I understand what you're trying to say :P
[01:55] that might be because I should be sleeping in not too long, however
[01:56] you really going to have this: AfterHoursDJs.org Liveset
[01:56] in every single title
[01:57] I would make that the collection name which can be done later
[01:57] hm, true
[01:57] leave the date though
[01:57] just trying to make it obvious that it's an ahdjs broadcast
[01:57] (and the .org is actually part of the name, technically :P)
[01:57] betamaxdj - january (betamaxdj, 01 Jan 2010)
[01:57] alright
[01:58] http://sprunge.us/AMUQ
[01:58] that looks good
[01:58] right, I'll go with that then :)
[01:59] yeah, I guess I'll go sleep now
[01:59] and continue on this tomorrow
[01:59] thanks for the help!
[02:17] Smiley: alard it seems that if i go 500 concurrent or higher, i stop showing up on the leaderboard
[02:18] at 500+, i'm still sending stuff to the rsync server but the tracker tells me to f off on the stats send, i can still receive claims though
[02:22] anyone?
[02:24] 300 works
[03:03] and i broke it again
[03:04] asyncpopen http://pastebin.ca/2367730
[03:04] no http response from tracker
[03:33] can someone get the eXoDOS volumes on underground gamer?
[03:34] they're very big and have like all the dos and pc booter games i think
[03:34] eXoDOS is my attempt to catalog, obtain, and make playable every game developed for the DOS and PC Booter platform. I strive to find original media rather than using scene rips. This collection uses a combination of Dosbox and ScummVM to play these older titles on modern systems. Both emulators are included in the torrent and have been set up to run all included titles with no prior knowledge or experience required on the us
[03:35] SketchCow: i think this is something you want
[03:37] it includes manuals, art and game info of each title too
[03:39] i think all 5 volumes are at least 260gb
[03:51] Did duggan reply to me?
[04:08] can anyone tell me how to add more than one line to desc when uploading with s3 api?
[04:09] add a
maybe?
[06:27] i'm uploading BBC Click Bot Net Special
[06:35] has anyone here ever had a problem with rsync just stopping at "sending incremental file list"? there is no network activity, no disk I/O, no CPU usage, so I have no idea what is going on
[07:09] adding some duke 3d and quake level cds to keep the doom ones company
[08:10] godane: there are much better attempts at game archival on u-g. exodos is not a good historical collection from what i know
[08:10] DFJustin: link me!
[11:44] has anyone here ever had a problem with rsync just stopping at "sending incremental file list"? there is no network activity, no disk I/O, no CPU usage, so I have no idea what is going on
[11:44] this typically means that rsync has not yet encountered anything to transfer
[11:44] it should have a kb/sec of transfer at most while it compares filenames and such
[11:44] perhaps if you tell it to be verbose, it'll tell you more?
[11:45] I ended up getting it working somehow
[11:46] but even verbose it just stopped at listing files for the folder
[12:08] The archive's three bay area data centres use 180 kilowatts, the equivalent of 45 homes
[12:08] that is much less power usage than I expected
[12:11] Someday they will go solar
[12:12] Think about that. Everyone is all solar power for the home but businesses would get their money back much faster
[12:12] omf_: that actually seems very attainable
[12:12] Some of the local factories have started adding panels to their roof
[12:12] with usage this low
[12:12] for IA I mean
[12:14] Apple, Google, Netapp and others are already rolling out more solar panels
[12:14] cover everything in panels
[12:14] exactly
[12:14] Hell, wasn't there some study that basically said at 85% efficiency you could cover one desert with panels and fix the world's power issues
[12:14] the problem is maintenance on that scale
[12:14] they just had another breakthrough in paint-on solar cells
[12:14] however, do it google style, cover 3 deserts ;)
[12:14] Smiley, they just need to break 40%
[12:15] we are at 21%
[12:15] And obv. storage of that power until needed..... flywheels? Use the earth as a giant flywheel?
[12:17] At 40%+ the power output is greater than the R value of burning coal. That makes it more cost effective and more efficient
[12:17] ok, so what is happening in archiveteam world also?
[12:17] another site is going down - http://www.archiveteam.org/index.php?title=Aol_music
[12:17] I already embrace our solar overlords
[12:33] SketchCow, curiosity question: how did Cuil get those 310TB of data to IA? shipping a physical box of HDDs?
[13:03] :D
[13:03] omf_, my CSV: http://sprunge.us/BEDC?csv
[13:04] all auto-generated
[13:04] testing time!
[13:09] * joepie91 wonders why upload is so slow
[13:10] whoa
[13:10] I think my ISP uncapped my upload
[13:10] it's now 60+ mbps
[13:10] joepie91: spaces in file might fail?
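The item-name rules quoted earlier (no `:` or `/`, at least 4 and fewer than 100 characters) are easy to enforce when auto-generating identifiers. A small sketch; the choice to collapse forbidden characters into `-` is mine, not anything from ias3upload:

```python
import re

def sanitize_identifier(name, min_len=4, max_len=100):
    """Rewrite a candidate item name to satisfy the rules quoted above:
    no ':' or '/', at least 4 and fewer than 100 characters.
    Mapping runs of bad characters (and whitespace) to '-' is my own
    convention, not an official one."""
    cleaned = re.sub(r"[:/\s]+", "-", name).strip("-")
    if len(cleaned) < min_len:
        raise ValueError(f"identifier too short: {cleaned!r}")
    return cleaned[: max_len - 1]

print(sanitize_identifier("http://sprunge.us/aCKV"))
```

Running this on a URL shows why pasting one straight into an item name fails: every `:` and `/` has to go.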
[13:10] nah, seems to work
[13:11] it's just slooooow
[13:11] like 200KB/sec max
[13:11] my upload is 60mbps so the bottleneck is either in ias3upload, or on the S3 IA box
[13:12] especially since I've been uploading to IA via the HTML5 uploader at >2MB/sec
[13:12] so clearly it's not the distance :P
[13:17] yeah you prob got backed off, I'm unsure exactly how that works, but it sounds like it
[13:17] How are you uploading btw?
[13:17] * Smiley needs to get the s3 upload stuff....
[13:20] Re: Solar panels
[13:21] They've found a way to get 200% efficiency with solar panels
[13:23] ok, so who knows how to submit stuff to IA via commandline?
[13:24] GLaDOS: I want to do it from my account on anarchive, got any pointers?
[13:24] I am doing it right now Smiley
[13:24] I have my api key and passphrase thing.
[13:24] Smiley: ias3upload?
[13:24] need to start clearing space, the metadata.csv are ready
[13:24] yes
[13:24] How to do it? :D
[13:24] i dunno lol
[13:24] https://github.com/kngenie/ias3upload
[13:25] Look at all that readme!
[13:25] ty
[13:25] nothing on the box already apart from what omf_ has done then?
[13:25] mah internets here incredibly slow atm, can't even edit files safely P:<
[13:27] I go the config file route
[13:27] create a file in your homedir, .ias3cfg
[13:27] first line
[13:27] access_key =
[13:27] secret_key =
[13:27] and that is it
[13:28] :D ok
[13:28] also you want ./ias3upload.pl --no-derive
[13:28] omf_: the collection already set to archiveteam?
[13:28] we cannot do that
[13:28] it goes into community and gets moved later
[13:29] Ah ok.
[13:31] How are you uploading btw?
[13:31] ias3upload, from home PC
[13:31] using auto-generated CSV
[13:31] kind of.
[13:31] no.
[13:31] manually written csv.
[13:31] Guys, does setting an IPv6 tunnel up and using IRC over certain addresses like ::c0ff:ee get you women?
[13:32] all the time.
[13:32] Ah, well then.
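The `.ias3cfg` file described above is just `key = value` lines. A sketch of parsing it; the two key names come from the chat, the sample values are obviously dummies:

```python
def read_ias3cfg(text):
    """Parse the simple 'key = value' format of ~/.ias3cfg described
    above into a dict, skipping blank or malformed lines."""
    keys = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        keys[key.strip()] = value.strip()
    return keys

# Dummy credentials for illustration only
sample = """\
access_key = AKIAEXAMPLE
secret_key = s3cr3tEXAMPLE
"""
cfg = read_ias3cfg(sample)
print(cfg)
```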
[13:33] I should shift all my IRC connections from some VPS back to my old laptop (a pseudo server now) now..
[13:35] File: s.insiderdownloads.ign.com-2013-04-26.warc -> /s.insiderdownloads.ign.com/s.insiderdownloads.ign.com-2013-04-26.warc
[13:35] Sent 54691 bytes (100%)
[13:35] 200 Ok
[13:35] upload almost done...
[13:35] weeee
[13:35] (of the first file)
[13:35] 500 Can't use an undefined value as a SCALAR reference
[13:35] Can't use an undefined value as a SCALAR reference at /usr/local/share/perl/5.10.1/LWP/Protocol/http.pm line 353.
[13:35] o_O
[13:36] It's broke, obviously.
[13:36] http://archive.org/details/Test-Darthii-Live_Darthiis_Electro_House_Session_6-06Dec2010 :D
[13:36] hm
[13:36] seems like the date may be redundant
[13:37] GLaDOS: D:
[13:38] hmmm if I just kill the script will something implode or will it just dump it?
[13:39] and I think the csv might have been missing a , all along :O
[13:40] Try it.
[13:40] I'm sure SketchCow would be happy to delete the failed upload
[13:40] I'm sure he's not here right now :P
[13:41] joepie91: did that show up in your uploads on your account page?
[13:42] and fixing the csv fixed it
[13:42] Ph34r my random guessing skills
[13:42] Smiley: not yet, but there's always a delay
[13:42] so I'll wait for a bit
[13:42] yeah
[13:43] File: ces2009.ign.com-2013-04-17.warc: skipping - no change since last upload
[13:43] Seems it's ok :)
[13:46] anyone remember where to see the queue of items?
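The "missing a `,` all along" bug above is easy to catch mechanically: a dropped comma merges two fields, so the row's field count no longer matches the header's. A quick validator sketch (the sample CSV and its columns are made up for illustration):

```python
import csv
import io

def find_short_rows(csv_text):
    """Return (line_number, field_count) for every data row whose field
    count differs from the header's - the classic symptom of a missing
    or extra comma in a hand-edited CSV."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    bad = []
    for lineno, row in enumerate(reader, start=2):
        if row and len(row) != len(header):
            bad.append((lineno, len(row)))
    return bad

# The second data row lost the comma between its file and title fields
sample = "item,file,title\nfoo,foo.warc,Foo\nbar,bar.warc Bar\n"
print(find_short_rows(sample))
```

Running a check like this before invoking the uploader would have flagged the broken row up front instead of surfacing as a cryptic 500 mid-upload.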
[13:46] Oh yey it shows up :)
[13:46] http://archive.org/details/s.insiderdownloads.ign.com
[13:54] okay
[13:54] so this is the cheapest sheetfed A3 scanner I can find
[13:54] http://tweakers.net/pricewatch/278530/brother-mfc-j6710dw-(nl-model)/specificaties/
[13:54] 225 euro, combi printer/scanner/etc
[13:54] and it does actual 35-page-capacity sheetfed A3 scanning
[13:55] nice
[13:55] and it supposedly has some degree of Linux support
[13:55] although it seems to have issues with newer Ubuntu versions
[13:56] (but hey, it's Ubuntu, so not really a surprise there)
[13:56] sigh
[13:56] I kind of want this :C
[13:56] also, I was considering an interesting project
[13:57] a software package for archiving
[13:57] that lets you create 'tasks' (archive CD/DVD, archive book, archive website, etc)
[13:57] and then controls the appropriate bits of software
[13:57] potentially spread over multiple machines
[13:57] and then uploads the whole thing to IA via the API
[13:58] theoretically, that would make it possible to set up an "archiving station" of some sort fairly easily
[13:58] grab old Linux box, hook up a DVD writer, scanner, and an internet connection
[13:58] and you're done
[13:58] I am fixing up the ias3 code right now
[13:58] The author put an open source license on the code
[13:59] make it control scantailor and ddrescue, use pysane to talk to a scanner, and perhaps integrate archiveteam warrior code into it
[13:59] it does some really dumb shit
[13:59] omf_: lol
[13:59] like?
[13:59] implements its own broken csv and json parsers
[13:59] :|
[14:00] I haven't done either of those in over a decade
[14:00] tbh
[14:00] solved problems are just that, SOLVED
[14:00] whenever I read something like that
[14:00] my instinctive response is
[14:00] "REWRITE THE THING"
[14:00] because there's probably more nasty hidden in the code
[14:00] that is what static code analysis is for
[14:00] Smiley: yes, the recording showed up in my account uploads
[14:00] MFC-J6710DW
[14:00] er
[14:00] I help develop those kinds of tools
[14:00] fail
[14:00] http://archive.org/search.php?query=uploader%3A%22admin%40cryto.net%22&sort=-publicdate
[14:01] omf_: static code analysis..?
[14:01] joepie91, what is your fav language
[14:01] depends
[14:01] if we're just talking about the language itself, Python
[14:01] if we're talking about usability, PHP (because docs)
[14:02] think pylint then
[14:02] * joepie91 hasn't used pylint
[14:02] that is a static code analysis tool. It uses rules to determine if you are doing something stupid with your code
[14:02] I see
[14:02] Null pointers in C
[14:03] non-unicode calls
[14:03] you can set these tools to look for whatever you want
[14:04] If an existing program mostly works, which in the case of the uploader it does, static code analysis to find the shit and then refactor is much faster
[14:04] I already fixed a few bugs in a matter of minutes.
[14:04] (I like what zypper did here: http://sprunge.us/ARIL )
[14:04] omf_: right
[14:04] noted
[14:04] :P
[14:06] uhm, omf_
[14:06] http://owely.com/310yuPB
[14:06] I find this to be somewhat ironic..
[14:06] See, it comes down to: we as humans make mistakes. Over and over again. These tools find the common mistakes quickly and allow fast fixes so the little shit doesn't trip you up. Think of it as a spelling and grammar check for a piece of writing.
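Static analysis tools like the pylint mentioned above work by walking a parsed representation of the code and matching rules against it, without ever running it. As a toy illustration of the idea (not pylint itself), here is a checker built on Python's stdlib `ast` module that flags bare `except:` clauses, a classic "doing something stupid" pattern:

```python
import ast

def find_bare_excepts(source):
    """Toy static check in the spirit of pylint: return the line numbers
    of bare 'except:' handlers, found by walking the AST. The code is
    only parsed, never executed."""
    tree = ast.parse(source)
    return [node.lineno for node in ast.walk(tree)
            if isinstance(node, ast.ExceptHandler) and node.type is None]

# risky() doesn't need to exist - the sample is parsed, not run
sample = (
    "try:\n"
    "    risky()\n"
    "except:\n"
    "    pass\n"
)
print(find_bare_excepts(sample))
```

Real linters bundle hundreds of such rules and, as described later in the chat, can be wired into a git commit hook so nothing lands unless it passes.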
[14:13] that's what computers ARE good at
[14:13] it makes me giggle when the devs at work try to solve problems of doing something a lot of times
[14:13] and I'm like.......... loops!
[14:14] exactly, which is why I am always shocked to find coders who do not use static code tools
[14:14] This stuff has been around since the 70s and 80s
[14:15] lol
[14:16] hell, it should be a part of development tools by the sound of it
[14:16] I have mine tied into git commits
[14:16] nothing gets added unless it meets spec
[14:16] Nice, that's a good way of forcing yourself instead of "Ah I'll fix it later"
[14:17] Minimizing technical debt is the most important part of coding
[14:17] * joepie91 has become a walking spellcheck and codecheck...
[14:17] I just abstract the fuck out of everything, and I'm a giant pain in the ass about code style
[14:17] that generally yields a similar result
[14:17] :P
[14:18] my serious projects have very very very few bugs
[14:18] if any
[14:18] so, who wants to buy me a 250 euro scanner? :P
[14:33] heh
[14:33] i'm just trying to figure out this perl script.
[14:33] print "sleeping 5 seconds";
[14:33] sleep 5;
[14:33] want to do this between each upload, I think that'll stop the backing off every time for 100 seconds if we just wait
[14:34] Hhhaaaaaa it works \o/
[14:35] my sleep isn't quite in the right place, but it slows it down so that each upload succeeds.
[14:36] my version of the script is in my ~ on anarchive omf_ if you wanna check
[14:36] just search for "sleep" ;D
[14:36] crap, seems to fail on larger warcs still
[14:43] what... is anarchive exactly?
[14:45] it's a dedi that GLaDOS has which me and omf_ are abusing the hell outta.
[14:45] I have 400ish gb of warcs on there atm
[14:45] from IGN/Gamespy grabs
[14:45] aha :P
[14:45] well, down to 92gb now
[14:46] I'm waiting for my srsvps to come back up
[14:46] it has a few warcs on it as well
[14:46] I seem to have broken something
[14:46] lol oh dear.
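The flat `sleep 5` between uploads above can be generalized into retry with exponential backoff, which copes better when the server keeps pushing back. A self-contained sketch; the flaky upload function is a stand-in, not the real uploader:

```python
import time

def upload_with_backoff(upload, retries=5, base_delay=5.0, sleep=time.sleep):
    """Call upload() until it succeeds, sleeping base_delay, then
    2*base_delay, 4*base_delay, ... between attempts - a gentler take
    on the flat 'sleep 5' between uploads. upload is any callable."""
    for attempt in range(retries):
        try:
            return upload()
        except Exception:
            if attempt == retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))

# Demo: a fake upload that fails twice, with sleeps recorded instead of slept
delays = []
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("503 SlowDown")  # simulated back-off response
    return "200 OK"

print(upload_with_backoff(flaky, sleep=delays.append))
print(delays)
```

Injecting the `sleep` function keeps the demo instant and makes the backoff schedule testable.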
[14:46] billing panel says I've used 141/111100GB
[14:46] er
[14:46] meh I don't know perl :(
[14:46] xD
[14:46] 141/100GB *
[14:46] so that's probably bad
[14:46] yes, that sounds pretty bad ;D
[14:46] (disk space, not traffic)
[14:47] I sorta have a suspicion that openvz kinda broke
[14:47] :P
[14:47] and now can't boot as a result
[14:48] http://archive.org/search.php?query=uploader%3A%22djsmiley2k%40gmail.com%22&sort=-publicdate started appearing now :)
[14:48] yup I guessed.
[14:49] hm, perhaps I should've sent in a support ticket earlier
[14:50] Smiley, you know that your keywords for http://archive.org/details/wikemacs.org-20130129 are broken, right?
[14:50] you seem to have forgotten the commas :)
[14:50] yeah that's an old one :P
[14:50] hey look, Perl is done installing 346134612346 libraries
[14:50] perhaps I can run ias3upload now
[14:50] lol
[14:51] ah, upload is faster now
[14:51] maybe sshfs botched my speed
[14:51] lol
[14:51] roughly 2MB/sec now
[14:52] hmmm
[14:52] nice
[14:52] * Smiley doesn't like the idea he has at least 65 more metadata.csv :P
[14:53] uh... this is probably not good
[14:53] ffff
[14:53] seems like both me and ias3upload fucked up
[14:53] I forgot to remove the Test- prefix for the item name, and ias3upload uploaded it to community books instead of audio
[14:54] er, community
[14:54] texts
[14:54] o
[14:54] sec
[14:55] omf_: halp
[14:55] what collection does my audio need to be in?
[14:57] opensource_audio ?
[14:58] * joepie91 runs around flailing his arms
[15:00] Smiley, GLaDOS, any idea?
[15:05] find ./ -maxdepth "1" -type d -exec cd {} && /home/Smiley/ia3upload.pl -n --no-derive \;
[15:05] joepie91: no idea.
[15:05] halp ._.
[15:07] I'll just throw it in opensource_audio...
[15:11] mm... could someone remove http://archive.org/details/Test-technoterra-Live_the_blend_technoterra_01_10_2010-01Oct2010 ? SketchCow, perhaps?
[15:11] it both has an incorrect identifier and is in the wrong collection
[15:11] and it's in the upload queue with correct data now
[15:11] :p
[15:12] for x in ${PWD}/*; do [[ -d "$x" ]] && cd "$x" && /home/Smiley/ia3upload.pl -n --no-derive; done
[15:12] joepie91: ask once in #archiveteam
[15:41] http://archive.org/search.php?query=subject%3A%22AfterHoursDJs.org%22
[15:41] this is going well!
[15:41] cc Smiley
[16:00] GLaDOS: I want to completely change the current text in the nwnet pad - is there a way to force a new version, or should I just select all the current text and paste my new text there?
[16:24] yay, found a CD-R with 700MB of older afterhoursdjs shows!
[16:24] 2004-2005
[16:36] yay!
[16:45] joepie91: nice job
[16:48] dashcloud: second way.
[16:48] okay - thanks!
[16:48] also whisky barrels, and even half barrels, are heavy as hell
[16:48] Don't ask how I know how much hell weighs
[16:54] does a Python library for talking to the IA API exist?
[18:13] Heh, looks like I got rate-limited to nothing from Posterous in no time. Does the limiting expire?
[18:23] would you look at this perfectly neutral and unbiased web hosting recommendation from Wordpress
[18:23] http://get.wp.com/hosting/
[18:23] oh wait, I lied, they're all affiliate links to mediocre hosts
[18:24] ;)
[18:56] Initial status (read from logfile)
[18:56] Current status
[18:56] rescued: 724764 kB, errsize: 681 kB, errors: 210
[18:56] rescued: 724780 kB, errsize: 666 kB, current rate: 0 B/s
[18:56] my polishing is paying off!
[18:57] (doing a rerun of the disc after polishing it with a magic eraser :D)
[20:17] SketchCow: Another collection for computertechvideos: https://archive.org/search.php?query=download%20discoverly
[20:42] uhhh
[20:42] 500 Can't use an undefined value as a SCALAR reference
[20:42] Can't use an undefined value as a SCALAR reference at /usr/lib/perl5/vendor_perl/5.16.2/LWP/Protocol/http.pm line 353.
[20:42] some of my uploads are failing
[20:42] :|
[20:43] probably got a special character somewhere
[20:45] DFJustin, http://sebsauvage.net/paste/?883a9a3ce36bacd7#wiybaUeupDkrb40uutSIHGCReFYMdrwAEiBNvO9Yzfs=
[20:45] that's all in my terminal
[20:45] I'm using a CSV with multiple files per item
[20:49] DFJustin: so... the exact same happened for another item
[20:51] no clue
[20:54] btw here is what IA links to for language codes http://www.loc.gov/marc/languages/language_name.html
[20:54] under flemish it says "use dutch"
[20:54] right
[20:55] which, even if it's officially correct, is practically nonsense
[20:55] :P
[20:55] so blame the library of congress I guess :D
[20:55] well hey, which country was it again that was made fun of for not knowing anything about what happened in other countries? :D
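Earlier someone asked whether a Python library for talking to the IA API exists. The uploads in this log all go through the IAS3 interface, which is an S3-style HTTP PUT to s3.us.archive.org carrying an `authorization: LOW access:secret` header plus `x-archive-meta-*` headers for metadata. A dry-run sketch that only builds the header set, with no network call; the exact header names follow my reading of the IAS3 docs, so double-check them before use:

```python
def ias3_headers(access_key, secret_key, metadata, make_bucket=True):
    """Build the header set for an IAS3-style PUT to s3.us.archive.org.
    The 'LOW key:secret' authorization form and the x-archive-meta-*
    metadata headers follow the IAS3 documentation; verify against the
    current docs before relying on this sketch."""
    headers = {"authorization": f"LOW {access_key}:{secret_key}"}
    if make_bucket:
        # Ask the server to create the item (bucket) if it doesn't exist
        headers["x-archive-auto-make-bucket"] = "1"
    for field, value in metadata.items():
        headers[f"x-archive-meta-{field}"] = value
    return headers

# Dummy credentials and metadata for illustration only
hdrs = ias3_headers("AKIAEXAMPLE", "s3cr3t",
                    {"title": "Example liveset", "mediatype": "audio"})
print(hdrs)
```

Pairing this with any HTTP client gives a minimal command-line uploader without the Perl dependency chain the chat wrestles with.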