[00:10] and now g's!
[00:19] .t https://gist.github.com/sarciszewski/41dff863601ea7f45d51
[00:19] * joepie91 stares
[00:19] botpie91: ping
[00:19] what now
[00:19] lol
[00:20] what the fuck.
[00:20] I know that you're running on a box with 128MB RAM, but that's really no excuse
[00:20] .t https://gist.github.com/sarciszewski/41dff863601ea7f45d51
[00:20] sleepy bot, happy bot, little ball of code
[00:20] wtf
[00:21] * joepie91 raises eyebrow
[00:21] Will Archive.org let you upload serial numbers with an ISO?
[00:21] * tfgbd_ will assume no
[00:21] tfgbd_: archive.org takes files with metadata; it does not care what files or what metadata those are :)
[00:21] lol
[00:21] (until a DMCA comes in, ofc, but it'll still be archived)
[00:21] I was thinking of sending them Corel Draw 7 and 8
[00:22] wtf...
[00:22] .title https://gist.github.com/sarciszewski/41dff863601ea7f45d51
[00:22] joepie91: Go Home, WP-API, You're Drunk...
[00:22] right then.
[00:22] tfgbd_: https://archive.org/search.php?query=uploader%3A%22admin%40cryto.net%22%20AND%20subject%3A%22disc%20image%22
[00:22] Are they like YouTube and will they give your account a strike if you give them too many DMCA takedowns?
[00:22] perhaps this alleviates your concerns
[00:22] :)
[00:22] tfgbd_: nope
[00:23] don't forget that the primary goal of IA is archiving, publishing is a way the archiving is made better but not strictly necessary
[00:23] so they'll just make unavailable whatever they get abusemail for
[00:23] and keep an inaccessible copy in their archive
[00:23] nice, some drivers
[00:23] I'm working on scanning my CDs now
[00:24] I'm doing both sides just to be safe.
[00:24] tfgbd_: as in, physically scanning the label?
[00:24] Yes
[00:24] tfgbd_: I recommend making a scanning sleeve
[00:24] massively speeds things up, makes for better/cleaner scans
[00:24] do go on
[00:24] hold on
[00:24] I figured I'd scan the back too because there are unique serial numbers on the bottom
[00:25] tfgbd_: http://discountoffice.nl/icimages/groot/440605.jpg
[00:25] get these, but the 100% transparent ones
[00:25] of slightly thicker plastic
[00:25] then use a needle and thread to make a square that's missing one side (on the side)
[00:25] so you can shove in a CD
[00:25] two CDs fit on one sheet
[00:26] I got 3 CDs in my scanner now
[00:26] shove in two CDs, adjust their orientation a bit, slap onto scanner, scan, take off (without having to pry, because you have a flexible corner to pull off), shake out CDs
[00:26] rinse repeat
[00:26] Ohh, I get ya
[00:26] my label scanning efficiency went through the roof with this method :P
[00:26] Yeah, I noticed getting them off the bed was a bit annoying ;P
[00:26] tfgbd_: https://ia600601.us.archive.org/17/items/LittleBigAdventure2/cover.png
[00:26] I might have a few of those around already
[00:26] this shows very well what I mean
[00:27] the left side is the opening
[00:27] What DPI did you do?
[00:27] I was told 600DPI should be plenty.
[00:27] good question, probably 600? not sure
[00:27] let me check my defaults
[00:27] :P
[00:27] The guide I followed recommended 600 as anything over that is overkill
[00:28] default is 300DPI: https://github.com/joepie91/scantools/blob/master/scantools/__init__.py#L28
[00:28] Is little big adventure related to Little Big Planet?
[00:28] but I think I may have set it to 6000
[00:28] 600 *
[00:28] not sure
[00:28] nope
[00:28] far predates it, no relation at all
[00:28] Well, it says the res is 1,603px × 1,574px
[00:29] When it says "for Windows 95" does that really just mean it is a DOS game or is there a native Win32 .exe?
[00:29] do note that the uploaded cover is postprocessed
[00:29] i remember lba on the Amiga.
[00:29] tfgbd_: hm, I think it was a native .exe as well, but DOS too? not sure
[00:29] you'd have to look it up
[00:29] Guess I'll download your copy :P
[00:29] it was really neat, though
[00:29] or that
[00:29] lol
[00:29] looks interesting
[00:29] it's a great game!
[00:29] what sort of game is it?
[00:29] it's an adventure game
[00:29] 3D
[00:30] Ah, figured
[00:30] lots of those for PC
[00:30] like Sam and Max?
[00:30] this one is special :)
[00:30] no idea, never played that
[00:30] hold on
[00:30] Does it use one of those VMs like ScummVM?
[00:30] sam and max... wasn't that one of those kids games
[00:30] nope
[00:30] Sam and Max has swearing, so I don't think so
[00:30] https://en.wikipedia.org/wiki/Little_Big_Adventure_2
[00:30] oh lol
[00:30] but there was a Sam and Max TV show too, I think
[00:31] I was looking around the house and felt I should start uploading some of this stuff somewhere
[00:31] but yeah
[00:31] LBA2 is an amazing game :D
[00:31] and yes, yes you should
[00:31] :P
[00:31] tfgbd_: you running Linux?
[00:31] No
[00:31] You could have easily just done a /version on me ;P
[00:31] The copyrighted stuff I'm starting with Winworld but for the more public domain stuff I guess I'll do archive.org
[00:32] eh, /version doesn't usually reveal OS :P
[00:32] it should for me
[00:32] anyway, was going to send you a bunch of batch archiving scripts if you were on Linux
[00:32] er, derp
[00:32] I can always spin up a VM
[00:33] or use interix or cygwin
[00:33] ew, mirc
[00:33] doesn't list OS though
[00:33] :P
[00:33] I see
[00:33] could be running it through WINE...
[00:33] but yeah
[00:33] um
[00:33] well
[00:33] these are Python scripts
[00:33] but Windows doesn't have SANE
[00:33] yeah, you're right
[00:33] nor does it have the tools that my disc imaging util uses
[00:33] what is SANE
[00:33] What is it your tool does?
[00:33] lets you do one disk after another?
[00:34] SANE is basically the de facto standard scanning backend on Linux (it's awesome)
[00:34] Ah, I see
[00:34] yep
[00:34] batch scanning/archiving
[00:34] Never used scanning in *nix
[00:34] SANE-based scanning tools: https://github.com/joepie91/scantools
[00:34] Can't say I've scanned anything in years, really
[00:34] disc imaging tools: https://github.com/joepie91/image-disc
[00:34] (it automatically invokes the right tool)
[00:34] Ohh, for boox
[00:34] books*
[00:35] well, scantools was originally for books
[00:35] I could see that would be useful if I want to archive these manuals I find
[00:35] but it scans anything really
[00:35] it just saves images sequentially
[00:35] No sane for win32 python?
[00:35] scans, tells you it's done, you flip the page / replace the discs / whatever, hit enter
[00:35] goes on to next page
[00:35] BTW, why does it seem like all your tools use Python? ;P
[00:35] tfgbd_: no clue if there's a win32 SANE, but I doubt it :P
[00:35] and well, because Python is one of the languages I use
[00:35] usually for batch/system tools
[00:36] moved over to Node.js for actual applications/daemons, though
[00:36] bit of a friendlier environment for that..
[00:36] Lots of the stuff you guys have made seems to be python, I mean
[00:36] not just you :P
[00:36] oh, in that sense
[00:36] plural you
[00:36] :P
[00:36] (stupid English, not having a separate word for that..)
[00:36] anyway
[00:36] that's probably because Python is generally pretty easy to use
[00:37] Wow, there seems to be lots of neat software on archive.org now
[00:37] and doesn't suck resources like a leech
[00:37] (like Ruby does)
[00:37] and there's a bunch of libraries for everything
[00:37] and it interfaces with everything
[00:37] etc
[00:37] You even got away with uploading Sim Ant
[00:37] and it kinda sorta runs crossplatform
[00:37] result: Python commonly used :P
[00:37] heh, yep
[00:37] there's a bunch of sim games on there
[00:37] I still have a few on my stack and/or in my post-processing/recovery queue
[00:37] Is that your account?
[00:37] bought a bunch of them years ago for 2 euro each..
[00:38] ya
[00:38] crypto.net?
[00:38] cryto*
[00:38] (common misparse)
[00:39] tfgbd_: if you're looking at my total uploads, you're going to find a lot of automatically uploaded stuff
[00:39] :P
[00:39] I run a PDF host, and I've hooked it up to automatically mirror every public upload to IA
[00:40] How do you do that?
[00:40] What do you mean by PDF host?
[00:40] tfgbd_: https://pypi.python.org/pypi/internetarchive
[00:40] tfgbd_: I mean https://pdf.yt/
[00:40] :P
[00:40] Oh, how about tha
[00:40] that*
[00:40] I think I came across your site in my travels before
[00:41] How do you afford the hosting?
[00:42] tfgbd_: hosting is cheap, if you're a bit inventive
[00:42] "hey let's throw it all on S3" is a very good way to run up bills
[00:42] :P
[00:42] but there's much cheaper options
[00:42] it's currently running on a $9/mo VPS
[00:42] I found a few free pseudo-VPSes but I'm on my 3rd one because I was using too much CPU time ;P
[00:42] will be moving away storage to a home-made CDN baseed on a Tahoe-LAFS storage grid
[00:42] based *
[00:42] mostly made up of cheapo VPSes
[00:42] I guess it's also free if you host it on your old PC or phone at home
[00:42] and spare disk space
[00:43] don't bother with free VPSes
[00:43] they're almost universally awful
[00:43] :P
[00:43] well, there are only 2
[00:43] so it's not like I have much choice ;P
[00:43] there have been a number of these services in the past
[00:43] they all suck, really
[00:43] vps.me was okay
[00:44] I mean it's not like I blame them for killing my machine after trying to boot Windows XP in QEMU ;P
[00:44] It actually got through the install process and a few reboots, though ;P
[00:44] anyway, if you have a few dollars to spend, it's easy to find cheap hosting
[00:44] and lol
[00:45] I'm way too cheap ;P
[00:45] heh
[00:45] But regular free hosts seem to be a dime a dozen
[00:45] yeah, easier to oversell
[00:45] Do you use a cheap VPS?
[00:45] anyway, I have a bunch of cheapo monthly VPSes, even a few that are like $15 yearly
[00:45] run a ton of stuff on them
[00:45] I saw there are a few for even like $10 a year
[00:45] heh
[00:45] how about 3 euro a year
[00:45] Awesome!
[00:46] http://lowendspirit.com/index.html
[00:46] Is it a real VPS or OpenVZ crap?
[00:46] Jails do not a VPS make.
[00:46] openvz, and "real VPS" is pretty subjective :P
[00:47] well
[00:47] lowendspirit is openvzv
[00:47] openvz *
[00:47] PDFy runs on a KVM
[00:47] Aww, that sucks
[00:47] All I really want is something emulating a full system so I can install whatever I want
[00:47] KVM is nice
[00:47] I can even run VMs in KVM
[00:48] tfgbd_: http://tinykvm.com/
[00:48] alternatively, https://bluevm.com/
[00:48] both do cheap KVMs
[00:48] (TinyKVM is more stable, though)
[00:48] I know there was one for like $15 somewhere
[00:48] $35 is still a bit much
[00:48] that's per year
[00:48] $15/yr KVM is doubtful
[00:48] I recently found an Intel Xeon machine in a dead guy's trash. I could maybe just use that too
[00:49] ... wat
[00:49] But it is admittedly neat to have something running you can't hear ;P
[00:49] basically every part of that sentence made me go "wat"
[00:49] lol
[00:49] My dad's friend died and we cleaned out his house
[00:49] the computer was just sitting there out front
[00:50] hold on, somebody was running a Xeon as a desktop?
[00:50] Yeah
[00:50] ... why
[00:50] It's a 2003 one but still like 3GHz
[00:50] According to IBM's website, it cost like $5000
[00:50] This guy did a lot of photo editing so maybe it was for that?
[00:50] it doesn't really make sense to run a Xeon unless you're on commercial power :P
[00:51] pft, photo editing
[00:51] They're that power hungry?
[00:51] quite
[00:51] they're server CPUs
[00:51] Yeah, that's why i thought it was such a nice find
[00:51] Maybe he got it cheap himself on a liquidation sale
[00:51] idk
[00:51] even then
[00:51] he probably paid through the nose for power
[00:51] lol
[00:51] Hmm, maybe I don't want it after all
[00:52] It was wet so I'm letting it dry off
[00:52] The hard drives were full of porn
[00:52] lol
[00:52] for some stupid reason a friend took all the other PCs but didn't want that one
[00:52] Maybe the porn?
[00:52] It's freakin' huge and heavy, though
[00:56] lol
[00:56] who says no to free porn
[00:56] lol
[00:59] Are there any cheap Hyper-V or VMWare VPSes out there? Those are just as good as KVM
[00:59] reminds me of this story: http://torrentfreak.com/priests-watch-dvd-screeners-while-pirates-download-filth-in-the-vatican-130407/
[00:59] lowendbox.com
[00:59] tfgbd_: don't bother
[01:00] you're extremely unlikely to find a better deal than for KVM
[01:02] * closure actually bought a cloudatcost.com VM 9 months ago. Despite low expectations, it's still running for $0/month
[01:03] kinda figured
[01:03] being free and all
[01:03] o_o
[01:03] although I have had to reinstall it like 3 times because they're pretty shit
[01:03] butfr3e??
[01:03] question is, will it hit the 3 years
[01:03] lol
[01:04] closure: tfgbd_: https://vpsboard.com/topic/2532-cloudatcostcom/
[01:04] closure: Will check em out
[01:04] basically, vaguely dodgy
[01:04] is it actually like Amazon's free tier?
[01:04] company with delusions of grandeur
[01:04] Meh, I have debit cards
[01:04] if it's free and works, I don't care
[01:04] only vaguely dodgy?
[01:04] or will they charge you out the ass if you do anything on it?
[01:05] closure: :)
[01:05] anyway, see that thread
[01:05] it has all the details
[01:05] maybe if you use up the bandwidth limit they will
[01:05] Ehh, $35 is still a bit much if they're dodgy
[01:06] yeah, it's a $35 gamble that they'll keep it up as long as makes it a good deal
[01:06] still doesn't beat my stylex networks VM
[01:07] $1 one-time payment
[01:07] for a shitty small Xen VM
[01:07] hmmmm I have a cheap ovh box...
[01:07] I think I've used it for 2 years
[01:07] wow, still available?
[01:07] and they went bankrupt like a month ago
[01:07] lol
[01:07] but it was a temp offer anyway
[01:07] i just use it as a proxy atm..
[01:09] my cloudatcost vm is currently managing to be #10 on the twitpic2 leaderboard, all by itself, so worth it IMHO
[01:10] closure: link to leaderboard?
[01:10] http://tracker.archiveteam.org/twitpic2/
[01:10] ok, #14 actually
[01:11] closure-c?
[01:11] da
[01:11] * joepie91 SSHs into his OVH box
[01:12] closure: how many concurrent?
[01:13] joepie91, 2 at most, otherwise you get a bunch of 503s
[01:13] fucking wifi
[01:13] garyrh: alright
[01:14] 20
[01:14] closure: wat
[01:15] not banned yet, if it is, I'll bounce the VPS and get a new IP ;)
[01:15] closure: you are running with 20 concurrent?
[01:15] yes. not many 503's
[01:17] tracker rate limiting ._.
[01:17] hold on
[01:17] wrong script?
[01:17] yes, wrong script
[01:18] hmm, it may only run 5 concurrent despite being told more
[01:19] ha
[01:19] your evil plan was foiled
[01:19] :D
[01:20] there we go
[01:20] how much disk does this script need? garyrh
[01:21] average size/item is ~15MB, so not too much i suppose
[01:27] heh, $0.50/month xen vps with only ipv6
[01:29] okay, what the fuck
[01:29] zram just flipped out
[01:32] not *entirely* sure what just happened...
[01:54] what's up with qwiki disco rate limiting?
[01:56] the items are huge and I think the thinking was to keep people from accidentally ddosing the site as the average run time is half a dozen hours, IIRC.
[01:57] .. damn
[01:57] I'm not getting *any* items, though..
[01:57] maybe arkiver turned it off then?
[02:18] tfgbd_: for the items I've uploaded to IA, my cover scans have been 300 dpi PNG files, labelled as a,b,c, etc (a is for front cover, b for back, c for CD- otherwise you can end up with the CD or back cover as your cover image/thumbnail)
[02:19] I found this hugely funny, and I'm not a web developer by trade: http://youtube.com/watch?v=b2F-DItXtZs (MongoDB is web scale)
[02:22] Oh, so you have to name stuff a certain way?
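A minimal sketch of why the a/b/c labelling above helps, assuming (as a simplification) that the item thumbnail is simply the image file whose name sorts first alphabetically. The filenames and the `pick_thumbnail` helper are hypothetical illustrations, not part of any archive.org tooling.

```python
# Hypothetical helper: which scan would sort first and so end up as the thumbnail?
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")


def pick_thumbnail(filenames):
    """Return the alphabetically first image filename, or None if there is none."""
    images = [name for name in filenames if name.lower().endswith(IMAGE_EXTENSIONS)]
    # Case-insensitive comparison is an assumption about how "alphabetical" is applied.
    return min(images, key=str.lower) if images else None


# Without prefixes the back cover sorts before the front cover.
unlabelled = ["front.png", "back.png", "cd.png", "disc1.iso"]
# With a/b/c prefixes the front cover sorts first.
labelled = ["a_front.png", "b_back.png", "c_cd.png", "disc1.iso"]

print(pick_thumbnail(unlabelled))
print(pick_thumbnail(labelled))
```

Running a check like this over a folder before uploading shows which scan would win, so the prefix can be adjusted first.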
[02:22] They prefer zips too, right
[02:23] tfgbd_: don't just zip up the contents of a CD-ROM
[02:23] :P
[02:23] Noo, I mean for the upload
[02:23] it's 3 CDs
[02:23] no, don't zip them up
[02:23] zipping CDs is just stupid
[02:23] just upload the bin/cue pairs
[02:23] (they are bin/cue pairs, right?)
[02:23] One by one?
[02:23] I used ISO for these
[02:23] tfgbd_: no? if you use the HTML5 uploader you can select multiple files...
[02:24] Does Archive.org let you look inside isos and zips?
[02:24] you should always use bin/cue unless you are either 100% certain that it's a single-track data CD-ROM (if you're unsure, then don't)
[02:24] or if you are doing recovery with ddrescue
[02:24] I kind of figured it would be better if they were compressed since it would save on bandwidth
[02:24] I know it's single track.
[02:24] It's just a Corel Draw install CD
[02:24] it can look inside zips and isos I believe, but you should just upload files in the original format unless you have like over a hundred files in an item
[02:24] tfgbd_: have you explicitly checked that it's single-track?
[02:25] yes
[02:25] Though, it did offer me to make bin/cue for one of the photo CDs for some reason.
[02:25] there's probably a reason for that...
[02:25] honestly, the best way is to just always make it a bin/cue pair
[02:25] if you're talking about CDs
[02:25] it can't hurt
[02:25] DVDs are fine as ISO, they don't have anything like tracks
[02:26] Really?
[02:26] yep
[02:26] Some certainly do.
[02:26] ?
[02:26] Home made ones do
[02:26] And what about music DVDs?
[02:26] tfgbd_: home made what?
[02:26] tfgbd_: all DVD formats are data DVDs, just with particular folders
[02:26] joepie91: You can pretty easily do a multisession DVD, I mean
[02:26] music DVDs have an AUDIO_TS folder and some metadata
[02:26] i figured as much
[02:27] I think multisession != multitrack
[02:27] ok
[02:27] that's my understanding of it anyway
[02:27] Can archive.org look inside bin/cue?
[02:27] nope
[02:27] not yet
[02:27] That's why I'd prefer ISO for now
[02:27] It's just an install CD and there doesn't seem to be copy protection
[02:28] but I did the photo CD as bin/cue just in case because it defaulted to that
[02:28] tfgbd_: honestly, archival quality is more important than "can archive.org theoretically browse my file *right now*"
[02:28] realistically the "browse CD image" functionality is rarely needed
[02:28] Ah, I get it.
[02:29] better to just ask IA guys "hey can you make it browse bin/cue as well" or somesuch
[02:29] (to which the answer will likely be "sure, eventually, when we get around to it")
[02:29] Well, I'll just do another run with bin/cue later.
[02:29] Is it possible to do both?
[02:29] yes
[02:29] I've done that
[02:29] in one section?
[02:29] yep, and you can add files to your item later on
[02:29] section?
[02:29] yeap- I upload both sets all at once
[02:29] i was going to do bin/cue for betaarchive too
[02:30] tfgbd_: what do you mean with 'section'?
[02:30] oh
[02:30] can you have 2 different formats in one item on IA??
[02:30] like the books
[02:30] yes
[02:30] well, the books are auto-converted ("derived")
[02:30] but yes
[02:31] there are no limitations on what files you can put inside an item
[02:31] (other than that the virus scanner will sometimes get angry at you)
[02:31] Do you have to do it all at once or can you wait like an hour and upload more stuff?
[02:31] you can wait and add it later, although I have to say that the item editor is rather... unpleasant to use, compared to the HTML5 uploader
[02:31] pretty sure they're about 10 years apart in terms of release date :)
[02:34] Isn't there a flash or java uploader?
[02:37] Anyway, every disk I did ISO is only 1 track
[02:50] tfgbd_: yes, the 'edit item' uploader is a Flash uploader
[02:50] and it's fairly awful
[02:50] :P
[02:51] (as flash uploaders generally are)
[02:56] they should offer ftp :P
[02:57] tfgbd_: um, I think they do :P
[02:57] that said, FTP is blergh
[02:57] rsync > FTP
[03:03] https://pypi.python.org/pypi/internetarchive is the power user upload method
[05:26] https://community.rapid7.com/community/metasploit/blog/2014/10/28/r7-2014-15-gnu-wget-ftp-symlink-arbitrary-filesystem-access
[05:30] LOL: "There are 1067 tasks queued to run before yours"
[05:30] Poor archive.org
[05:38] a.
[06:07] Woo, it worked!
[06:07] https://archive.org/details/CorelDRAW7Build373RTM
[06:07] But how do I change the image thumbnail?
[14:26] thumbnail is just the first image file in alphabetical order
[14:27] you can rename files to change it
[15:59] .. damn midas read!
[16:00] trying to upload some ftp's, failing every time
[16:00] forgetting to add collection:derp
[16:00] must be early or something
[16:02] have any of the FTP archiving projects gotten ftp://ftp.ti.com/ ?
[16:03] dont think we have
[16:03] I can run wget against it or similar if someone tells me what to run, I have the bandwidth and likely the space.
[16:03] http://archiveteam.org/index.php?title=FTP
[16:03] ;-)
[16:04] and join #effteepee
[16:04] * Jonimus should have known.
[16:04] there is always a channel and/or wiki about it
[16:08] it is up and running.
[16:08] perfect :)
[16:09] you can add it here Jonimus http://dat.serveert.me.uk/p/ftp and make sure you place it in TO BE MOVED when done :)
[16:11] midas: I just realized I need to make sure I have the space first, so doing that now.
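A minimal sketch of the "make sure I have the space first" check, assuming the mirror is written to a local directory and that a rough size estimate for the FTP site is already known. The `has_room_for` helper, the 10% margin, and the 50 GiB figure are hypothetical placeholders, not anything the channel's tooling does.

```python
import shutil


def has_room_for(path, needed_bytes, safety_margin=0.10):
    """Return True if the filesystem holding `path` has needed_bytes free, plus a margin."""
    free = shutil.disk_usage(path).free
    return free >= needed_bytes * (1 + safety_margin)


# Hypothetical estimate of the mirror size; replace with a real figure.
estimated_size = 50 * 1024**3  # 50 GiB

if has_room_for(".", estimated_size):
    print("enough free space, safe to start the grab")
else:
    print("not enough free space, pick another disk or hand the site off")
```

The margin is there because a mirror usually needs more than the raw download size once it is packed up for upload.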
[16:11] thats kinda important too yeah
[16:11] if its too big dump it on the list and/or shout it in #effteepee
[16:12] current dump im uploading will take me 6 hours
[16:12] so thats going to be fun
[16:14] Jonimus: you also need the upstream bandwidth and/or connection stability to upload 100 gb or whatever at once since you can't really resume
[16:14] or use torrents to upload
[16:15] that has some drawbacks
[16:15] and it seems a good idea to split anything above ~500GB into smaller files
[16:16] erm
[16:16] that's off by at least one magnitude
[16:16] nah
[16:16] 20G is what i have in mind as good item size
[16:16] ftp collection has a lot bigger ones
[16:16] yes but that's not nice for IA
[16:17] ftpboneyard isnt really that active for starters or indexed by wayback
[16:17] and SketchCow said it was okay :p
[16:17] much unfair!
[16:17] it's more about their machines' hdd partitioning
[16:18] the smaller, the nicer
[16:18] you might make boxes run out of space if you upload something huge
[16:18] at least i had that happen once or twice, ahem =)
[16:18] oh yeah that happens
[16:18] but you can send the size before uploading
[16:19] https://archive.org/details/ftp.compaq.com 200+GB for example :p
[16:19] uploading 1TB was bit too much
[16:20] damn the new layout sucks for these
[16:21] its only 31GB so I should be fine.
[16:21] this one was fun: https://archive.org/details/ftp.adobe.com
[16:22] felt somewhat like this: https://gs1.wac.edgecastcdn.net/8019B6/data.tumblr.com/11d5614c09b1254d3a4a38e0c5537171/tumblr_na0tizwTtJ1qdh308o1_400.gif
[16:24] =)
[16:25] now do jamendo!
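A minimal sketch of the "20G as a good item size" idea from the exchange above: greedily grouping already-packed files into batches that stay under a size cap. It assumes the dump is already a set of separate archive files and only groups them, it does not split any single file. The `batch_by_size` helper and the file names are hypothetical.

```python
def batch_by_size(files, max_bytes):
    """Group (name, size) pairs, in order, into batches whose total stays under max_bytes.

    A single file larger than max_bytes still gets a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for name, size in files:
        # Start a new batch when adding this file would exceed the cap.
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


GB = 1000**3
# Hypothetical archive files and sizes.
dump = [("a.tar", 12 * GB), ("b.tar", 9 * GB), ("c.tar", 5 * GB), ("d.tar", 25 * GB)]

for number, batch in enumerate(batch_by_size(dump, 20 * GB), start=1):
    print(number, batch)
```

Each batch could then become its own item, which keeps individual uploads small enough to retry without losing hours of transfer.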
[16:25] (sorry :P)
[16:27] hahaha actually, i have that disk standing by in my other server ready to upload
[16:31] woot :)
[18:58] i'm starting to do my full download of theblaze.com website again
[19:00] its going here: http://archive.org/details/www.theblaze.com-2010-stories-pages-20141029
[19:00] i'm doing it by year
[19:01] it has to be redownloaded anyway because they changed the url paths
[19:01] my first grab back in 2012 didn't have date paths in the urls
[19:28] Morning
[19:29] What
[21:22] hey SketchCow
[22:52] I just found out about this today, and presumably some of the US folks here may be eligible: http://www.intelpentium4litigation.com/
[23:09] I probably fit those criteria, the grounds seem kind of lame though
[23:11] 2000, was that willamette core?
[23:12] ah yes
[23:12] i should read first and stuf
[23:12] +F
[23:13] hmm the complaint itself is more damning
[23:19] Awake
[23:20] welcome back