[00:01] There's an author who has a habit of pulling his books off the market after a year or two (one >200-page novel after only 2 months)… I have most of them, and am thinking about uploading them to IA and requesting they be darked. I'm a bit ethically uneasy about that course of action, but I really want to make sure it's saved, so…
[00:01] make that less than two months
[00:05] why request to be darkened kyan?
[00:06] Smiley, some stay in print longer than others… i wouldn't want to hurt him financially
[00:06] also, i think he doesn't want the old ones around for some reason
[00:07] hmm fair enough
[00:13] just do it
[00:14] Archive Team is hardline assholes for a reason
[00:14] Do it, make it dark on contact.
[00:15] will do
[00:15] :)
[00:26] i am mirroring a site
[00:26] 29/12/2013
[00:26] at 6 kbps
[00:28] nearly 32GB
[00:28] nico, …and I thought I had it bad, with a Windows-2000-era software FTP archive from Serbia at 80kbps. Blazing in comparison :-S
[00:29] is there a reason it's so slow, like it's in a remote place or something?
[00:30] the speed is random
[00:31] sometimes i get a file at 200 bps
[00:31] sometimes i get it at 500 kps
[00:31] kbps
[00:31] wow, huh
[00:31] crazy
[00:31] looks like it is in china
[00:32] http://pastebin.com/atE0NsN5
[00:32] maybe it's fast sometimes because the communists are running it for the common good, and it's slow sometimes because they're looking for sedition
[00:32] look at the packet loss
[00:33] wow, insane
[00:33] 20% loss makes TCP unhappy
[00:33] the standard deviation is also fun
[00:58] if you think you're a bad php dev, look at this http://pastebin.com/1sBCvK2C and feel proud of your code :)
[01:00] nico: i dont really want to read php. why is it bad
[01:04] RedType: global arrays, layering violations, hand-crafted JSON, ...
[01:05] yeah i clicked it
[01:05] i used to be repelled by these things
[01:05] but now i just sigh and dont care
[01:05] there's only so much horrible poop you can poke with a stick
[01:06] god there's so many uncaught edge cases in this too
[01:06] ugh
[01:07] this code is in a product whose license is ~€400
[01:08] that's a whole new level of horrible poop
[01:09] a lot of people aren't aware that there is a format called JSON
[01:09] they just know they have to write out something that JavaScript can parse
[01:10] nico: deaguh
[01:10] i have written tons of code like that
[01:10] for rapid prototyping
[01:10] i would never expose that to the internet
[01:10] hell i wouldnt trust it to hold up for self use considering how rickety that shit is
[01:10] it is the third rewrite off the codebase
[01:10] s/off/of/
[01:10] strangely it is stable
[01:10] DEAUGH
[01:11] http://www.youtube.com/watch?v=HsDOkH1pJZY
[01:11] want to see the svn repository?
[01:11] no not really
[01:11] ?
:)
[01:11] just for fun
[01:11] function sql_initdb() { global $dbhdl, $fileDB; $dbhdl = @sqlite_open($fileDB,0660,$sqlerror);
[01:11] sql.php
[01:12] }
[01:12] :)
[01:14] ivan`: responseText=eval(xmlhttp.responseText);
[01:15] who needs JSON and a good parser
[01:15] just eval it :)
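[editor's note: the complaint above is about hand-building JSON strings on the server and eval()ing the response on the client instead of using a real serializer and parser. A minimal sketch of the safer pattern, in Python purely for illustration — the function names and data are made up:]

    import json

    # Instead of hand-concatenating a JSON string (easy to break on quotes,
    # unicode, or nesting), build a normal data structure and serialize it.
    def make_response(rows):
        # rows: a hypothetical list of dicts pulled from the database
        return json.dumps({"status": "ok", "rows": rows})

    # Instead of eval()ing the response text (which will happily execute
    # arbitrary code), parse it with a real JSON parser.
    def read_response(text):
        return json.loads(text)  # raises ValueError on malformed input

    if __name__ == "__main__":
        payload = make_response([{"id": 1, "name": "example"}])
        print(read_response(payload)["rows"][0]["name"])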
[02:08] $152/4TB HGST http://www.bhphotovideo.com/c/product/835055-REG/Hitachi_0S03359_4TB_Internal_Hard_Drive.html
[02:45] kyan: your archivebot node is experiencing errors; #archivebot
[02:46] yipdw, thanks for letting me know
[02:47] my walbase.cc grab is going up
[02:47] it's only a brute force grab of jpg images
[02:48] for the first 100000
[02:48] only 13977 files exist in this dump
[02:48] it is about 6.2GB
[03:57] wtf UIKit
[03:57] I select rows in a table view then ask for indexPathsOfSelectedRows on it
[03:57] you answer nil
[03:57] STOP LYING
[04:13] good work godane
[11:41] yuck, the proxmox web console uses Java in your browser :(
[12:09] Gotta love the internet I'm using atm. 2 hours 37 minutes to upload a 271MB file :/
[14:19] ohhdemgir: aren't you the guy from datahoarders @ reddit?
[14:20] midas, yes :) and I've only just realised my name isn't complete here... ohhdemgir < lulwut
[14:21] respect for the storage you got.
[14:21] cheers :D
[14:21] did you reach the 1PB yet or still at 900TB-ish? :p
[14:24] still 996TB raw online, and you're the first to find out I have another 120TB in drives waiting for a new chassis/card; those will likely go online mid/late Feb once I figure out which RAID card I want to buy
[14:25] geez
[14:25] i just ordered 40TB and was feeling like i was the king of all storage
[14:25] :D
[14:26] any 24-port card you can recommend? (BSD-friendly)
[14:26] i switched to ZFS, very BSD-friendly
[14:27] preferably no more than $800 (ish)
[14:27] so probably an IT-mode HBA and a SAS port expander would do?
[14:29] any solution cheaper than the last :p
[14:29] did you see the whole thread?
[14:30] dont think so, got a link?
[14:30] http://redd.it/1q0cjw
[14:31] why use an Areca + ZFS?
[14:31] why not use an IBM M1015 + port expander, or multiple M1015s? :P
[14:32] exactly, it was my first time at the ZFS rodeo
[14:33] ah, understandable
[14:33] learnt a little bit since those builds, worked out okay but a few things need swapping and changing this time around
[14:33] hah, loved the 32 USB drive idea, understand the performance "issue"
[14:34] ohh yeah, I've done many random things like that over the years, external seemed like a good idea at the time, not so much anymore xD
[14:35] haha yeah, the USB bus isnt that great actually. I had the same idea and well, i understand the issue, to put it nicely
[14:37] something else to account for with the upgrade will be using ECC RAM, after reading up a load more on ZFS!
[14:38] yeah, bitflips aren't fun when you have a lot of storage
[14:38] i just switched my system over to this board: http://www.asrock.com/server/overview.asp?Model=C2550D4I
[14:38] 12 drives local and can add another controller to it
[14:39] 32GB RAM
[14:39] ECC
[14:39] up to 64GB
[14:39] very nice, there was talk of that board in datahoarder not so long ago :)
[14:39] uses a lot less power, always nice, and it's a quad-core but you can get them as an octo-core also :)
[14:40] i think i saw that one yeah
[14:40] current box has a 24-port Areca, 24 disks, a Xeon and 6GB RAM, and is 4 years old.
time to replace it
[14:42] power may not be an issue at all soon, im thinking about hosting my main storage boxes at work as I'm selling my house and moving to a smaller place, though im being really indecisive about it :/
[14:43] * joepie91 waves at ohhdemgir
[14:43] here in .nl power is rather expensive :p
[14:43] if I do that im for sure going to have to get fiber as all my media will then be a city over
[14:43] hey joepie91, that's 3 chans
[14:43] ohhdemgir: perhaps you should mirror the Internet Archive :D
[14:43] joepie91, how big is it now?
[14:44] ohhdemgir: probably some 16 petabytes or so?
[14:44] and heh
[14:44] likely more channels in commo
[14:44] common *
[14:45] last i read it was around 11 or so, not that I could seriously consider mirroring xD
[14:50] midas, so what do you store?
[14:50] how much storage i have?
[14:50] what type of data/media do you store?
[14:50] i store the internet!
[14:51] nah, currently a couple of FTP grabs, about 10TB in size now
[14:51] rest of my storage is used for VMs and movies/series
[14:51] about 72TB raw online now
[14:52] but need to start offloading the FTP stuff for IA
[14:52] >ftp grabs.. hmm, vague, could be magical, details?
[14:53] ah yes, just a sec
[14:53] im currently mirroring ftp.tu-chemnitz.de ftp.uni-muenster.de gatekeeper.dec.com
[14:53] ftp.uni-erlangen.de ftp.warwick.ac.uk
[14:57] sorry, got lost in an xbmc directory in there somewhere :p
[14:57] nice :D
[14:58] still need to package these and upload them to IA
[15:16] ohhdemgir: if you ever have too much money, hi :)
[15:18] xD
[15:22] ohhdemgir: http://ascii.textfiles.com/archives/4199
[15:26] >we are in fact considering all FTP sites to be at risk at this point.
[15:26] shit :|
[15:29] yes, you might want to join in the party ohhdemgir
[15:29] "we have cookies"
[15:30] already started http://ftp.uni-erlangen.de/ for myself, excluding some of the distro stuff im mirroring from elsewhere
[15:31] we have a list somewhere
[15:32] someone do ftp://gamefiles.blueyonder.co.uk/
[15:34] Schbirid, started
[15:34] \o/
[15:36] ohhdemgir: what kind of bandwidth do you have at home?
[15:37] or are you grabbing it at work?
[15:37] work
[15:37] ah
[15:37] home sucks
[15:37] hah
[15:38] the 5 FTPs above that im currently grabbing are doing 10MB/s together, for the last 3 weeks or so
[15:38] on and off around that number anyway
[15:38] doing a du -sh now, lots of small files so it takes ages
[15:40] upstream is important too cuz the goal is to get these into archive.org eventually
[15:40] https://archive.org/details/ftpsites
[15:40] DFJustin: as soon as i can package them i can start uploading at 100mbit again :) takes forever to grab these big uni sites tho
[15:41] yeah, as far as I know sketchcow is still grabbing ftp.icm.edu.pl at like 3TB or something
[15:42] 3.2T ftp.tu-chemnitz.de
[15:42] and that's one of the smaller ones
[15:43] EFF ME!! I just did "$sudo halt" in the wrong ssh session ;(
[15:43] oops ;-)
[15:44] DFJustin, the only reason to do it on a work machine: nice fat 10Gbit pipe :D
[15:45] are there any plans for how to handle updates?
[15:46] so jelly
[15:46] at work we have cable internet
[15:46] Schbirid, any idea how big gamefiles.blueyonder.co.uk is?
I have 3.8TB left to play with on this box I'm mirroring to, then I need to move some stuff around
[15:47] no idea, i only know the quakeworld movie collection there is ~40G
[15:47] hmm, I'll keep an eye on it then as it's coming down fast lol
[15:48] yeah, blueyonder is an ISP i think
[15:48] yup
[15:48] i remember being with them in the late 90's
[15:49] kdirstat can index ftp sites iirc
[15:50] i had more fun with ncdu (saving the index) and curlftpfs though
[15:50] kdirstat was unable to re-read its own saved cache files
[15:51] shit, just checked ftp.uni-erlangen.de and I've already grabbed 56GB since midas mentioned it an hour ago
[15:52] thats nothing ;-) i've grabbed about 5TB of uni-erlangen ohhdemgir
[15:52] haven't mirrored anything so fast in a while, it's refreshing to not have to wait :D
[15:52] and the bitch keeps growing!
[15:52] hah
[15:53] uni FTPs tend to be rather fast
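[editor's note: the grabs above are done with standard mirroring tools (the transfer summary quoted later at 21:13 looks like wget output), but as a rough illustration of what a recursive FTP mirror does, here is a minimal Python sketch. The host and paths are placeholders, and it assumes the server supports the MLSD listing command:]

    import ftplib, os

    def mirror(ftp, remote_dir, local_dir):
        """Recursively copy remote_dir from an already-connected FTP server."""
        os.makedirs(local_dir, exist_ok=True)
        ftp.cwd(remote_dir)
        for name, facts in ftp.mlsd():          # MLSD yields (name, metadata) pairs
            if name in (".", ".."):
                continue
            if facts.get("type") == "dir":
                mirror(ftp, remote_dir.rstrip("/") + "/" + name,
                       os.path.join(local_dir, name))
                ftp.cwd(remote_dir)              # come back after recursing
            elif facts.get("type") == "file":
                with open(os.path.join(local_dir, name), "wb") as f:
                    ftp.retrbinary("RETR " + name, f.write)

    if __name__ == "__main__":
        host = "ftp.example.org"                 # stand-in, not one of the sites above
        ftp = ftplib.FTP(host)
        ftp.login()                               # anonymous login
        mirror(ftp, "/pub", "./mirror/" + host)
        ftp.quit()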
[16:05] I'm downloading jailbait
[16:05] >http://www.imdb.com/title/tt3290276/
[16:05] xD
[16:07] why are there so many things with jailbait in the title!! - https://www.google.co.uk/search?q=jailbait+imdb
[16:15] SketchCow: http://ascii.textfiles.com/archives/4183/comment-page-1#comment-576210
[16:32] joepie91, SketchCow what am I missing, why is the container rented and not owned?
[16:33] I'll carry on reading >>Second, I attempted some time ago to purchase the container
[16:33] lol
[16:35] >and if I did, it would be to empty out the container and order a container from another firm.
[16:36] do that, buy one, seems cheaper in the long run.
[16:43] midas, how do i upload to archive.org from the command line?
[16:47] https://pypi.python.org/pypi/internetarchive
[16:48] ohhdemgir: you'll want to use either ias3upload (for batch) or the Python internetarchive (with associated command-line client) for single or script automation
[16:48] where "batch" is "a CSV full of metadata and filenames"
[16:49] https://github.com/kngenie/ias3upload
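[editor's note: a minimal sketch of a single-item upload with the Python internetarchive package linked above; it assumes credentials have already been set up (e.g. via `ia configure`), and the identifier, filename and metadata values here are made up:]

    from internetarchive import upload

    # Hypothetical identifier and file; metadata keys follow archive.org's
    # standard fields (title, mediatype, collection, description, ...).
    item_id = "example-ftp-mirror-2014"
    responses = upload(
        item_id,
        files=["ftp.example.org.tar"],
        metadata={
            "title": "Mirror of ftp.example.org",
            "mediatype": "data",
            "collection": "opensource_media",   # ask for a move to a better collection later
            "description": "Recursive FTP grab, January 2014.",
        },
    )
    print([r.status_code for r in responses])   # one HTTP response per uploaded file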
[16:50] I just found a Reverbnation embed code pasted into a Microsoft Works document and uploaded to the Internet Archive. I don't think that's what they were trying to do…
[16:54] kyan: careful with the C&C of terrorist cells!
[16:58] heh
[17:19] i'm starting the upload of my PC Gamer Russian DVD
[17:19] i got it from rutracker too
[17:20] there is only 2007-09 so i don't think i will be getting more of these
[17:30] soo.... where can one download that later?
[17:30] https://archive.org/details/coverdiscs
[17:30] ok
[17:37] DFJustin, this is why i don't idle here often, my monthly downloads go from 40TB on average to I NEED MORE DRIVES!!
[17:38] link after link of "ohh I dont have this" XD
[17:38] the coverdiscs thing is new and still being cleaned up, a lot of it is still in the shareware cd collection
[17:39] I see, I usually wait until I know an archive is as complete as it's going to get, to save actively updating
[17:41] Fixed that for them.
:D https://archive.org/details/colorblind-bh-31jany2014
[17:42] also how the piss do you download 40TB in a month
[17:43] portable 4TB itx box, fill it at work daily, do nightly dumps to main storage, repeat
[17:43] sometimes as little as 15TB/mon, depends what I have in my list of things to grab
[17:43] but like do you just have an RSS feed of the piratebay going into a script or what
[17:44] I feel like picking the stuff would be the bottleneck
[17:45] the bulk of it is siterips or medical/educational texts, some months I go on a bluray spree though
[17:45] or game packs, console archives
[17:47] you can probably fit all the consoles into one or at most two 4TB loads though
[17:48] yeah I'm done with consoles now, have everything, just the new stuff now, PS3 being the biggest
[17:48] I guess PS3 bloats it up mind
[17:48] yup
[17:50] I basically download either things I specifically want, or packs that are of a sane size (<100GB)
[17:51] I'm a hoarder, hence my chan and love of the sub, but even then there aren't many people like me i guess
[17:51] I'm a hoarder, with a whopping… 2TB
[17:51] umm
[17:52] although I do have a 218GB siterip of yande.re that's fucking unseeded at 99.8%
[17:52] I think I drew the line once at some unpublished nasa research i couldn't make head nor tail of
[17:52] but I was going to send that onwards to archive.org and then delete
[17:52] >yande.re ahh fuckk you've done it again, link?
[17:52] or should I rerip?
[17:53] http://thepiratebay.se/torrent/6179006/oreno.imouto.org_full_siterip_14.02.2011
[17:56] it needs reripping anyway
[17:57] >14.02.2011
[17:57] yeah
[17:57] I'll do that later tonight
[17:57] >_>
[18:00] both have 0 seeders? does anyone have the files, for the 2 and 4 people waiting?
[18:03] it's zipped up, so only the original person would have the exact files
[18:04] oh
[18:05] I wonder why that person put two torrents up, with the same files... same day, two hours apart?
[18:07] obviously they're a fuckup
[18:07] I guess if it is identical files, the files from one of them might fill in the holes in the other
[18:08] yeah hehe
[18:08] tried that already
[18:08] ok
[18:08] in cases like this I suspect a flaky HDD causing checksums not to match
[18:08] aha
[18:10] ohhdemgir: if you're re-ripping it, note that some images have two versions for download, high-res jpg and png; the ripper of this one only got the jpg versions
[18:11] hmm, if you want to save me the trouble of fingerprinting later, and because you know the site better than I do, write me a oneliner to run?
[18:11] about fixing broken torrents, something I would love is a search tool that finds files (possibly with other filenames) that one torrent needs in other torrents.... I did that manually (getting missing parts from another torrent and renaming them so the torrent was complete again), but something automatic would be fun
[18:12] being able to create a zip or rar that matches the original would be wonderful too, to make "windows made" archives even though I use linux etc
[18:12] (when I have the files I mean)
[18:12] I think there are too many zip programs with different settings
[18:14] can't be too many to automate? I mean, so you can set the program to make the zip in all known ways, until it matches the torrent's checksum. ok, it might take some hours but would be ok
[18:15] it could have the most popular zip programs' "ways", and for rar it would be fewer variants
[18:17] and sometimes one already has the beginning of the zip, and just needs to fill in the end with files.
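[editor's note: the tool wished for above — finding files a torrent needs among files that may have been renamed — is essentially content-addressed lookup. A minimal sketch that indexes candidate files by size and SHA-1; note that .torrent files store piece hashes rather than per-file hashes, so in practice you would match on size first and then verify against the piece hashes. Recreating a byte-identical zip is much harder, since the compressor, compression level and timestamps all have to match, which is exactly the problem discussed above:]

    import hashlib, os

    def sha1_of(path, chunk=1 << 20):
        h = hashlib.sha1()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                h.update(block)
        return h.hexdigest()

    def build_index(roots):
        """Map (size, sha1) -> path for every file under the given directories."""
        index = {}
        for root in roots:
            for dirpath, _, names in os.walk(root):
                for name in names:
                    p = os.path.join(dirpath, name)
                    index[(os.path.getsize(p), sha1_of(p))] = p
        return index

    def find_matches(wanted, index):
        """wanted: list of (filename, size, sha1) derived from the torrent's file list."""
        return {name: index.get((size, digest)) for name, size, digest in wanted}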
[18:44] Askin' again - anyone used http://www.offtherecord-online.com/ or have other advice for decent ways to digitize a record w/o a turntable?
[18:44] Looking at buying an older 12" single that didn't come out on CD, don't own a turntable
[18:47] DFJustin, writing one? or?
[19:06] can't right now, probably won't later
[19:07] I don't actually do that kind of thing myself very often
[19:08] it's running danbooru so I'm sure there's a script out there
[19:09] e.g. https://archive.org/download/akibakko-siterip/akisucker.py was for a similar site
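[editor's note: a rough sketch of the "grab both versions" idea mentioned at 18:10, against a Moebooru/Danbooru-style site; akisucker.py linked above is the kind of script actually used. The base URL here is a placeholder, and the /post.json endpoint and the file_url/jpeg_url field names are assumptions that vary between sites:]

    import json, os, urllib.request

    BASE = "https://booru.example.net"        # stand-in for a Moebooru-style site

    def fetch(url, dest):
        if not os.path.exists(dest):           # skip files we already have
            urllib.request.urlretrieve(url, dest)

    def rip(pages=3, out="rip"):
        os.makedirs(out, exist_ok=True)
        for page in range(1, pages + 1):
            with urllib.request.urlopen(f"{BASE}/post.json?limit=100&page={page}") as r:
                posts = json.load(r)
            if not posts:
                break
            for post in posts:
                # Assumed fields: file_url is the original (often PNG),
                # jpeg_url is the large JPEG some sites offer alongside it.
                for key in ("file_url", "jpeg_url"):
                    url = post.get(key)
                    if url:
                        fetch(url, os.path.join(out, os.path.basename(url.split("?")[0])))

    if __name__ == "__main__":
        rip()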
[19:31] okies, I'll get to it in the early hours then, doing a bunch of stuff at once and getting ranted at on reddit for something I'm not quite following (bad english)
[20:16] so
[20:16] I made a little list of the servers I currently have
[20:16] https://vpsboard.com/topic/1022-down-to-4-vps/#entry51534
[20:16] cc ohhdemgir
[20:16] lol
[20:17] ... when you said vps hoarding...
[20:17] holy shit xD
[20:25] ohhdemgir: well I'm actually not hoarding VPSes
[20:25] joepie91 I was reading your VPSboard post, and tried out books.cryto.net… when I typed in a test author, no results, but when I typed in the letter "a" to see what would happen, the exact book I'd had in mind for the first search was the 5th result in the list. Have you developed a search engine with ESP?! XD
[20:25] (most people in that channel are)
[20:25] ohhdemgir: I have an intentionally spread out pile of servers
[20:25] you'll notice a lot of variation in location
[20:25] and if you look into the datacenters, you'll notice that even VPSes in the same physical city/state/country are in different datacenters
[20:25] for what reason (i can think of many..)
[20:26] it's mostly just a giant grid of servers that are redundant on every level - locality-wise, facility-wise, and provider-wise
[20:26] kyan: hehe
[20:26] ohhdemgir: redundancy :)
[20:26] madness
[20:26] ohhdemgir: moment, let me screenshot my tahoe-lafs grid
[20:26] or well, part of it anyway
[20:27] "madness" says the guy with 1PB IN HIS HOUSE
[20:27] 662,167,252 35.8MB/s eta 13m 3s
[20:27] DFJustin, okay okay :p
[20:27] ohhdemgir, my current tahoe-lafs grid: http://awesomescreenshot.com/0602acx788
[20:28] still have to set up a few nodes I'm pretty sure
[20:28] hehe
[20:28] anyway, ohhdemgir, I'm working on a distributed services grid, somewhat resembling the model of AWS, except free for non-profit projects
[20:28] storage is one part of that, but I'll also be implementing a pile of other things
[20:29] monitoring, bulk email, scraping, whatever
[20:29] so redundancy is more or less a must :)
[20:29] joepie91, I'm impressed :D
[20:30] SketchCow, what are you getting up to
[20:31] joepie91, wow, that's quite a mission
[20:31] very cool project
[20:31] 25GB
[20:32] 25GB in 10 minutes, according to my downloader.
[20:34] today was my 30th birthday.
[20:34] https://plus.google.com/u/0/photos/107105551313411539773/albums/5975098971348711681/5975098971078835330 happened...
[20:34] happy 30th Smiley
[20:35] ouch...
[20:35] lol thanks :D
[20:35] i went to a knife fight, and the knife won.
[20:35] xD
[20:36] don't bring a finger to a knife fight
[20:36] ^^
[20:37] fucking ISP
[20:37] how hard is it to do TCP connections right
[20:37] * joepie91 reads online logs of channel to see what he missed
[20:38] kyan, ohhdemgir, it's quite a large project :P
[20:38] I'm sadly not even close to being done
[20:38] biggest priority right now is an S3-like API for the storage grid
[20:38] second biggest priority would be a CDN frontend for it (probably just using GeoDNS; setting up an anycast network is going to be too expensive for now due to IP allocations and such)
[20:39] which involves a few nodes hooked up to the tahoe-lafs grid, but caching 500+GB of data locally
[20:39] that should in theory provide good performance, despite the latency issues in tahoe-lafs
[20:41] after that, I'll continue working on Nexus, my message routing project
[20:41] which will be the backbone for all other services
[20:41] and task distribution, as well as (in the long run) my monitoring system, which can fairly easily be tunneled over Nexus
[20:41] lol
[20:41] so basically, infinite todo list
[20:53] alright then
[20:53] time to sleep
[20:53] goodnight all :)
[21:08] baiiii
[21:13] Schbirid, > Downloaded: 5667 files, 148G in 4h 24m 3s (9.54 MB/s)
[21:14] that's all of ftp://gamefiles.blueyonder.co.uk/blueyondergames/blueyondergames/ (without /movies, I'll do that next)
[21:24] scrap that, that's with the /movies, im dumb
[21:33] hey guys it's all better now http://www.politico.com/story/2014/01/tsa-isnt-looking-at-you-naked-anymore-102945.html
[21:34] that was my fetish
[22:12] DFJustin, high compression zip or just store for uploading to IA?
[22:30] both are supported, so whichever is more convenient for you
[22:31] going with torrent upload too, but not understanding this "just create the item, make a torrent with your files in it, name it like the item and upload it to the item"
[22:31] i know how to make torrents, but where do i upload it...
[22:31] torrent upload "works" but doesn't work nicely with all aspects of the system
[22:32] it's more of a workaround if you have shitty upstream that drops all the time
[22:33] opted for torrent as i was going to use ias3upload but can't grasp metadata.csv, never felt so daft in all my life...
[22:34] ias3upload is for if you have hundreds of similar things to go up into separate items, like ebooks one by one
[22:34] the python internetarchive thing is friendlier I think
[22:34] It's superior.
[22:34] personally I would upload an item or two using the web interface first to get an idea of how it works
[22:35] All our effort is going into it and I have been relentless in the testing.
[22:35] And yes, web interface is good for trying it out and learning.
[22:38] internet archive drive
[22:38] it's like Google Drive but better
[22:39] to back up a step, the basic organization of the internet archive is a thing called an "item". each item has associated metadata (title, description, tags, ...) and can have one or more files in it
[22:39] an item can belong to one or more "collections", e.g. the shareware cd archive
[22:40] there are a handful of collections which are open access for all users and have names like "community" or "open source"; most are restricted to people with appropriate privileges
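[editor's note: a minimal sketch of the item/metadata/collection structure described above, again using the Python internetarchive package; the identifier reuses the made-up one from the earlier upload sketch, and the point of the last call is that metadata edits are a separate operation rather than a re-upload:]

    from internetarchive import get_item, modify_metadata

    # Hypothetical identifier from the earlier upload example.
    item = get_item("example-ftp-mirror-2014")

    # Every item carries a metadata dict (title, description, collection, ...).
    print(item.metadata.get("title"))
    print(item.metadata.get("collection"))      # one or more collections

    # Editing a field later is a metadata call against the existing item.
    modify_metadata("example-ftp-mirror-2014",
                    metadata={"subject": "ftp;mirror;archiveteam"})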
[22:40] It's been forever, but I'm looking at youtube-dl again so I can try and make sense of what it actually does and rewrite it into something that explains what the Hell is going on.
[22:41] if you go to https://archive.org/upload/ and upload something smallish like a pdf, it will walk you through the process
[22:42] done with test.txt, just looking around at options and such :3
[22:44] once you've uploaded something, going to "edit item" and changing metadata shows the standard metadata fields
[22:45] I'm there lol
[22:51] dazed and confused, flustered even, damn.
[22:53] generally we tell people to upload site archives to the opensource ("Community Texts") or opensource_media ("Community Media") collections and then ask for it to be moved
[23:37] happy birthday, smiley!
[23:38] w0rp: also check out get_flash_videos
[23:41] Eww, Perl.
[23:49] :D