[00:25] norbert79 is everywhere hmm [00:46] he's watching you... [00:46] *cue X-Files music* [01:13] i'm grabbing a complete Dilbert Comic Strips Collection [04:13] Hello. I've gotten a very rare recording of a piece of music (unreleased, by a terminally ill composer). Does anyone know if it's possible to request that it be added to IA preemptively darked? [04:43] kyan: talk to SketchCow and undersco2 [04:45] kyan: So that it be kept for 90 years before being released? [04:46] Lord_Nigh, ok, I'll email SketchCow, got his email here. Thanks. :) [04:47] odie5533, blame the lobbyists, not the archiver ;) [04:47] i thought it was 100 not 90 [04:48] and the lobbyists are gearing up to push it to 120 in 2018 [04:48] http://copyright.cornell.edu/resources/publicdomain.cfm [04:49] looks like 70 years after death [04:51] huh, the length of that page says something about how much too complicated things are… [04:52] also, that page only deals with U.S. based copyrights in the U.S. [04:52] it's much more complicated when you consider non-U.S. works. [04:53] here's a page that deals with international copyright law: https://commons.wikimedia.org/wiki/Commons:Copyright_rules_by_territory [04:53] * kyan thinks the whole economic system needs an overhaul [04:53] Oh, maaaan. [04:53] Definitely too confusing. [04:54] kyan: just submit it [04:54] if it needs to be darked, IA will dark it [04:54] already wrote to SketchCow :) [04:54] forgiveness over permission, etc. [04:54] copyright law is too fucked to bother with [04:54] I don't want ME potentially liable over it! [04:54] I'm not sure how you could be [04:55] Oh well. Time for me to go to bed now… overthrowing the global economic system will have to wait for tomorrow [04:55] well, no, I have SOME idea how you could be, but I wouldn't bother worrying about it [04:56] Oh well, I'm paranoid… [04:56] :-S [05:32] In the frozen north it is still 50 years. [05:53] Glad I could help. 
[08:22] https://boards.4chan.org/v/res/217458291 [08:32] :P [08:32] DFJustin: not very impressive [08:33] okay, the lasagna heatsink is a bit bad [08:33] lol [08:34] https://images.4chan.org/v/src/1384065345956.jpg <-- goddamn [08:35] DFJustin: oh damn, there's some great stuff there [08:35] the heatsink fan that said 'fuck this, i'm outta here' [08:35] dead rat in the pc case [08:37] the abrasive grinding wheel in the cd drive gave me a fit of giggles [10:01] not in immediate danger of disappearing, but ftp.bpmmicro.com has a ton of stuff on it for bp/bpmmicro products dating back to the 1990s [10:02] I was working on a mirror of it and have downloaded about 38.9gb of data so far, not done yet [10:02] mostly stuff from the past 5 years [10:24] Lord_Nigh: how big is it in total? [10:25] no idea [10:25] still downloading [10:25] but i'm not able to create proper warc dumps here [10:26] since the warc patch hasn't been merged into wget in cygwin yet [10:26] Lord_Nigh: do you have a rough estimate? 1GB, 10GB, 100GB, 1TB, 10TB, ... [10:26] ftp in warc doesn't seem to be a common thing anyway [10:26] that kind of estimate [10:26] just want to know if I'll be able to mirror it from my OVH box [10:27] for ftps we've just been uploading file zips/tars [10:29] Lord_Nigh: windows wget binary of 1.14: http://kemovitra.blogspot.ca/2013/02/download-wget-for-32-bit-and-64-bit.html [10:40] joepie93: well it's ~40gb so far [10:40] dunno how much more is left [11:05] Lord_Nigh: any rough estimate? like, how many of how many folders have you had so far? [11:05] root folders? 
[11:06] there's 3 root folders, dnload, mktg and upload [11:06] dnload i have mirrored, mktg is downloading, and i didn't get to upload [11:08] i'm up to mktg/video/flashstream, haven't done helix, media_transfer or anything else past 'f' in the alphabet in there [11:10] I see [11:10] thank you :) [11:10] I'd wager a very rough guess at around 100-200GB then [11:11] which means that after cleaning up I should be able to mirror on my OVH box [11:11] but I'll need to process the isohunt stuff first [11:11] because it's some 250GB [11:16] joepie93: is it possible to first create a --mirror and afterwards a warc file? [11:17] M1das_: how do you mean? [11:18] like i said, first mirror the data and afterwards compressing it to a warc file [11:19] because i don't think wget is very useful for ftp, as it's just folders [11:19] there is no linking being done [11:19] M1das_: WARC isn't a compression format [11:19] it's an archive format [11:20] (there is a difference) [11:20] compare WARC to tar, not to gzip [11:20] WARC basically just concatenates all requests and responses including all headers into one large file, which can be compressed afterwards (hence wget writing .warc.gz files) [11:20] k, so that isn't a possible solution then [11:21] that also means that a WARC is created from the raw request/response data, thus you cannot recreate it after the fact [11:21] from just a bunch of files [11:21] seeing as all headers will have been discarded by then [11:21] M1das_: to better understand WARC, do something like cat something.warc.gz | gzip -cd > something_unpacked [11:21] and open it in a text editor [11:21] or a hex editor, but WARC itself is a text format [11:22] okay [11:22] would recommend doing it on a small WARC like from archivebot [11:22] you probably don't want to load 5GB into your text editor, unless you're using textpad :) [11:22] or less or something [11:22] heh, yeah not really useful [11:23] (textpad is the only GUI text editor I am aware of that can comfortably 
load and display multi-gigabyte text files without problems) [11:23] (it's kinda impressive really) [12:08] am I allowed to run this yet? :) [12:08] oops, wrong channel [12:09] joepie93: From a coder's perspective, I believe it's not so impressive (not that I've done it) [14:00] I have seen Vim struggle with large files. [14:06] well that did it, everybody is scared of vim for some reason [14:10] haha [14:24] It is Satan's text editor. [14:24] VI VI VI [14:29] lol w0rp :P [14:53] I need to get or write a script which downloads entire YouTube channels with resume and downloading of metadata like titles and descriptions. [15:02] w0rp: that'd be youtube-dl [15:02] :) [15:24] touya: Yes, I am :) [15:29] TIL: Netscape 6 supports border-radius [18:25] I never got around to building an Emacs OS... with a decent text editor (VIM) [18:35] The raging fury association of freetards that like to battle their choice of editors called and invited you [18:35] (Editor choice wars are so incredibly boring) [18:36] ANNOUNCEMENT: Join #angerthehyve for the new Hyves archiving project! Works with the Warrior! [/advertising-voice] [18:52] I need to get a computer case which is better designed for getting disks in and out. It took me something like an hour to replace a hard disk, configuration included. [18:53] Part of the problem being that my GPU is now so huge it barely fits in the case. [19:00] i have the perfect case for you [19:00] easy access to all your drives [19:00] http://hacktheplanet.nl/img/srv06.jpg [19:09] That's a pretty serious case. [19:41] this may burn eyes: https://scontent-a-lga.xx.fbcdn.net/hphotos-prn2/p480x480/1375730_499611060137001_1516016417_n.jpg [19:50] I have seen that image too many times. [20:00] I can't tell if it's weirder that you've seen it before, or weirder that I've never seen it before. [20:03] I've looked at too much 4chan, so I've seen most of the semi-popular weird shit. 
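Editor's note on the WARC exchange earlier in the log: the "archive format, not compression format" point can be sketched in a few lines of Python. This is a simplified illustration, not real wget output. The URIs and payloads are made up, and the records leave out fields a real WARC requires (such as WARC-Record-ID and WARC-Date).

```python
import gzip

def make_record(uri: str, body: bytes) -> bytes:
    """Build one simplified WARC-style record: text headers, blank line, payload."""
    headers = (
        "WARC/1.0\r\n"
        "WARC-Type: resource\r\n"
        f"WARC-Target-URI: {uri}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("utf-8")
    # Each record is terminated by two CRLFs.
    return headers + body + b"\r\n\r\n"

# Hypothetical URIs and payloads, for illustration only.
records = [
    make_record("ftp://example.invalid/a.txt", b"first file"),
    make_record("ftp://example.invalid/b.txt", b"second file"),
]

archive = b"".join(records)         # archiving: plain concatenation, like tar
packed = gzip.compress(archive)     # compression: a separate, later step
unpacked = gzip.decompress(packed)  # the same thing gzip -cd does on the command line

print(unpacked.decode("utf-8"))
```

The printed result is readable text with the headers in front of each payload, which is why opening an unpacked WARC in a text editor is a reasonable way to get a feel for the format.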
[20:16] "* We generated blog archives for every Xangan who has signed into the site in the past 5 years, as long as they have more than two subscribers (to rule out spam)." http://xanga.com/ [20:27] ivan`: only accessible to the users themselves [21:14] after this: http://www.gluster.org/2013/08/how-far-the-once-mighty-sourceforge-has-fallen/ i'm wondering whether sf.net is gonna die soon... [21:25] Lord_Nigh: well, after the fuckup with the SVN migration most already left it.. [21:26] i noticed. [21:26] worth AT trying to save what's left? [21:29] Lord_Nigh: I think you were the one who asked about VHS copies of Ren & Stimpy episodes right? [21:31] Lord_Nigh: of course! [21:33] yes i did [21:34] the first few airings of each episode were supposedly uncut, then the censors started deleting random parts, and even more got cut when commercial breaks got longer [21:34] sf.. ah..the stupid installer thing... if you're careful you can still get correct/no-crapware versions. and depends on the project [21:34] yes, apparently filezilla does have a spyware-free link but its buried [21:36] so, I do have a couple of episodes- they're part of the two Snick tapes I have (4 shows on one tape) [21:38] there's an episode of four different shows on the tape, roughly what you might see if you were watching Snick live [21:38] Not sure why Ren & Stimpy needs archiving? The uncut episodes were released on DVD a while ago, no? [21:39] http://www.amazon.com/dp/B0002NY8XA [21:41] Huh - maybe not. Still edits on 4 eps. [21:42] Strange - most of those 'edited' scenes are still aired in here in the UK, I'm certain. [22:11] i think Jason Scott should go to Japan at some point [22:12] Eventually. 
[22:12] i'm watching a Japanology about the used book trade over there [22:13] one guy just collects books with notes in them [22:24] wow [22:24] you have been working on the arcade doc since 2006 [22:26] i'm watching the wheel of computer history from Phreaknic X [22:34] Less stalking, more talking [22:34] I'm being told some of the archiveteam infrastructure is broken, and it's time to fix it. [22:52] not necessarily broken, but creating warrior projects is heavily bottlenecked because there are very few people who seem to be able to create the pipeline.py scripts [22:52] well... [22:52] urlteam is broken afaik [22:52] pipeline/seesaw is nearly undocumented [22:52] (and there's a bunch of bugs and things that need to be fixed/changed) [22:53] there's also a delay in adding stuff to warrior projects that should probably be looked at (not enough people or not the right people have access to change this?) [22:53] and there appears to generally not be enough people with time to write pipeline scripts [22:53] that's what I can think of off the top of my head [22:53] (I have a list with things to improve/fix somewhere, moment) [22:55] http://sprunge.us/TIiX [22:55] preliminary notes [22:55] collected from user feedback during isohunt-grab [22:55] http://www.theverge.com/2013/11/9/5084630/three-20-year-olds-build-their-own-version-of-healthcare-gov [22:57] the unicode bug in the tracker is absolutely critical [22:57] it is freely (and possibly accidentally) exploitable [22:58] one person with a unicode nick and the claims page is dead [22:58] entirely [22:58] previously, also the entire public tracker dashboard itself, but I believe that was duct-tape [22:58] duct-taped * [23:15] This is why the world needs to move to Python 3, which shouts at you if you try to mix str and bytes. Because Python 2 strings aren't really strings. 
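Editor's note: the "shouts at you" behaviour mentioned in the last message can be seen in a short Python 3 sketch. The nick below is made up, and nothing here is taken from the tracker code itself. It only shows the general str/bytes rule.

```python
nick = "jöepie"             # str: a sequence of Unicode code points (hypothetical nick)
raw = nick.encode("utf-8")  # bytes: what actually travels over the wire

try:
    # Python 3 refuses to mix the two types implicitly.
    combined = "nick: " + raw
except TypeError as exc:
    combined = None
    print("refused:", exc)

# The fix is an explicit decode with a named encoding.
fixed = "nick: " + raw.decode("utf-8")
print(fixed)
```

Python 2 would have attempted an implicit ASCII conversion here, which tends to work until the first non-ASCII character shows up.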
[23:24] there's a cool ascii_with_complaints codec for Python 2 that complains about implicit conversions [23:24] also "the world needs to move to Python 3" is like saying "people should spend tens of millions of dollars against their interests because a new incompatible language of the same name appeared" [23:25] at least, that's what the people with big Python 2 codebases hear [23:25] I don't get why Python uses either UCS2 or UCS4 internally. That sucks. [23:25] you want UCS4 only? [23:26] Why not UTF-16? [23:27] internal representations shouldn't really matter, e.g. Python 3.3 uses 8 bits internally if a str fits into ASCII [23:27] you really just want all the functions to support out-of-BMP codepoints properly (e.g. count them as one codepoint, not two) [23:28] Yeah, as long as characters are supported. I just assume that on some platforms it breaks without different compilation options. [23:30] not-so-amusingly when you do this right on the server, the server and client (JavaScript) disagree on the length of things [23:32] unless you write your own string code, heh http://stackoverflow.com/questions/6885879/javascript-and-string-manipulation-w-utf-16-surrogate-pairs [23:33] I was just looking at the similar thing. [23:34] I suppose you could have a sizeForEncoding function in JavaScript. [23:35] I'll switch to Python 3 as soon as Twisted releases a Python 3 compatible version of their library. [23:36] An answer below does show an example of a string that would report a wrong character length in JavaScript. [23:36] I'll switch to Python 3 as soon as I can mindlessly install a Python module without worrying about versions [23:36] joepie91: Use virtualenv. [23:37] odie5533: that doesn't solve the issue [23:37] how not? [23:37] there are too many things just not vailable for 3 [23:37] available * [23:37] that is what I mean [23:37] oh. [23:37] There's a cool website watching projects for Python 3 support. https://python3wos.appspot.com/ [23:37] well, yeah. 
The main thing for me is Twisted at this point. [23:37] MySQL is another big one. [23:38] there's no python 3 connector for mysql? [23:38] I'm not sure. The main one doesn't work. [23:38] w0rp: there are. pymsql for instance works with Python 3. [23:39] (Which is one of the reasons why I use Postgres at home.) [23:39] *mymysql [23:39] py... you know what I mean. [23:39] I can build a Django site at home with Python 3, so I'm happy. [23:40] just google python 3 mysql and you get at least 3 different projects out there that connect MySQL and Python 3. [23:41] maybe a good idea to put this all on a collaboration so everyone has access to see what needs to be changed? that way it's possible to streamline it a bit more in the future :) [23:41] hmm, the maintainers of Twisted apparently aren't even working on Python 3 support :( [23:42] I *think* some asynchronous stuff is becoming part of the language itself, so Twisted might be replaced by something else in the far future. [23:42] I think this even involved collaboration with Twisted people. [23:42] w0rp: the new Tulip library doesn't have nearly as many features as Twisted. [23:42] odie5533: oursql is what you'll want to use for mysql [23:42] which works in py 2 AND 3 [23:42] and actually does real parameterization [23:42] and is production-ready [23:42] Canonical paid for a lot of Python 3-ization for Twisted, it used to be completely nonexistent [23:43] Well, nothing new ever does match the previous feature set of something else. [23:43] cc w0rp [23:43] w0rp: it will include an async reactor, which Twisted can use, but it won't have an SSH, IRC, XMPP, etc. client. [23:43] http://pythonhosted.org/oursql/ [23:43] w0rp: well, I don't think they are even planning to support half the protocols that Twisted does. [23:44] I can foresee another third party library, or a bunch of libraries, built with Tulip somehow. [23:44] I'm surprised Twisted was so low on that pypi python 3 list.
[23:45] w0rp: probably going to be the bunch of libraries. Which is part of why I like Twisted: it has everything I could possibly need. [23:45] well, almost [23:46] I'm surprised that gunicorn has more downloads than Twisted. [23:47] I think perhaps that most people don't get Twisted from PyPi. [23:48] Windows .msi users? [23:48] or rather, that a lot of people don't get it from PyPi, whereas perhaps most people get some of the libraries more from there. I could be completely wrong though. [23:48] But personally, I use the Windows binaries from the Twisted site. [23:48] pip on Windows does kind of suck if you don't have a compiler installed properly. [23:48] sucks even if you do.. [23:49] pip on Linux and Mac is pretty great. [23:49] You'd need to compile multiple dependencies before anything would work.
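Editor's note on the earlier exchange about the server and a JavaScript client disagreeing on string length: a minimal Python sketch, assuming Python 3.3 or later on the server side. The sample string is made up. `utf16_units` is an illustrative stand-in for what a JavaScript `.length` would report, since JavaScript counts UTF-16 code units.

```python
# One character outside the Basic Multilingual Plane, between two ASCII letters.
text = "a\U0001F600b"

# Python 3.3+ counts code points, so the emoji is one element.
code_points = len(text)

# UTF-16 needs a surrogate pair (two 16-bit units) for the emoji.
utf16_units = len(text.encode("utf-16-le")) // 2

print(code_points, utf16_units)
```

If a length limit is validated on both sides, one option is to pick a single unit (code points, UTF-16 units, or encoded bytes) and compute it the same way on server and client.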