#archiveteam-bs 2013-11-10,Sun


Time Nickname Message
00:25 πŸ”— touya norbert79 is everywhere hmm
00:46 πŸ”— BlueMax he's watching you...
00:46 πŸ”— BlueMax *cue X-Files music*
01:13 πŸ”— godane i'm grabbing a complete Dilbert Comic Strips Collection
04:13 πŸ”— kyan Hello. I've gotten a very rare recording of a piece of music (unreleased, by a terminally ill composer). Does anyone know if it's possible to request that it be added to IA preemptively darked?
04:43 πŸ”— Lord_Nigh kyan: talk to SketchCow and undersco2
04:45 πŸ”— odie5533 kyan: So that it be kept for 90 years before being released?
04:46 πŸ”— kyan Lord_Nigh, ok, I'll email SketchCow, got his email here. Thanks. :)
04:47 πŸ”— kyan odie5533, blame the lobbyists, not the archiver ;)
04:47 πŸ”— Lord_Nigh i thought it was 100 not 90
04:48 πŸ”— Lord_Nigh and the lobbyists are gearing up to push it to 120 in 2018
04:48 πŸ”— odie5533 http://copyright.cornell.edu/resources/publicdomain.cfm
04:49 πŸ”— odie5533 looks like 70 years after death
04:51 πŸ”— kyan huh, the length of that page says something about how much too complicated things are…
04:52 πŸ”— odie5533 also, that page only deals with U.S. based copyrights in the U.S.
04:52 πŸ”— odie5533 it's much more complicated when you consider non-U.S. works.
04:53 πŸ”— odie5533 here's a page that deals with international copyright law: https://commons.wikimedia.org/wiki/Commons:Copyright_rules_by_territory
04:53 πŸ”— * kyan thinks the whole economic system needs an overhaul
04:53 πŸ”— kyan Oh, maaaan.
04:53 πŸ”— kyan Definitely too confusing.
04:54 πŸ”— yipdw_ kyan: just submit it
04:54 πŸ”— yipdw_ if it needs to be darked, IA will dark it
04:54 πŸ”— kyan already wrote to SketchCow :)
04:54 πŸ”— yipdw_ forgiveness over permission, etc.
04:54 πŸ”— yipdw_ copyright law is too fucked to bother with
04:54 πŸ”— kyan I don't want ME potentially liable over it!
04:54 πŸ”— yipdw_ I'm not sure how you could be
04:55 πŸ”— kyan Oh well. Time for me to go to bed now… overthrowing the global economic system will have to wait for tomorrow
04:55 πŸ”— yipdw_ well, no, I have SOME idea how you could be, but I wouldn't bother worrying about it
04:56 πŸ”— kyan Oh well, I'm paranoid…
04:56 πŸ”— kyan :-S
05:32 πŸ”— phillipsj In the frozen north it is still 50 years.
05:53 πŸ”— SketchCow Glad I could help.
08:22 πŸ”— DFJustin https://boards.4chan.org/v/res/217458291
08:32 πŸ”— joepie93 :P
08:32 πŸ”— joepie93 DFJustin: not very impressive
08:33 πŸ”— joepie93 okay, the lasagna heatsink is a bit bad
08:33 πŸ”— joepie93 lol
08:34 πŸ”— yipdw_ https://images.4chan.org/v/src/1384065345956.jpg <-- goddamn
08:35 πŸ”— Lord_Nigh DFJustin: oh damn, theres some great stuff there
08:35 πŸ”— Lord_Nigh the heatsink fan that said 'fuck this, i'm outta here'
08:35 πŸ”— Lord_Nigh dead rat in the pc case
08:37 πŸ”— Lord_Nigh the abrasive grinding wheel in the cd drive gave me a fit of giggles
10:01 πŸ”— Lord_Nigh not in immediate danger of disappearing, but ftp.bpmmicro.com has a ton of stuff on it for bp/bpmmicro products dating back to the 1990s
10:02 πŸ”— Lord_Nigh I was working on a mirror of it and have downloaded about 38.9gb of data so far, not done yet
10:02 πŸ”— Lord_Nigh mostly stuff from the past 5 years
10:24 πŸ”— joepie93 Lord_Nigh: how big is it in total?
10:25 πŸ”— Lord_Nigh no idea
10:25 πŸ”— Lord_Nigh still downloading
10:25 πŸ”— Lord_Nigh but i'm not able to create proper warc dumps here
10:26 πŸ”— Lord_Nigh since warc patch hasn't been merged into wget in cygwin yet
10:26 πŸ”— joepie93 Lord_Nigh: do you have a rough estimate? 1GB, 10GB, 100GB, 1TB, 10TB, ...
10:26 πŸ”— DFJustin ftp in warc doesn't seem to be a common thing anyway
10:26 πŸ”— joepie93 that kind of estimate
10:26 πŸ”— joepie93 just want to know if I'll be able to mirror it from my OVH box
10:27 πŸ”— DFJustin for ftps we've just been uploading file zips/tars
10:29 πŸ”— odie5533 Lord_Nigh: windows wget binary of 1.14: http://kemovitra.blogspot.ca/2013/02/download-wget-for-32-bit-and-64-bit.html
10:40 πŸ”— Lord_Nigh joepie93: well its ~40gb so far
10:40 πŸ”— Lord_Nigh dunno how much more is left
11:05 πŸ”— joepie93 Lord_Nigh: any rough estimate? like, how many of how many folders have you had so far?
11:05 πŸ”— Lord_Nigh root folders?
11:06 πŸ”— Lord_Nigh theres 3 root folders, dnload, mktg and upload
11:06 πŸ”— Lord_Nigh dnload i have mirrored, mktg is downloading, and i didn't get to upload
11:08 πŸ”— Lord_Nigh i'm up to mktg/video/flashstream, havent done helix, media_transfer or anything else past 'f' in the alphabet in there
11:10 πŸ”— joepie93 I see
11:10 πŸ”— joepie93 thank you :)
11:10 πŸ”— joepie93 I'd wager a very rough guess at around 100-200GB then
11:11 πŸ”— joepie93 which means that after cleaning up I should be able to mirror on my OVH box
11:11 πŸ”— joepie93 but I'll need to process the isohunt stuff first
11:11 πŸ”— joepie93 because it's some 250GB
11:16 πŸ”— M1das_ joepie93: is it possible to first create a -mirror and afterwards a warc file?
11:17 πŸ”— joepie93 M1das_: how do you mean?
11:18 πŸ”— M1das_ like i said, first mirror the data and afterwards compressing it to a warc file
11:19 πŸ”— M1das_ because i dont think wget is very useful for ftp, as it's just folders
11:19 πŸ”— M1das_ there is no linking being done
11:19 πŸ”— joepie93 M1das_: WARC isn't a compression format
11:19 πŸ”— joepie93 it's an archive format
11:20 πŸ”— joepie93 (there is a difference)
11:20 πŸ”— joepie93 compare WARC to tar, not to gzip
11:20 πŸ”— joepie93 WARC basically just concatenates all requests and responses including all headers into one large file, which can be compressed afterwards (hence wget writing .warc.gz files)
11:20 πŸ”— M1das_ k, so that isnt a possible solution then
11:21 πŸ”— joepie93 that also means that a WARC is created from the raw request/response data, thus you cannot recreate it after the fact
11:21 πŸ”— joepie93 from just a bunch of files
11:21 πŸ”— joepie93 seeing as all headers will have been discarded by then
11:21 πŸ”— joepie93 M1das_: to better understand WARC, do something like gzip -cd something.warc.gz > something_unpacked
11:21 πŸ”— joepie93 and open it in a text editor
11:21 πŸ”— joepie93 or a hex editor, but WARC itself is a text format
11:22 πŸ”— M1das_ okay
11:22 πŸ”— joepie93 would recommend doing it on a small WARC like from archivebot
11:22 πŸ”— joepie93 you probably don't want to load 5GB into your text editor, unless you're using textpad :)
11:22 πŸ”— joepie93 or less or something
11:22 πŸ”— M1das_ heh, yeah not really useful
11:23 πŸ”— joepie93 (textpad is the only GUI text editor I am aware of that can comfortably load and display multi-gigabyte text files without problems)
11:23 πŸ”— joepie93 (it's kinda impressive really)
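A minimal sketch of the point joepie93 makes above, assuming a hand-built record rather than real wget output. The URL and payload are placeholders, and mandatory WARC fields such as WARC-Record-ID and WARC-Date are left out for brevity. It shows that a .warc.gz is plain-text headers plus the raw response, gzip-compressed afterwards, which is also why the headers cannot be rebuilt later from a folder of mirrored files.

```python
import gzip

# Hypothetical, hand-written HTTP response; a real WARC stores whatever the server sent.
http_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)

# Simplified WARC record headers; the target URI is a placeholder.
warc_headers = (
    b"WARC/1.0\r\n"
    b"WARC-Type: response\r\n"
    b"WARC-Target-URI: http://example.com/\r\n"
    b"Content-Type: application/http; msgtype=response\r\n"
    b"Content-Length: " + str(len(http_response)).encode("ascii") + b"\r\n"
    b"\r\n"
)

# A record is headers, then the raw payload, then a blank-line separator.
record = warc_headers + http_response + b"\r\n\r\n"


def parse_warc_headers(unpacked: bytes) -> dict:
    """Return the named header fields of the first record as a dict."""
    head, _, _ = unpacked.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    fields = {}
    for line in lines[1:]:  # lines[0] is the "WARC/1.0" version line
        name, _, value = line.partition(": ")
        fields[name] = value
    return fields


packed = gzip.compress(record)      # what ends up on disk as something.warc.gz
unpacked = gzip.decompress(packed)  # same effect as gzip -cd on that file
fields = parse_warc_headers(unpacked)
print(unpacked.decode("utf-8"))
```

Printing the unpacked bytes gives the same kind of text you would see when opening a small unpacked WARC in a text editor.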
12:08 πŸ”— antomatic am I allowed to run this yet? :)
12:08 πŸ”— antomatic oops, wrong channel
12:09 πŸ”— Tomcat_ joepie93: From a coder's perspective, I believe it's not so impressive (not that I've done it)
14:00 πŸ”— w0rp I have seen Vim struggle with large files.
14:06 πŸ”— M1das_ well that did it, everybody is scared of vim for some reason
14:10 πŸ”— joepie93 haha
14:24 πŸ”— w0rp It is Satan's text editor.
14:24 πŸ”— w0rp VI VI VI
14:29 πŸ”— M1das_ lol w0rp :P
14:53 πŸ”— w0rp I need to get or write a script which downloads entire YouTube channels with resume and downloading of metadata like titles and descriptions.
15:02 πŸ”— joepie93 w0rp: that'd be youtube-dl
15:02 πŸ”— joepie93 :)
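A rough sketch of how youtube-dl could be driven for what w0rp asks (resume plus metadata), assuming the youtube-dl command-line tool is installed. The channel URL and archive file name are placeholders, and the helper function is hypothetical. It only builds the command and does not run it.

```python
import shlex


def build_youtube_dl_command(channel_url, archive_file="downloaded.txt"):
    """Build a youtube-dl argument list for a resumable channel grab."""
    return [
        "youtube-dl",
        "--continue",                        # resume partially downloaded files
        "--ignore-errors",                   # keep going if one video fails
        "--download-archive", archive_file,  # remember finished videos between runs
        "--write-description",               # save each description to a .description file
        "--write-info-json",                 # save title and other metadata as .info.json
        channel_url,
    ]


# Placeholder channel URL, not a real target.
cmd = build_youtube_dl_command("https://www.youtube.com/user/EXAMPLE")
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be handed to subprocess.run(cmd) once the URL is replaced with a real channel.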
15:24 πŸ”— norbert79 touya: Yes, I am :)
15:29 πŸ”— joepie93 TIL: Netscape 6 supports border-radius
18:25 πŸ”— phillipsj I never got around to building an Emacs OS... with a decent text editor (VIM)
18:35 πŸ”— ersi The raging fury association of freetards that like to battle their choice of editors called and invited you
18:35 πŸ”— ersi (Editor choice wars are so incredibly boring)
18:36 πŸ”— joepie91 ANNOUNCEMENT: Join #angerthehyve for the new Hyves archiving project! Works with the Warrior! [/advertising-voice]
18:52 πŸ”— w0rp I need to get a computer case which is better designed for getting disks in and out. It took me something like an hour to replace a hard disk, configuration included.
18:53 πŸ”— w0rp Part of the problem being that my GPU is now so huge it barely fits in the case.
19:00 πŸ”— M1das i have the perfect case for you
19:00 πŸ”— M1das easy access to all your drives
19:00 πŸ”— M1das http://hacktheplanet.nl/img/srv06.jpg
19:09 πŸ”— w0rp That's a pretty serious case.
19:41 πŸ”— godane this may burn eyes: https://scontent-a-lga.xx.fbcdn.net/hphotos-prn2/p480x480/1375730_499611060137001_1516016417_n.jpg
19:50 πŸ”— w0rp I have seen that image too many times.
20:00 πŸ”— odie5533 I can't tell if it's weirder that you've seen it before, or weirder that I've never seen it before.
20:03 πŸ”— w0rp I've looked at too much 4chan, so I've seen most of the semi-popular weird shit.
20:16 πŸ”— ivan` "* We generated blog archives for every Xangan who has signed into the site in the past 5 years, as long as they have more than two subscribers (to rule out spam)." http://xanga.com/
20:27 πŸ”— joepie91 ivan`: only accessible to the users themselves
21:14 πŸ”— Lord_Nigh after this: http://www.gluster.org/2013/08/how-far-the-once-mighty-sourceforge-has-fallen/ i'm wondering whether sf.net is gonna die soon...
21:25 πŸ”— M1das Lord_Nigh: well, after the fuckup with the SVN migration most already left it..
21:26 πŸ”— Lord_Nigh i noticed.
21:26 πŸ”— Lord_Nigh worth AT trying to save what's left?
21:29 πŸ”— dashcloud Lord_Nigh: I think you were the one who asked about VHS copies of Ren & Stimpy episodes right?
21:31 πŸ”— joepie91 Lord_Nigh: of course!
21:33 πŸ”— Lord_Nigh yes i did
21:34 πŸ”— Lord_Nigh the first few airings of each episode were supposedly uncut, then the censors started deleting random parts, and even more got cut when commercial breaks got longer
21:34 πŸ”— deathy sf.. ah..the stupid installer thing... if you're careful you can still get correct/no-crapware versions. and depends on the project
21:34 πŸ”— Lord_Nigh yes, apparently filezilla does have a spyware-free link but its buried
21:36 πŸ”— dashcloud so, I do have a couple of episodes- they're part of the two Snick tapes I have (4 shows on one tape)
21:38 πŸ”— dashcloud there's an episode of four different shows on the tape, roughly what you might see if you were watching Snick live
21:38 πŸ”— antomatic Not sure why Ren & Stimpy needs archiving? The uncut episodes were released on DVD a while ago, no?
21:39 πŸ”— antomatic http://www.amazon.com/dp/B0002NY8XA
21:41 πŸ”— antomatic Huh - maybe not. Still edits on 4 eps.
21:42 πŸ”— antomatic Strange - most of those 'edited' scenes are still aired here in the UK, I'm certain.
22:11 πŸ”— godane i think Jason Scott should go to Japan at some point
22:12 πŸ”— SketchCow Eventually.
22:12 πŸ”— godane i'm watching a Japanology about the used book trade over there
22:13 πŸ”— godane one guy just collects books with notes in them
22:24 πŸ”— godane wow
22:24 πŸ”— godane you have been working on the arcade doc since 2006
22:26 πŸ”— godane i'm watching the wheel of computer history from Phreaknic X
22:34 πŸ”— SketchCow Less stalking, more talking
22:34 πŸ”— SketchCow I'm being told some of the archiveteam infrastructure is broken, and it's time to fix it.
22:52 πŸ”— dashcloud not necessarily broken, but creating warrior projects is heavily bottlenecked because there are very few people who seem to be able to create the pipeline.py scripts
22:52 πŸ”— joepie91 well...
22:52 πŸ”— joepie91 urlteam is broken afaik
22:52 πŸ”— joepie91 pipeline/seesaw is nearly undocumented
22:52 πŸ”— joepie91 (and there's a bunch of bugs and things that need to be fixed/changed)
22:53 πŸ”— joepie91 there's also a delay in adding stuff to warrior projects that should probably be looked at (not enough people or not the right people have access to change this?)
22:53 πŸ”— joepie91 and there appears to generally not be enough people with time to write pipeline scripts
22:53 πŸ”— joepie91 that's what I can think of off the top of my head
22:53 πŸ”— joepie91 (I have a list with things to improve/fix somewhere, moment)
22:55 πŸ”— joepie91 http://sprunge.us/TIiX
22:55 πŸ”— joepie91 preliminary notes
22:55 πŸ”— joepie91 collected from user feedback during isohunt-grab
22:55 πŸ”— godane http://www.theverge.com/2013/11/9/5084630/three-20-year-olds-build-their-own-version-of-healthcare-gov
22:57 πŸ”— joepie91 the unicode bug in the tracker is absolutely critical
22:57 πŸ”— joepie91 it is freely (and possibly accidentally) exploitable
22:58 πŸ”— joepie91 one person with a unicode nick and the claims page is dead
22:58 πŸ”— joepie91 entirely
22:58 πŸ”— joepie91 previously, also the entire public tracker dashboard itself, but I believe that was duct-tape
22:58 πŸ”— joepie91 duct-taped *
23:15 πŸ”— w0rp This is why the world needs to move to Python 3, which shouts at you if you try to mix str and bytes. Because Python 2 strings aren't really strings.
23:24 πŸ”— ivan` there's a cool ascii_with_complaints codec for Python 2 that complains about implicit conversions
23:24 πŸ”— ivan` also "the world needs to move to Python 3" is like saying "people should spend tens of millions of dollars against their interests because a new incompatible language of the same name appeared"
23:25 πŸ”— ivan` at least, that's what the people with big Python 2 codebases hear
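A small sketch of the behaviour w0rp refers to, assuming Python 3. The nickname is invented and this is not the tracker's actual code. It only shows that mixing str and bytes fails loudly, and that forcing a non-ASCII nick through ASCII fails too.

```python
nick = "jos\u00e9"          # str containing one non-ASCII code point
raw = nick.encode("utf-8")  # the same nick as bytes, e.g. as read from a socket

mixing_error = None
try:
    combined = "nick: " + raw  # Python 3 refuses to guess an encoding here
except TypeError as err:
    mixing_error = err

# The explicit version: decode at the boundary, then work with str only.
explicit = "nick: " + raw.decode("utf-8")

ascii_error = None
try:
    nick.encode("ascii")  # an ASCII-only code path cannot represent this nick
except UnicodeEncodeError as err:
    ascii_error = err

print(type(mixing_error).__name__, type(ascii_error).__name__, explicit)
```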
23:25 πŸ”— w0rp I don't get why Python uses either UCS2 or UCS4 internally. That sucks.
23:25 πŸ”— ivan` you want UCS4 only?
23:26 πŸ”— w0rp Why not UTF-16?
23:27 πŸ”— ivan` internal representations shouldn't really matter, e.g. Python 3.3 uses 8 bits internally if a str fits into ASCII
23:27 πŸ”— ivan` you really just want all the functions to support out-of-BMP codepoints properly (e.g. count them as one codepoint, not two)
23:28 πŸ”— w0rp Yeah, as long as characters are supported. I just assume that on some platforms it breaks without different compilation options.
23:30 πŸ”— ivan` not-so-amusingly when you do this right on the server, the server and client (JavaScript) disagree on the length of things
23:32 πŸ”— ivan` unless you write your own string code, heh http://stackoverflow.com/questions/6885879/javascript-and-string-manipulation-w-utf-16-surrogate-pairs
23:33 πŸ”— w0rp I was just looking at the similar thing.
23:34 πŸ”— w0rp I suppose you could have a sizeForEncoding function in JavaScript.
23:35 πŸ”— odie5533 I'll switch to Python 3 as soon as Twisted releases a Python 3 compatible version of their library.
23:36 πŸ”— w0rp An answer below does show an example of a string that would report a wrong character length in JavaScript.
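A minimal sketch of the length disagreement discussed above, assuming Python 3.3 or later. size_for_encoding is a hypothetical helper named after w0rp's sizeForEncoding idea, and the sample string is made up.

```python
def utf16_code_units(text: str) -> int:
    """Count UTF-16 code units, which is what JavaScript's .length reports."""
    return len(text.encode("utf-16-le")) // 2


def size_for_encoding(text: str, encoding: str) -> int:
    """Return how many bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))


sample = "a\U0001F600"  # 'a' plus one code point outside the BMP

code_points = len(sample)                      # Python counts code points
code_units = utf16_code_units(sample)          # a surrogate pair counts twice
utf8_bytes = size_for_encoding(sample, "utf-8")

print(code_points, code_units, utf8_bytes)  # prints "2 3 5"
```

A server validating a length limit in code points and a JavaScript client checking .length would disagree on this string by one.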
23:36 πŸ”— joepie91 I'll switch to Python 3 as soon as I can mindlessly install a Python module without worrying about versions
23:36 πŸ”— odie5533 joepie91: Use virtualenv.
23:37 πŸ”— joepie91 odie5533: that doesn't solve the issue
23:37 πŸ”— odie5533 how not?
23:37 πŸ”— joepie91 there are too many things just not vailable for 3
23:37 πŸ”— joepie91 available *
23:37 πŸ”— joepie91 that is what I mean
23:37 πŸ”— odie5533 oh.
23:37 πŸ”— w0rp There's a cool website watching projects for Python 3 support. https://python3wos.appspot.com/
23:37 πŸ”— odie5533 well, yeah. The main thing for me is Twisted at this point.
23:37 πŸ”— w0rp MySQL is another big one.
23:38 πŸ”— odie5533 there's no python 3 connector for mysql?
23:38 πŸ”— w0rp I'm not sure. The main one doesn't work.
23:38 πŸ”— odie5533 w0rp: there are. pymsql for instance works with Python 3.
23:39 πŸ”— w0rp (Which is one of the reasons why I use Postgres at home.)
23:39 πŸ”— odie5533 *mymysql
23:39 πŸ”— odie5533 py... you know what I mean.
23:39 πŸ”— w0rp I can build a Django site at home with Python 3, so I'm happy.
23:40 πŸ”— odie5533 just google python 3 mysql and you get at least 3 different projects out there that connect MySQL and Python 3.
23:41 πŸ”— M1das maybe a good idea to put this all on a collaboration so everyone has access to see what needs to be changed? that way it's possible to streamline it a bit more in the future :)
23:41 πŸ”— odie5533 hmm, the maintainers of Twisted apparently aren't even working on Python 3 support :(
23:42 πŸ”— w0rp I *think* some asynchronous stuff is becoming part of the language itself, so Twisted might be replaced by something else in the far future.
23:42 πŸ”— w0rp I think this even involved collaboration with Twisted people.
23:42 πŸ”— odie5533 w0rp: the new Tulip library doesn't have nearly as many features as Twisted.
23:42 πŸ”— joepie91 odie5533: oursql is what you'll want to use for mysql
23:42 πŸ”— joepie91 which works in py 2 AND 3
23:42 πŸ”— joepie91 and actually does real parameterization
23:42 πŸ”— joepie91 and is production-ready
23:42 πŸ”— ivan` Canonical paid for a lot of Python 3-ization for Twisted, it used to be completely nonexistent
23:43 πŸ”— w0rp Well, nothing new ever does match the previous feature set of something else.
23:43 πŸ”— joepie91 cc w0rp
23:43 πŸ”— odie5533 w0rp: it will include an async reactor, which Twisted can use, but it won't have an SSH, IRC, XMPP, etc. client.
23:43 πŸ”— joepie91 http://pythonhosted.org/oursql/
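To illustrate what parameterization means in general, here is a sketch that uses the standard-library sqlite3 module as a stand-in. It is not oursql and not MySQL, and the table and nick are invented. The value is passed separately from the SQL text, so it is never spliced into the statement.

```python
import sqlite3

# In-memory database with a hypothetical table, just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (nick TEXT, item TEXT)")

# A value that would break a query built by string concatenation.
hostile_nick = "x'); DROP TABLE claims; --"

# The ? placeholders are bound by the driver; the value is treated as data only.
conn.execute(
    "INSERT INTO claims (nick, item) VALUES (?, ?)",
    (hostile_nick, "item-1"),
)

rows = conn.execute(
    "SELECT nick, item FROM claims WHERE nick = ?",
    (hostile_nick,),
).fetchall()
print(rows)
```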
23:43 πŸ”— odie5533 w0rp: well, I don't think they are even planning to support half the protocols that Twisted does.
23:44 πŸ”— w0rp I can foresee another third party library, or a bunch of libraries, built with Tulip somehow.
23:44 πŸ”— odie5533 I'm surprised Twisted was so low on that pypi python 3 list.
23:45 πŸ”— odie5533 w0rp: probably going to be the bunch of libraries. Which is part of why I like Twisted: it has everything I could possibly need.
23:45 πŸ”— odie5533 well, almost
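Tulip later shipped in the standard library as asyncio. A minimal sketch of the event-loop style being discussed, assuming Python 3.7 or later for asyncio.run. The fetch coroutine and its delays are made up and only stand in for network I/O. As odie5533 says, this covers the reactor part only, not the protocol clients Twisted bundles.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    """Pretend to do network I/O by sleeping without blocking the loop."""
    await asyncio.sleep(delay)
    return name + " done"


async def main() -> list:
    # Both coroutines run concurrently; gather returns results in argument order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
print(results)
```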
23:46 πŸ”— w0rp I'm surprised that gunicorn has more downloads than Twisted.
23:47 πŸ”— odie5533 I think perhaps that most people don't get Twisted from PyPi.
23:48 πŸ”— w0rp Windows .msi users?
23:48 πŸ”— odie5533 or rather, that a lot of people don't get it from PyPi, whereas perhaps most people get some of the libraries more from there. I could be completely wrong though.
23:48 πŸ”— odie5533 But personally, I use the Windows binaries from the Twisted site.
23:48 πŸ”— w0rp pip on Windows does kind of suck if you don't have a compiler installed properly.
23:48 πŸ”— odie5533 sucks even if you do..
23:49 πŸ”— w0rp pip on Linux and Mac is pretty great.
23:49 πŸ”— odie5533 You'd need to compile multiple dependencies before anything would work.
