[03:29] i'm back
[03:30] scanned the fronts of cds i got
[03:31] and envelope
[04:37] What Windows IRC app is good to use with a flaky connection (i.e. auto-reconnect)?
[04:42] I find HexChat is alright
[04:54] I try to stay away from viewing anything on google+ but had to because it an article comes so highly recommended. I notice google still wastes all kinds of screen real estate pimping every thing else they make
[04:56] thanks
[06:06] http://www.flickr.com/groups/arcades/pool/with/5673440377/ is rewarding to go through.
[06:10] SketchCow: i got my cds today
[06:10] i have a 3gb demo of world of war craft
[06:14] Excellent.
[08:03] "all kinds of screen estate" being a 10px bar at the top?
[08:11] https://gist.github.com/anonymous/5856797
[08:11] anyone with access to any mailing lists or people that might be interested in Google Reader, please pass along a message about the ArchiveTeam effort. feel free to use any/all of the message in that github gist
[08:18] "Taylor Swift Is The Best Idol"
[08:18] Keep thinking that, twitter
[08:18] Keep thinking that..
[09:06] :o pouet is still going
[09:06] -rw-r--r-- 1 tim.bowers games 158G Jun 25 10:06 ./bin/ign/storage/pouet/pouet.net_06052013.warc
[09:11] Why are you grabbing Pouet? Is it owned by ign? :o
[09:21] ersi it says ign the line :)
[09:27] because it's going somewhere appently o something
[09:36] BlueMax: Well, gee - guess where I freggin' got IGN from? Yes, I'm asking because I don't know if that's supposed to be there or not in the path and I did not know IGN owned/operated pouet.net
[09:36] :P
[09:40] lol
[09:40] no, it's nothing to do with IGN
[09:40] I just happened to mount the external hdd in the ign dir at the time as I was doing lots of grabs there
[09:42] ADD Moment: i found out that underground gamer lost there data
[09:42] so the original stuff on there is not coming back
[09:42] but a new UG may happen it looks like
[09:42] but not with the old admins
[09:47] Smiley: That was my hunch and why I asked :-)
[09:47]
[09:47] * Smiley wibbles
[09:48] anyone here use digital ocean?
[09:48] sure
[09:48] can I turn my droplet into a image?
[09:48] I want to run the same droplet multiple times now.
[09:48] hmm
[09:48] I don't know, I've just spun up a few basic ones. I'd ask in #digitalocean
[09:49] can't remember which network, but I guess Freenode
[09:49] we have a winner ;)
[09:52] \o/
[09:56] Is there a version of wget-warc precompiled for OS X? I friend wants to do a simple scrape save of this site: http://db-in.com/blog/ Can anyone offer me advice?
[09:58] You don't need ""wget-warc"". You just need wget. The 1.14 version specifically. That should help you narrow down to find a binary I guess..
[09:58] wget-1.14 has WARC-support built-in
[10:01] Thank you. I just read this on the archiveteam site, http://www.archiveteam.org/index.php?title=Software, "Backing up a Wordpress site: "wget --no-parent --no-clobber --html-extension --recursive --convert-links --page-requisites --user= --password= "
[10:02] wow, I helped (I put that there yesterday ;))
[10:07] lol
[11:28] turnkit: you want WARC if you're ingesting into IA (and archive team uses it for that reason)
[11:28] if you just want a readable dump of a site, you don't
[11:36] Thanks Smiley :)
[11:36] thanks all
[12:18] turnkit: you may wish to consider Nettalk also
[12:18] (as client)
[12:18] it auto-reconnects and auto-rejoins channels
[12:18] (if hexchat doesn't work nice enough)
[13:07] Smiley, I just loaded google plus again. There is the the list of services 10px high, then the search bar with a button to login, then a whole other row just for a giant button in the middle saying 'Join Google+' then a fucking other fly out menu below that before I even get to content
[13:08] and another join google+ at the end of the article
[13:08] all this for a blog post that does not need any of that crap
[13:09] also as you scroll down you get a sticky menu at the top with the center join google+ button and the flyout menu below so I can enjoy having that crap in my face all the time
[13:17] omf_: ah meh
[13:17] i'm logged in, i dont get that crap :D
[13:21] I have been working to move off of google
[13:21] They can have my search traffic but nothing else
[13:27] fair enough
[16:52] pictures of sploded house in Dordrecht (WTFPL/CC0): http://imgur.com/a/OHNxE
[16:57] what happened to it?
[16:58] godane: 2.3G www.abandonia.com.warc.gz done, can upload later
[16:59] ersi: gas explosion of some sort, details unclear
[17:00] gypsys stealing gas?
[17:00] but
[17:00] the most obvious explanation would be a gas leak
[17:01] it happened while the guy was showering
[17:01] so it's likely to have been an issue with the boiler
[17:01] or whatever the English term for it is
[17:01] boiler works
[17:03] water heater works as well I guess
[17:04] https://plus.google.com/u/0/107105551313411539773/posts/bdmtFBGM3cs
[17:04] Thats a stolen gas blown up house
[17:04] gas started leaking at loads of random points in the area due to the blast
[17:04] my brother in law had the gas board checking inside his house as he could smell gas
[17:45] I find this article and the follow up comments which provide some interesting historical insight show why RSS was kinda fucked from the beginning http://www.jwz.org/blog/2013/06/google-reader-apocalypse-extremely-fucking-nigh/
[17:46] The basis for that idea is that the new way of doing things has worse problems than the nntp way
[17:54] I just added two more technical q & a to http://pad.archivingyoursh.it/p/faqs what other questions do we see keep popping up
[18:06] the overall tone is kind of dismissive
[18:06] "Just staring at it are not going to make it go faster." really, man?
[18:07] ivan: was suprise it was only 2.3gb
[18:29] hey uh, if i edit something on etherpad does it immediately save
[18:38] no you have to save it
[18:38] it does stay as long as the pad server is up without saving
[18:38] okay
[18:40] winr4r, People complaining about shit being slow happens dozens of times per project and they always mention how they have been watching it or monitoring it. That said this is pad not a wiki page I am using it to work through ideas to make it better. This is a first draft, not a final draft.
[18:41] answer " MOARRRRR PIPES!"
[18:41] yes Smiley that is a good answer :)
[18:41] I think any good FAQ contains some humour, thats all
[18:41] Such as "That isn't a question, but...."
[18:42] omf_: i agree that answering the question is a good idea
[18:42] insulting the person reading it is not
[18:42] and yes, it's a draft, that's why i offered suggestions
[18:42] :)
[18:42] out of smokes, off to the shop, brb
[18:43] I fixed it by adding a ;)
[18:43] hmmmm incense..... PC..... incense... PC.....
[18:44] Smiley, I added your q/a
[18:44] I remember you asking that in relating to traffic shaping before
[18:45] oh for virtualbox?
[18:51] omf_: mind if i put some links for the last q on you pad?
[18:52] do it
[18:54] added opera and chrome
[18:54] sweet
[18:58] now to file wget bug
[19:01] woohoo, that was close
[19:01] ran all the way there, forgot to take my inhaler with me, came home verrrry slowly
[19:03] D:
[19:04] 28 of 410GB done XD
[19:05] Smiley: crikey
[19:06] winr4r: not a upload this time ;)
[19:06] I'm moving a copy of one hdd to another one :D
[19:06] Smiley: oh!
[19:06] i was gonna say
[19:06] you gon' be there a while
[19:06] I do have a large warc...
[19:06] tim.bowers@timDesktop ~ $ ls -lah ./bin/ign/storage/pouet/pouet.net_06052013.*
[19:07] -rw-r--r-- 1 tim.bowers games 840M Jun 25 20:06 ./bin/ign/storage/pouet/pouet.net_06052013.cdx
[19:07] -rw-r--r-- 1 tim.bowers games 162G Jun 25 20:06 ./bin/ign/storage/pouet/pouet.net_06052013.warc
[19:09] i'm glad to live in an era where we can be like "oh, it's 162 gigs" rather than "oh shit, it's 162 gigs"
[19:10] :DD
[19:10] Nod
[19:10] I'm glad I have a job where I can go "2 tb of downloads? hahaha cool I can do that :D"
[19:12] Smiley: please tell me you did reject ftp.scene.org :P
[19:12] Schbirid: removed them from the list as ftp don't work :P
[19:12] wait
[19:13] for pouet? no, but i'ts limited on the domain :P
[19:13] ah nice
[19:13] that's huge :p
[19:14] err , ":o"
[19:14] 150K .......... ........ 100% 11.2M=0.1s
[19:14] 2013-06-25 20:13:59 (1.34 MB/s) - 'pouet.net/prodlist.php?year=2011&reverse=1&order=release&page=747' saved [172949/172949]
[19:14] still just doing the domain itself.
[19:15] Smiley: dude, i would be fiirrred if i did that at the office
[19:15] winr4r: hahaha
[19:15] for some absurd reason they have DSL with a 5gb monthly cap
[19:15] If they fired me, the company would partly fall apart
[19:15] LOL
[19:15] FIVE
[19:15] 500Mbit over 1Gb carrier.
[19:15] Smiley: woohoo!
[19:15] Though I'm limited to 100 in the office due to the fact it's 100mbit switch XD
[19:16] I could go onto one of the servers and do it, but I haven o way of easily limiting myself in emergancies, so I don't
[19:16] It has been running for _weeks_
[19:16] I was off work for 5 weeks...... and I think I started it before then :D
[19:16] http://i.imgur.com/8Ni5n8Y.gif
[19:17] Schbirid: is that an infomercial?
[19:17] heh
[19:17] yeah
[19:17] i would LOVE to see their solution for that
[19:17] big biker outside my house.
[19:17] outlaws...
[19:17] fromt he awesome r/wheredidthesodago
[19:17] because i love how infomercials have really shitty solutions to problems you never knew you had
[19:19] bai bai big scary biker.
[19:22] looks like i have maximum cd 2004-09
[19:22] i didn't know i had it
[19:23] cool
[19:29] godane: woohoo :)
[19:29] some one should archive this: http://www.lostlevels.org/
[19:30] i think i may have do this now
[19:31] i was looking at the links it and there broken sort of
[19:31] a link like this will not work: http://www.lostlevels.org/wordpress/?m=200510
[19:31] but this does: http://www.lostlevels.org/200510/
[19:35] wordp[ress?
[19:35] see the blog
[19:36] theres is a how to archive wordpress on there.
[19:36] the blog?
[19:36] errr wiki :D
[19:36] not on [[WordPress]]
[19:39] i think it' under warc
[19:40] http://archiveteam.org/index.php?title=Wget
[19:40] it was there :)
[19:40] http://www.archiveteam.org/index.php?title=Software
[19:40] xD
[19:41] starting to upload the rare dvd rom of maximum pc
[19:41] it has world of warcraft on it
[19:42] *world of warcraft demo on it
[19:42] godane: you upload a terrifying amount of stuff
[19:43] i know
[19:44] godane: a good thing of course :D
[19:46] i'm going to be getting another set of maximum cds
[19:46] somewhere between 1999 and 2001 i think
[19:48] where are you finding them?
[19:48] ebay
[19:49] ah :)
[20:02] http://www.kickstarter.com/projects/1748556728/the-untold-history-of-japanese-game-developers/posts/514403
[20:03] hi Ravenloft
[20:05] hm, dat mould :<
[20:05] (or mold? i'm not sure if it's "mould" even in british english)
[20:06] looking at the pattern of it, though, it looks like a lot like the mould you get on camera lenses if they're left in the damp
[20:06] it's going to be from a finger print I think
[20:06] theres nothing on a magnetic disk to feed on
[20:06] same way they found those ants eating cables because they were slightly oily from where someone had handled them
[20:06] Smiley sounds removable then
[20:06] and that shit *eats into the coatings on glass*
[20:07] winr4r hi
[20:07] coatings on glass?
[20:07] Smiley: yes
[20:07] sourcE?
[20:07] on lenses, you can remove the mould, but it actually eats into the coatings and so even when the mould has gone you still end up with damaged coating
[20:08] soultcer: that glass has coating, or that mould eats into it?
[20:08] shoops, i meant Smiley*
[20:08] s
[20:09] okay, i officially can't type tonight
[20:09] both really :D
[20:09] Smiley: lenses with coatings go back way more than half a century so no source for you on that
[20:10] Ok, just on the mold eating it plz
[20:10] http://www.truetex.com/lens_fungus.htm
[20:10] sounds fun
[20:15] you were right that it was probably from a fingerprint, but when it gets on there the stuff just keeps eating
[20:19] what i was getting at, is that i'd be very very surprised if anything is recoverable after removing the fungus
[20:20] but none of this should be read as "you shouldn't try"
[20:23] generally for lenses the procedure for dealing with fungus involves the bin, unless it's some particularly special glass. but if you're going to use one, best to do it on a sacrificial camera and keep them away from clean lenses in case spores spread
[20:23] seems to me a similar approach would be useful with that diskette...
[20:23] Baljem: bit late for that
[20:24] yeah
[20:25] but I meant, if you found that somewhere and wanted to try reading it anyway, keep it the hell away from any kit or diskettes you care about ;)
[20:25] yes
[20:25] good point
[20:26] Baljem: and when it comes to glass, once the fungus is gone, you'll find that lenses are pretty usable if you try hard to avoid flare, since it usually only gets the coatings
[20:26] and on simple lenses, even if it gets the glass itself, that's not so bad
[20:27] mm, yes, although getting the fungus off it might be more effort than makes sense unless the glass is something special
[20:27] see also: every photograph made before about the 1950s, which usually had uncoated lens
[20:28] e.g. a 50mm f/1.8 for a common mount, probably just bin the bugger and pick up another for £5
[20:28] +es
[20:28] night all
[20:28] night Smiley :)
[20:29] I've got an apparently-very-nice 70-210 sitting in the garage that I was looking forward to using when I 'inherited' it - except it's now totally opaque from fungus - I hate to think how long it'd take to clean :(
[20:29] night Smiley
[20:29] Baljem: what mount?
[20:29] Canon FD
[20:29] come to think of it, I have an actual New FD 70-210 in my collection anyway, so it's no great loss. but apparently this one was one of the good Vivitar Series 1s
[20:29] Baljem: hah, funnily enough i bought a T50 with a 70-210mm as well, the 70-210 was fungused beyond belief
[20:30] if i didn't already own a whole bunch of other FD stuff i'd be upset with that given that i overpaid
[20:30] Canon 70-210s are ten-a-penny, mine was free with a New F-1 ;)
[20:30] ouch, overpaid for a T50? :S
[20:30] I think I paid £25 for mine, although that was with a 50/1.8 so meh
[20:30] yeah, i got a T50, 70-210mm, 35-70mm, and a bag
[20:30] what happened was
[20:31] i took my chances once
[20:31] I definitely overpaid for my T90 though, seeing as I actually won two auctions and the chap couldn't find the second one and I completely forgot to sort out a refund :-/
[20:31] i was in a pub with a guy and he said he had a "canon 50" that his dad had with two lenses
[20:31] so i figured: either a canon EOS 50 with two lenses, or a T50 with two lenses
[20:31] but i committed to it on the spot, taking the risk
[20:32] i ended up with the more worthless of the two!
[20:32] oh dude, i had a T90, i LOVED that thing
[20:32] heh - well, it was worth a punt anyway ;)
[20:32] yeah, the T90 is nice
[20:32] i did that thing where i tried to rationalise my camera collection
[20:33] out of 15 cameras i could only think of one to sell, which was the T90, because i already had an A-1
[20:33] mine has a weird fault - everything works fine except the multi-spot metering, yet there's nothing wrong with the actual button (I checked for continuity on the flex board under the palm door)
[20:33] Baljem: hah, i never figured out how to work the multi-spot metering
[20:34] but damn, the T90 was a beast, i LOVED that thing
[20:34] I love the A-1 except I find it difficult to use nowadays - I can barely read the red viewfinder display. the T90's is a much more useful colour (I struggle with red in general)
[20:34] Baljem: ah :)
[20:34] still, got two A-1s in the collection. mine and one that came with five other A-series cameras ;)
[20:34] the A-1 is my favourite camera of all time because of the simple viewfinder display
[20:34] haha, nice :)
[20:35] i loved how it was just aperture and shutter speed
[20:35] the T90 had more shit in it as i recall, so i liked it less
[20:35] don't forget EEE EEEE if you mess up with the stop-down lever ;)
[20:36] Baljem: hahaha yup!
[20:36] you guys have any of the newer cameras?
[20:36] dude
[20:36] i put it on my site telling them how to fix the EEE EEEE error
[20:36] I need to choose a body to try my latest lens out with -- FD 20mm f/2.8 -- should be fun
[20:36] and i think i've gotten like five emails thanking it for me :)
[20:36] goodness knows how many other people that *didn't* email me to thank me
[20:36] omf_: my most recent is an EOS M compact system camera, and a EOS 50D for proper stuff
[20:37] and then 40+ film bodies for fun
[20:37] I got rid of all my old cameras, just taking up too much space. I keep two now.
[20:37] A Canon rebel t2i for joking around and a 5D Mark 2 when I am getting paid
[20:37] nice
[20:38] I couldn't justify the expense of a 5D although it's bloody tempting
[20:38] It is fucking awesome
[20:38] you can install gpl firmware and do crazy shit
[20:38] I'm tempted to try the Magic Lantern firmware on my 50D. apparently it now enables video, which makes sense... it came out about a month before the 5D Mk II
[20:39] i'm torn between another D2H and a pentax K-x
[20:39] i'm cheap, bitches
[20:39] nothing wrong with cheap :D
[20:39] (i killed my D2H on 401,000 shutter actuations, got a D1 from a reader of my site, killed that)
[20:39] nothing wrong with wanting appropriate value for your money
[20:40] I'm currently pondering the wisdom of a £35 set of 24 Cokin P-style filters, adapter rings and holder vs. paying proper money to buy the real thing
[20:40] Now I am looking into expanding my lens collection
[20:40] I had a lot of them for my film nikon. I stuck with that for years before jumping to the 5dmk2
[20:41] way I look at it, if I want to cover all my lenses I need 7 sizes of adapter ring. that's 70 quid for Cokin or about 25 for Kood from a local supplier. then add the holder and the two or three filters I actually want... hrm.
[20:41] omf_: hey, they'll mount on an adapter to your 5d2! :D
[20:41] nikon mounting on a canon? When did that get worked out
[20:42] omf_: the flange-to-focal-plane distance is shorter on canon than nikon
[20:42] short enough that you can get an adapter
[20:42] as long as it's not a G lens
[20:42] the Chinese turn out adapter rings for all sorts of lenses these days. just have to make sure the registration distance is different in the right direction
[20:42] you get stopped-down metering
[20:42] yeah, what winr4r said ;)
[20:43] that's the downside of FD glass - FD has one of the shallowest registration distances of all SLR mounts so basically only mirrorless systems are compatible without introducing extra optics :/
[20:43] Baljem: on the other hand, *everything* is compatible with FD :D
[20:43] even M42!
[20:44] yep :D
[20:44] I just love how advanced cameras have gotten in the digital age. I hated fucking changing rolls of film
[20:44] canon made a whole bunch of adapters!
[20:44] I still have boxes of photos left to scan in
[20:44] they never made one for the Canon 7's special bayonet mount though
[20:44] omf_: changing film is the best part!
[20:44] so the 50mm f/0.95 is out of bounds... unless you mangle it with a new mount, as many people do
[20:44] but mine will remain unmolested otherwise I couldn't use it on the 7 any more, doh
[20:45] Baljem: oh god, don't tell me people actually mangle one of those lenses
[20:45] that's like butchering a ferrari so you can put a jet engine in it
[20:45] i mean i kind of see the logic behind wanting to do so, but nooooo
[20:46] yeah. I mean, it's not like there aren't better options like the Noctilux
[20:46] which is apparently sharper. but hell, if you're worried about sharpness at f/0.95 you may be doing something wrong!
[20:46] buh, if you want clean shots in no light without a tripod use a digital camera
[20:47] http://lewiscollard.com/img/class365.jpg
[20:48] as i seem to have a couple of film folks in the house :)
[20:49] winr4r: love film and digital
[20:49] Tephra: same :)
[20:50] nice - you local(ish) then, or did you have to venture far to catch the train? ;)
[20:50] Baljem: local to that shot?
[20:50] yeah
[20:50] about 7 miles south
[20:50] next stop on the way :)
[20:51] i had my A-1 with with me, and a tripod, from shooting a bunch of other stuff that day
[20:51] 15 seconds at f/9.5 as i recall, on kodak ektar 100
[20:52] I always wondered why so many hobbyist prosumers just take pictures of flowers
[20:52] I hadn't thought about it before until I saw a recent post making fun of it
[20:53] omf_: i always wondered why people actually care about what other people take photos of
[20:53] omf_: in my case, because there are a lot of them around and the buggers don't move ;)
[20:54] I mean I understand earth porn
[20:54] I need to practice portraiture as I suck at it, but nobody wants to sit for me. bah.
[20:54] omf_: if you want to see how much sleep i have lost over that, here it is:
[20:54] okay, there
[20:54] and I /really/ need to start planning places to go to take pictures. my back garden is getting a tired of being used for test shots ;)
[20:55] if you're actually bothered by what other people take photos of, you're spending too much time thinking about that and not taking pictures imo
[20:55] heh
[20:56] it's kind of like a race car driver going around and being all like "wow, these people driving on motorways at 70mph, that's kind of weak"
[20:56] winr4r, did I say I was bothered by it? No. Stop reading into things that are not there
[20:57] "I always wondered why so many hobbyist prosumers just take pictures of flowers"
[20:57] yeah wondered != bothered
[20:57] wondered = interested, I want to the know the why
[20:57] omf_: as a geologist this is earth porn: http://letslearngeology.com/website/wp-content/uploads/2013/01/folds-pennsylvannia.jpg
[20:57] the obvious answer is that they like flowers
[20:58] Tephra, instant boner
[20:58] Tephra: where is that?
[20:58] yes rock hard..... ;)
[20:58] epitron: I'm just saying that it's impossible to know what will be "good to have" in the future. Better save/archive too much than too little.
[20:58] epitron: One can always change ones mind about data one has saved. You can't do that with lost data.
[20:59] winr4r: just random fold somewhere on google images, pensylvania the image says
[21:00] ersi: if we keep doing that, we'll have so much information that we'll never be able to find anything :)
[21:00] epitron: I don't see a problem with being a hoarder.
[21:00] hahah
[21:00] okay, there's where we differ
[21:00] epitron: Besides, there's very few hoarders who contribute time to doing what we do here.
[21:00] We don't hoard shit into a silo
[21:01] yeah the silo is for storing the bodies of our enemies
[21:01] I don't think any of the ArchiveTeam people hoard, because of the sake of hoarding. Even if I immediately thought of godane when I wrote that..
[21:01] haha
[21:01] we turn them into mummies and then burn the mummies for fuel
[21:01] omf_: Don't forget about the raped servers
[21:02] and bandwidth bills of our enemies
[21:02] i agree with a lot of the archiveteam's values
[21:02] i like the idea of rescuing people's personal blogs
[21:02] my major concern was triage
[21:03] Well, that's not a primary concern. There's plenty of time to worry about that
[21:03] i think it's kinda pointless to archive tumblr -- especially without archiving the images -- when tumblr isn't going to shut down, and there are many other long-tail RSS feeds that will disappear
[21:03] triage is actaully kinda the definition of primary concern ;)
[21:03] But yes, I think about that a lot as well. And I hope someone will triage and polish our sometimes stupidly large sets of data to something nice
[21:04] epitron: How do you know they're not shutting down any time soon?
[21:04] they're not shutting down before google reader
[21:04] No, that seems unlikely. I didn't say before Reader is cancelled though.
[21:04] tumblr images stay up even after tumblr blogs are deleted
[21:05] ersi: i'm just talking about triaging what's being scraped from greader
[21:05] epitron: Oh, well say that then. :D
[21:05] ivan`: okay, so people who deleted their blogs by accident will find value in those feeds
[21:05] epitron: I'm speaking about the general data
[21:05] that's probably a very very very small number :)
[21:06] i'd actually look at those deleted feeds to see if they have any value
[21:06] As in all kind of data hoarding projects we do
[21:06] ersi: hey, if you guys want to store everything ever, go nuts :)
[21:06] i think it would be nice to find the valuable things and treat those specially
[21:06] epitron, we don't question the value of what we save because we are too busy trying to save shit. The value decision comes later, if you disagree you can always just leave
[21:07] we are not curators
[21:07] I'm just saying that it's likely there's very little short time value. But maybe there's long time value. We don't know.
[21:07] epitron: these goals are not mutually exclusive
[21:07] We're hoarders, bandwidth vikings and barbarians
[21:07] hahah
[21:07] Grape and pillage
[21:08] Grep and pillage
[21:08] epitron: the problem is to determine value isn't?
[21:08] this is how hardcore we are - http://imgur.com/gallery/h04og
[21:08] jason scott in viking hat, sword in the air, "ARCHIVE ALL THE THINGS"
[21:08] Scrape and Pillage
[21:09] speaking of all the things, youtube should be part of that
[21:09] perhaps just 480p for the videos epitron doesn't like
[21:09] ivan`, the internet archive already does some of that
[21:09] ersi: not sure a reference to rape should be in the topic
[21:09] i agree that storage is increasing exponentially, and we have the ability to store these ancient things. the problem is that when something is going to be imminently deleted, you have to make some decisions about what to archive.
[21:09] epitron one man's trash is another man's treasure
[21:10] i think the RSS images should be archived as well
[21:10] Is that better then?
[21:10] ersi: better :)
[21:10] epitron: but you did not write any code to archive them
[21:10] Ravenloft: some things are trash to everyone ;)
[21:10] * epitron gives Ravenloft a rotten banana peel
[21:10] and, let's all agree to not mention archiving youtube
[21:11] * ersi takes the banana peel and burns it in his furnace
[21:11] ivan`: i've offerend a bunch of times to archive OPML
[21:11] mmmh, heat
[21:11] because if that time comes, we're going to be like a guy with a BB gun staring at the ICBM force of the united states
[21:11] epitron maybe for everyone today, but we cant know what will be valuable 100 years from now
[21:11] winr4r: I download most of the youtube videos I get linked and/or watch
[21:11] it's not going to work out too well
[21:11] ersi: i try to, too
[21:12] ersi: do you have a good tool for youtube?
[21:12] "Homer, why are you storing all those old calendars?" "Marge! We may not need them today... or tomorrow.. but who KNOWS what the future will bring!"
[21:12] Tephra: youtube-dl
[21:12] https://github.com/ludios/youtube-dl has a thing for re-downloading users without getting captchad
[21:12] Tephra: I use youtube-dl. They keep on a good fight against YouTube
[21:13] I rest my case, the point is that is hard to know beforehand what the future would consider valuable, so, if we can, why not preserve everything?
[21:13] thanks!
[21:13] i'm sure that the future, there will be generations of 14 year olds creating lots of crap, and people will be hoarding that without reading any of it as well :)
[21:13] *in the future
[21:13] except it'll be super high bandwidth crap
[21:13] with atomic precision
[21:14] ivan`: I assume ludious == you, correct? ;o
[21:14] yes
[21:14] oh, it's an org
[21:15] silly that one cannot follow organisations. I'll add your personal one then
[21:15] heh, weird
[21:19] http://bulbanews.bulbagarden.net/wiki/TRsRockin_shuts_down
[21:19] I raged.
[21:19] (I remember discovering this site when I was like ten, the glitch story archives were awesome.)
[21:19] is it in IA?
[21:20] nope, robots.txt ;(
[21:20] Yeah, a domain squater made IA delete it IIRC.
[21:20] I think a rogue robots.txt too
[21:20] Or maybe it was Rose herself.
[21:20] Either way it made me pretty angry.
[21:20] can IA please be configured to not block domain squatted robots.txt files? :(
[21:20] It could be saved regardless. I don't think IA deletes robots.txt'ed sites.
[21:21] But I also do not know and everyone hates talking about this stupid shit
[21:21] ersi: What stupid shit?
[21:21] IA could be reconfigured for this imho
[21:21] Of course it could
[21:21] This of all things should be on-topic for archive team.
[21:21] Even if it is just a stupid fan site.
[21:21] namespace: I'm not saying bulbanews/TRsRockin is stupid shit
[21:21] I meant the robots.txt-shit
[21:21] Oh.
[21:22] there are A LOT of sites which are unavailable because of this very issue.
[21:22] yes
[21:23] Unavailable, sure. We don't know if they're still there or not though.
[21:23] ia doesn't delete anything
[21:23] http://web.archive.org/web/20080515000000*/http://TRsRockin.com/robots.txt
[21:23] http://web.archive.org/web/*/http://TRsRockin.com/robots.txt even
[21:23] if you want to see what a big tragedy looks like when someone uses robots.txt, look at fotopic.net
[21:24] they were probably the biggest host of railway and bus pictures when they went down
[21:24] fotopic.net currently has no robots.txt
[21:24] stuff going back to the 1950s
[21:24] then they, somehow, managed to lose money with a business model of people giving them money
[21:25] and then they died
[21:25] I wish there were more wikis good looking and content rich as Bulbapedia
[21:25] xmc: I dunno, neither is confirmed
[21:25] balrog: all the sites were hosted on subdomains had a robots.txt that would return a 500 server error
[21:25] ersi: whatever
[21:25] so, it is all lost
[21:25] all of it
[21:25] I have other shit to do than argue with you on the internet
[21:25] xmc: Just saying.
[21:25] Ravenloft: That was the only site that had a news blurb about it shutting down.
[21:26] xmc: I'm not arguing with you. I was just saying that I have not seen either being confirmed nor denied.
[21:26] have not seen what?
[21:26] winr4r: :(
[21:26] what we're complaining about is something else though
[21:26] balrog: i'm talking dozens of photos i've used in my own research too
[21:26] we're complaining about squatters using robots.txt
[21:26] to "remove" archived sites from IA
[21:27] balrog: oh yeah, fuck those guys
[21:27] ersi:
[21:27] xmc: That the Internet Archive would delete data they are not "allowed" to show any more. I find it extremely unlikely and my personal belief is that they just "dark it"
[21:27] your intuition is correct
[21:27] stop sowing confusion
[21:27] imho, IA should have a list of known squatters, and refuse to retroactively respect robots.txt for those, and only those.
[21:27] of course robots.txt would be respected for new crawls
[21:28] But then again, it's always balrog that takes up this stupid discussion all the time
[21:28] hah
[21:28] balrog: and how would you define that?
[21:28] * ersi shrugs and shakes his head
[21:28] ersi: hey, that's not nice
[21:28] balrog = m. bison
[21:28] winr4r: he has a point though
[21:28] winr4r: Reality isn't nice
[21:28] And this discussion is stupid because we can't do anything about it.
[21:28] correct
[21:29] agree to disband
[21:29] balrog: Mail brewster about it
[21:29] Fine.
[21:29] namespace: I do think it sucks that it wasn't scraped though
[21:29] ersi: i usually get that line from people who are looking for an excuse to be unpleasant
[21:29] the TrSRockin site that is
[21:29] Well the irony is that I went to visit to grab it.
[21:29] Learned it went offline two years ago.
[21:30] winr4r: I try to only get away with it, if I actually have a point - and that point being said multiple times to the same people
[21:30] god these squatters are becoming nasty
[21:30] if you give them a curl UA, they tell you 404
[21:30] I'm occationally just stupid and mean to Blue Max regardless, though
[21:30] if you give them a browser UA, they give you a nasty redir
[21:30] balrog: "are becoming"?
[21:30] the thing is, should robots.txt follow the Asimov's robotics laws?
[21:30] So give them a real user agent then
[21:30] they're fucking *squatting domains*
[21:30] of a known browser version
[21:31] you're like the guy walking down skid row and asking why people aren't being nice
[21:31] balrog: When your business model is predicated on being a dick, expect dickery.
[21:32] Hey, what's up with all the misandry!
[21:32] his business model is shadaloo
[21:32] That's not very "PC" etc *shakes fist*
[21:32] I expect an equal ratio of cuntery as well as dickery
[21:34] actually equal would be unfair, the ratio should mirror the % of males and females on society
[21:35] How do you know I'm not talking about a ratio equal to zero
[21:35] wouldn't that be fairest?
[21:35] Okay, now *this* is stupid shit.
[21:36] :P
[21:36] I do have a point thought
[21:36] -t
[21:45] not all dicks are created equal, e.g. Moby Dick
[22:00] Looks like China is getting into the scraping business.
[22:01] cryptome.org/2013/02/prc-scraping-cryptome.htm
[22:05] they don't need to scrape -- cryptome guy will mail you a DVD :)
[22:06] Exactly.
[22:06] That's part of why it's funny.
[22:06] heh
[22:06] it's probably just some random dude :)
[22:07] The cryptome guy must have to deal with a mountain of shit like this.
[22:07] he should just post a torrent
[22:07] saves bandwidth
[22:07] probably won't cut into his sales either
[22:08] Well, like the text goes, he doesn't mind people wgetting cryptome either
[22:08] just don't do it every damn day :D
[22:08] epitron: I bet if he posted a torrent the PRC would still scrape his shit.
[22:09] I have to wonder if they're not doing it as a form of DDOS.
[22:09] hmmm
[22:09] the scrape does sound very strange
[22:10] "checking for new files and downloading them as well as repeated downloads of hundreds of random files with no discernible pattern"
[22:10] it's either a very bad scraper, or a weak, deniable DDoS
[22:10] -D
[22:11] hah
[22:11] maybe someone is trying to frame him for giving the chinese SECRETS
[22:11] "THE CHINESE DOWNLOADED 5 TB OF SECRET DOCUMENTS"
[22:11] china runs a lot of shitty scrapers
[22:11] btw
[22:11] "CRYPTOME GUY HAS BEEN ARRESTED AND FACES A SECRET TRIBUNAL"
[22:11] I've had some get the same file dozens of times within under a second from my server
[22:11] usually from china
[22:12] They don't even need to do a deniable DDOS, they obviously have the ISP's in their pocket.
[22:12] I'm surprised he didn't get back a response telling him the traffic doesn't exist.
[22:12] namespace: it's not a DOS as much as wasting his bandwidth :)
[22:14] it would be nice if there was some standard HTTP extension that let you get a sitemap to crawl, and could tell you about new files
[22:14] INDEX /
[22:15] or a better protocol, like git :)
[22:18] I would laugh if in the future people just started using git to host content.
[22:19] hahah
[22:19] i wouldn't put it past humanity
[22:19] "I'd like to submit a pull request for the front page of google."
[22:20] that would be a big repo
[22:20] Have you ever seen the source for that?
[22:20] It's...art.
[22:20] git clone google.com/.git => "packing objects: (1231238/19847398172102987410982749017234)"
[22:20] the source for google? :)
[22:20] Yeah.
[22:20] View source on the google home page, right now.
[22:21] it's very minified :)
[22:21] the CSS tags are named such that they have maximum huffman compression
[22:21] s/tags/names/
[22:21] *classes
[22:21] Like I said, art.
[22:22] i wouldn't call it ART.. i'm not sure what their message is. but it's definitely the result of a bunch of people who are too smart to be making websites
[22:25] http://mozilla.github.io/pdf.js/web/viewer.html
[22:25] lol
[22:26] pdf.js
[22:27] epitron: if it kills the adobe PDF plugin, i'm all for it
[22:27] pdf.js is good.
[22:27] I was actually really surprised how good it actually is.
[22:27] yeah
[22:27] *Remove one actually
[22:28] i was looking for something to convert PDFs to epubs
[22:28] if values of "it" are nuclear weapons then i am *still* for it
[22:28] this might be a good starting point
[22:28] epitron: you're on a losing battle there
[22:28] winr4r: Have you tried it yet?
[22:28] winr4r: it would have to have manual intervention
[22:28] namespace: tried pdf.js? yes
[22:28] it would need to be a UI that did semi-automatic conversion and region recognition
[22:29] The first time it loaded up my jaw dropped.
[22:29] but epubs are just html :)
[22:29] Reading PDF's was a PITA.
[22:29] namespace: it is quite impressive
[22:29] namespace: yes :)
[22:29] i wonder if it has JS exploits :)
[22:29] hope not
[22:29] but it says something
[22:30] when adobe's PDF plugin is so hideously fucking slow and memory-hogging and all the other bad shit
[22:30] that an interpreter written in fucking browser javascript can make it more bearable
[22:30] did you know that acrobat/plugin is the only product adobe has developed in-house?
[22:30] everything else is from companies they bought
[22:31] that explains a lot, for me :)
[22:32] It does explain a lot, actually.
[22:32] it doesn't explain anything at all
[22:32] epitron: By the way, Firefox actually uses PDF.js to display PDFs these days, in-browser.
[22:32] acrobat/plugin is an awful piece of shit
[22:32] flash is an awful piece of shit
[22:33] ersi: Yes, and it's awesome.
[22:33] flash is fast, at least :)
[22:33] everything adobe does is an awful, awful piece of shit
[22:33] namespace: Could be a tad faster, but it's damn fast for what it does
[22:33] namespace: I concur.
[22:33] uh, weren't PostScript / Type 1 / Adobe Type Manager etc. in-house?
[22:33] winr4r: big companies basically can't make software.
[22:33] all they can do is eat startups, and slowly corrupt the programs over time
[22:33] epitron: that's not true
[22:33] *basically* :)
[22:33] not precisely
[22:34] epitron: no, you're entirely wrong
[22:34] NO U
[22:34] U!
[22:34] U^2!
[22:35] jason is going to murder all of us when he gets back to a computer
[22:35] Meh!
[22:35] THIS!
[22:35] IS!
[22:35] AT-BS!
[22:35] hmm, and taking a gander at Illustrator's history too, so was that apparently
[22:36] ersi: *GETS KICKED DOWN A HOLE*
[22:37] \o/
[22:39] winr4r: Why would Jason murder us?
[22:39] It's not like we let Posterous/Xanga/etc burn.
[22:41] Baljem: illustrator was in-house?
[22:49] apparently so. a commercialisation of the tools they used themselves to create PostScript fonts
[22:50] (although I only glanced at the fountain of knowledge that any idiot can piss in to confirm that... so take it with a pinch of salt, perhaps. but PostScript is what the company was formed to do)
[22:52] "fountain of knowledge that any idiot can piss in"
[22:52] Love it.
[22:53] ;)
[22:55] Ahhh, Wikipoolia
[23:27] I love how Adobe flash always works with Pulse Audio. That is why it is such a great application.
[23:27] * omf_ shoots self in face
[23:27] Did you know Apple bought 19% of Adobe in the 80s, they are partially to blame
[23:45] mumble mumble pulseaudio mumble mumble