#archiveteam-bs 2013-07-09,Tue

Time Nickname Message
00:01 🔗 godane pc in cardboard box: http://imgur.com/a/wtd4Q
00:18 🔗 omf_ I like
00:58 🔗 joepie91 godane, servers in cardboard boxes: http://itknowledgeexchange.techtarget.com/IT-watch-blog/fdcservers-colocation-data-center-gives-new-life-to-the-term-boxen/
01:23 🔗 DFJustin the alienware sticker really makes it
01:35 🔗 Panaosnic http://www.thespaceinvaders.org/
03:19 🔗 underscor When you say the word "poop" your mouth does the same motion your butt hole does when you poop.
05:39 🔗 omf_ lol
05:40 🔗 omf_ POOP
06:12 🔗 * omf_ throws a pancake at SmileyG
06:13 🔗 * GLaDOS pancakes omf_ back with a SmileyG
06:21 🔗 omf_ I wonder how hard it would be to convince a client to buy me the Jakob Nielsen reports
06:33 🔗 omf_ Is the Kindle Paperwhite still the best ereader in terms of legibility? I am looking into finally getting an ereader instead of my laptop
06:35 🔗 omf_ I tried the latest nook out and I was not impressed
06:57 🔗 * BlueMax pancakes omf_
06:57 🔗 godane so i'm past 54k videos in my g4video-web collection
06:58 🔗 godane also i will be able to start uploading the 'g4 unpublic video ids'
07:38 🔗 SmileyG hmmm
07:38 🔗 SmileyG I have forgotten where to get the S3 script
07:39 🔗 SmileyG ah sorted it :P
07:44 🔗 godane uploaded: https://archive.org/details/17Bit_Collection_A
08:08 🔗 godane uploaded: https://archive.org/details/17Bit_Collection_B
08:20 🔗 omf_ So my spare computer fried
08:20 🔗 godane i'm grabbing the linux format dvds again
08:20 🔗 omf_ and I need to hook up an internal bluray drive so I pull out my backup from the closet. That bitch is so old it is running windows XP
08:21 🔗 omf_ The last time I used windows XP on a computer was 2003
08:21 🔗 omf_ I think I need to do some computer upgrades and get a more modern backup machine
08:22 🔗 omf_ I have had more computer failures this year than any year I can remember, it is such a bummer because we all get so used to stuff just working
08:23 🔗 omf_ We always go on about backing up data but what about having spare fully working computers around?
08:23 🔗 omf_ shit I gotta find an IDE hard drive
08:32 🔗 BlueMaxim hey SketchCow, you around?
08:36 🔗 godane so i'm uploading some of the dual side linux format dvds i had
08:36 🔗 godane disk 80, 88, and 138
08:54 🔗 omf_ The machine is so old it does not have sata on the motherboard and the sata card I have will not let me boot off the bluray drive ugh.
08:56 🔗 ersi boot.. off of bluray? woot
08:56 🔗 omf_ a linux live dvd
08:57 🔗 ersi ah, so not a bluray disc - multi-format drive ftw
08:57 🔗 omf_ I had over 25% of my bluray backup disks fail so I switched up to raid hard drives for backups which works great but I still have more disks to go through
08:58 🔗 ersi I don't have a single unit that can eat bluray
08:58 🔗 omf_ it is not worth it
08:58 🔗 ersi that's what I figured :)
08:58 🔗 omf_ I tried three different brands and they still suck
08:59 🔗 omf_ the media is just too fucking cheap
08:59 🔗 omf_ I am just trying to finish wrapping up a multi month media move
08:59 🔗 omf_ I want it over so I can pitch these bluray disks and drives
09:00 🔗 ersi sounds like a plan
09:00 🔗 omf_ never had this problem with cdrs or dvdrs
09:01 🔗 omf_ I tested some dvdr backups from 13 years ago and they still worked 100%
09:01 🔗 omf_ and yet a fucking bluray disc can go bad in under 2 years
09:01 🔗 ersi well, they figure most of the people who wanna back up that amount of data will never check it
09:01 🔗 ersi ;D
09:02 🔗 omf_ I just do not get how bluray can be such an abortion
09:02 🔗 ersi me neither
09:02 🔗 danneh_ well, they probably don't mind if the blurays go bad
09:02 🔗 omf_ then I remind myself that fucking SONY invented that shit
09:02 🔗 danneh_ goes bad? buy another one
09:03 🔗 omf_ HDDVD
09:03 🔗 omf_ how I miss you
09:04 🔗 GLaDOS I just chip my backups into rocks.
09:04 🔗 omf_ I thought about 32 or 64gb thumb drives but those can be inconsistent as well
09:04 🔗 ersi I just pour mine into the cloud
09:04 🔗 ersi I get it back when it rains
09:04 🔗 GLaDOS The cloud <3
09:04 🔗 ersi I don't always back up my data, but when I do, I put it in The cloud(TM)
09:05 🔗 omf_ Anyone ever seen a 24 hour computer parts store? we need that
09:05 🔗 ersi It's called the Internet and globalisation
09:06 🔗 omf_ they cannot get me the parts today so they fail
09:06 🔗 omf_ I can *walk* down the street and get 24/7 groceries or 24/7 pharmacy
09:07 🔗 ersi Well, I'm technically correct since you didn't specify that's what you wanted :)
09:07 🔗 omf_ true ersi
09:07 🔗 ersi but yes, that would be awesome
09:07 🔗 ersi I bet they'd be expensive though :)
09:08 🔗 omf_ 24/7 works in large cities. I live in a city that has close to 2 million people, I would like to think it could support 1 store like that
09:08 🔗 ersi Stockholm has about the same population
09:08 🔗 ersi the transit system shuts down at midnight
09:08 🔗 omf_ There is at least 10 24/7 groceries in town, hell we got 24/7 bowling alleys
09:09 🔗 omf_ buses shutdown here at midnight too
09:09 🔗 ersi Yeah, but fuck driving a car in Sweden
09:09 🔗 omf_ but the road is empty so bicycles all the way down :)
09:09 🔗 ersi ah, heh
09:09 🔗 ersi Figured that might be silly if buying electronics
09:10 🔗 omf_ I got two bicycles. One of them has the rear carrying brackets so I can haul shit
09:10 🔗 ersi I've been thinking of making a bike cart for my bike
09:12 🔗 omf_ I am going to buy two of these adapters since they are only $9.99 each
09:12 🔗 ersi What's a "cash call"?
09:14 🔗 ersi Sounds like a "thing", saw fifty names looking like 1800cashcall<variation> in my LJ-scraper
09:14 🔗 ersi 1800 makes me think "1-800" like US phone numbers
09:15 🔗 omf_ they are a mortgage loan shit house
09:15 🔗 ersi ah
09:15 🔗 omf_ spam from them is not surprising
09:16 🔗 ersi One title from CashCallMortages; "American Dream Special"
09:16 🔗 ersi I laughed
09:20 🔗 omf_ lol I just found my copy of "The Dark Half" in this old dvd drive
09:20 🔗 SmileyG heh backup on blu rays?
09:20 🔗 SmileyG hell, back up on optical media?
09:21 🔗 omf_ quality dvdrs you get more than 10 years
09:21 🔗 omf_ but now they are too small for most things
09:24 🔗 omf_ I am definitely going to drop some serious coin before the end of the summer to have working backup machines
09:24 🔗 omf_ then again for most backup purposes a raspberry pi is enough
09:51 🔗 omf_ I do not even have a 32bit linux cd anywhere
09:52 🔗 omf_ fuck it, I'll just wait for the store to open to do more, at least I will have most of this wrapped up by the end of the day
10:05 🔗 godane what brands of bluray are you using?
10:06 🔗 omf_ I tried all the verbatim types, windata and one other I cannot remember. I also used 3 different bluray burning branded drives. I have extensive data I plan on publishing once I finish up this last stack of disks
10:07 🔗 godane i have some verbatim
10:07 🔗 omf_ I burned over 500 disks over the course of two years
10:08 🔗 godane i also use memorex
10:08 🔗 omf_ If I had lots of spare money I would go back to tape
10:09 🔗 godane i don't have that option
10:09 🔗 godane if i'm lucky i can buy another 2TB usb drive in 3 to 4 months
10:09 🔗 omf_ I ain't got no money either
10:22 🔗 ersi Oh yeah, by the way: http://www.gamasutra.com/view/feature/195148/dwarf_fortress_in_2013.php?print=1
10:22 🔗 ersi Full of good gems
10:28 🔗 godane we may need to start archiving this guy's work: http://www.giantbomb.com/articles/ryan-davis-1979-2013/1100-4685/
10:30 🔗 godane his twitter account: https://twitter.com/taswell
10:39 🔗 godane so i'm mirroring the giant bombcast podcast
11:13 🔗 godane so i got some good news on the giant bombcast podcast
11:14 🔗 godane all mp3s are one rss feed
11:14 🔗 godane page
11:14 🔗 godane and i can sed it to have keywords, title, descriptions and dates
11:15 🔗 godane so it will be like when i was uploading cbc spark
11:21 🔗 winr4r hey, jason was on cbc spark
11:22 🔗 godane here is the collection: https://archive.org/details/spark_cbc
11:35 🔗 omf_ the facebook graph search is the easiest stalking program I have ever used
11:36 🔗 omf_ yeah they are doing wide release of it now
11:39 🔗 omf_ never mind. This query did not work "My friends who like photos of Cats"
11:39 🔗 omf_ that is fucking basic
11:42 🔗 omf_ and I get pinterest style photo pages
11:42 🔗 winr4r oh shit no
11:42 🔗 omf_ what the fuck am I seeing here
11:42 🔗 winr4r pinterest-style unaligned columns are a cancer
11:43 🔗 omf_ it is psychological warfare
11:43 🔗 winr4r i start to doubt myself, like if that's just me spergin' the fuck out, given how eeeeveryone is doing it now
11:44 🔗 omf_ most websites have shit ux and no taste
11:44 🔗 winr4r hey guys, github or bitbucket
11:44 🔗 omf_ INFINITE SCROLL
11:44 🔗 omf_ we got most of our stuff on github
11:45 🔗 omf_ I like looking at photos sorted by region where the person lives
11:46 🔗 winr4r okay github it is then
11:46 🔗 omf_ I am a super huge graph database nerd
11:46 🔗 omf_ https://github.com/ArchiveTeam/
11:48 🔗 omf_ I am now going state by state because I can
11:48 🔗 omf_ I bet they have freebase tied into this
11:48 🔗 winr4r i suppose if you're all using git, then it makes sense to use git, and github is the obvious choice for a repo
11:48 🔗 winr4r so there it is
11:49 🔗 omf_ well look what I found. Slutty Halloween pictures.
11:49 🔗 omf_ ^_^
11:51 🔗 Baljem hmm. does it recognise 'big knockers' as an attribute when searching for slutty Halloween pictures? ;)
11:51 🔗 omf_ let me try
11:51 🔗 omf_ this search just popped up in autocomplete
11:51 🔗 omf_ " Photos of my friends who have liked ArchiveTeam "
11:52 🔗 omf_ fucking skynet
11:52 🔗 Baljem hah
11:52 🔗 Baljem that's curious phrasing. "have liked" as opposed to "like"...
11:53 🔗 winr4r yes
11:53 🔗 omf_ "liked" is a facebook flag
11:54 🔗 omf_ I "liked" Rage Against the Machine
11:54 🔗 webjunkie they are trying to make the search "natural" language
11:54 🔗 omf_ for most people this shit is still too hard but I think it is pretty simple to use
11:55 🔗 omf_ then again I am a programmer
11:55 🔗 webjunkie Yeah, I've had the preview for months and I found it super easy
11:56 🔗 webjunkie like you said...if you want to creep on people it makes it insanely easy
11:57 🔗 Baljem mm, I can understand why they use that as the flag, but they use the "like" form a lot too; "So-and-so likes Something You've Never Heard Of", etc.
11:57 🔗 omf_ I am going to write a program to login as me and markov chain the fuck out of that search
11:57 🔗 Baljem which does it use in results, if any?
11:57 🔗 omf_ I can search by state but not by region of the country
11:58 🔗 omf_ they must not be using freebase because it has identifiers for regions and the like so it is possible to use that as a search criteria
11:59 🔗 godane holy crap
11:59 🔗 godane i may have figured out how to do multiple-line descs now
12:00 🔗 godane add this for line break: &#xD;&#xA;&#xD;&#xA;
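
A minimal sketch of why those entities give a blank line, assuming the description is stored as XML text. `&#xD;` and `&#xA;` are the character references for carriage return and line feed; the `description` element name and the sample text are hypothetical, since the actual metadata file is not shown in the log.

```python
import xml.etree.ElementTree as ET

# Two CR/LF pairs written as XML character references, as suggested above.
BREAK = "&#xD;&#xA;&#xD;&#xA;"

# Hypothetical element and text; the real feed layout is an assumption.
xml_text = f"<description>First paragraph.{BREAK}Second paragraph.</description>"

# The parser turns the references into literal CR and LF characters.
description = ET.fromstring(xml_text).text
print(repr(description))
```

Writing the line break as character references keeps the carriage returns intact, whereas literal line endings in the file may be normalised by an XML parser.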
12:03 🔗 SmileyG o_O????
12:03 🔗 SmileyG Sent 3156111360 bytes (1%)
12:04 🔗 SmileyG lolololol
12:04 🔗 omf_ yeah
12:04 🔗 godane i only found out about it cause i'm using a older version of cbc podcast desc page
12:04 🔗 SmileyG i can't remember where to view the S3 stats
12:04 🔗 omf_ I hope the store has the raspberry pi power adapters in stock
12:04 🔗 omf_ SmileyG, that page is offline due to security issues
12:05 🔗 SmileyG omf_: doh!
12:05 🔗 omf_ nevermind they put it back up
12:05 🔗 omf_ must have fixed it. http://archive.org/stats/s3.php
12:05 🔗 omf_ http://home.us.archive.org/~tracey/mrtg/ is back too
12:06 🔗 winr4r https://github.com/lewiscollard/mwlinkscrape
12:22 🔗 ersi http://highscalability.com/blog/2013/7/8/the-architecture-twitter-uses-to-deal-with-150m-active-users.html
12:23 🔗 ersi Holy moley
12:30 🔗 SmileyG At any given time there’s about 1 million sockets open to the push cluster. :O
12:30 🔗 ersi 22MB/s data on ze firehose
12:30 🔗 SmileyG yah
12:31 🔗 ersi 176Mbit/s~
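
The conversion above is just bytes to bits. A small sketch, with a hypothetical helper name, that checks the 22 MB/s figure:

```python
def megabytes_to_megabits(mb_per_s):
    """Convert a rate in MB/s to Mbit/s (1 byte = 8 bits)."""
    return mb_per_s * 8

# 22 MB/s is the firehose figure quoted from the article.
firehose_mbit = megabytes_to_megabits(22)
print(firehose_mbit)
```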
12:31 🔗 SmileyG 1 million sockets, so even if we presumed consumers like... google, have 1 per server for a large part of what ever process they are sucking them down via.... DAMN thats a lot of people listening
12:36 🔗 winr4r ersi: very interesting
12:39 🔗 SmileyG heh, awesome for testing our line here :D
12:40 🔗 ersi SmileyG: Testing your line?
12:40 🔗 SmileyG ersi: at work
12:40 🔗 SmileyG 500Mbit
12:40 🔗 godane does anyone know about the cbc spark missing podcasts yet?
12:41 🔗 godane there are still missing podcasts on there from what i can tell
12:41 🔗 winr4r nope
12:41 🔗 winr4r also: http://officialpiczoblog.piczo.com/
12:42 🔗 winr4r this has been "soon" since october 2012
12:42 🔗 ersi SmileyG: You won't get access to the Firehose
12:43 🔗 ersi So dream on
12:43 🔗 winr4r wat do?
12:45 🔗 SmileyG ersi: Oh I know that
12:45 🔗 SmileyG just saying it'd be neat.
12:47 🔗 Baljem the line "Internal clients use roughly the same API as external clients" amuses me somewhat
12:48 🔗 Baljem didn't Microsoft try that excuse with the DoJ a while back? ;)
12:48 🔗 ersi roughly usually means "does not use anything similar to the public APIs"
12:48 🔗 ersi Baljem: Uh? What?
12:49 🔗 ersi It's usually a good idea to use the same API for external as well as internal things, you get a lot better quality cover
12:49 🔗 ersi not always, but usually
12:49 🔗 Baljem I seem to recall one of the MS anti-trust claims revolved around internal APIs allowing MS apps to do things that 3rd party apps couldn't
12:49 🔗 ersi Kind of different when you're installed on every damn computer
12:50 🔗 Baljem which is why there's now a slew of MSDN pages warning about the functions they document being subject to random breakage, etc.
12:50 🔗 Baljem well, sure, but the hype behind social media would suggest that it's not all /that/ different ;)
12:50 🔗 Baljem (I'm not *entirely* serious here, you understand, just musing randomly)
12:50 🔗 ersi It's not all that different from Operating Systems?
12:51 🔗 Baljem ... no, it's not all that different from the point of view of blocking competitors apps
12:51 🔗 Baljem which, as I recall, Twitter has recently been doing with quite some glee
12:52 🔗 Baljem I mean, OK, from one point of view it's just a load of whining about ephemeral nonsense that nobody cares about
12:52 🔗 Baljem but on the other hand, Twitter /is/ fairly big and dominant in its sector, isn't it?
12:52 🔗 winr4r Baljem: so was myspace, so was friendster, so was geocities
12:53 🔗 Baljem ah, but did they provide an API and (initially) encourage 3rd party developers to write competing client apps?
12:54 🔗 winr4r yes
12:54 🔗 winr4r they don't owe anyone else a business model
12:54 🔗 winr4r i mean it's crappy and awful to punch your ecosystem in the face, but it's not antitrust material
12:55 🔗 Baljem well, no, I suspect the entire thing is too frivolous. but it seems to be the same arguments, just a matter of degree
12:56 🔗 Baljem (NB: not thinking about the abusive-monopoly part of the MS antitrust suits, but the unfair-advantage-over-ISVs part)
13:04 🔗 winr4r eh, i don't know
13:05 🔗 * ersi just shrugs and ignores the conversation
13:06 🔗 Baljem yeah, sorry, I just have a weird sense of dry humour at times. I'll shut up ;)
13:07 🔗 * winr4r pets Baljem
13:08 🔗 ersi Humour? Anti-trust and abusive-monopolies? Mmmh.
13:09 🔗 Baljem well, my initial comment was a tongue-in-cheek "where have we heard that before?" type joke. it kinda spiralled out of control when I started over-thinking it ;)
13:11 🔗 ersi Right
13:19 🔗 SmileyG Baljem: how much did you pay for twitter again?
13:19 🔗 SmileyG Thats the big thing everyone seems to forget about windows _ALL THE TIME_
13:19 🔗 SmileyG you paid for it, if it came with your PC.
13:20 🔗 ersi You pay with brain cells, when using Tweetarh
13:21 🔗 winr4r is this horse dead yet folks best beat it to make sure
13:21 🔗 ersi Let's
13:21 🔗 winr4r whack whack whack
13:21 🔗 ersi You fuckin' horse!
13:21 🔗 * ersi kicks it about
13:21 🔗 * ersi sends his dwarf army to beat up the horse
13:21 🔗 * GLaDOS horses the dead horse around with a bit of horse
13:22 🔗 GLaDOS YEAH, LETS HORSE THIS DEAD HORSE
13:22 🔗 ersi HORSE IT REAL GOOD
13:22 🔗 Baljem heh. I'm not touching it. I apologise unreservedly for dragging the horse into the ring in the first place ;)
13:22 🔗 * ersi brings another horse into the ring
13:22 🔗 ersi The game is ON! Go living horse, beat that dead horse!
13:22 🔗 Baljem is it pre-dead?
13:22 🔗 Baljem oh
13:24 🔗 Baljem I don't think your horse understood your instruction, ersi :(
13:25 🔗 ersi Yeah, it killed itself
13:26 🔗 Baljem skills!
13:26 🔗 ersi so now, two dead horsies
13:26 🔗 GLaDOS It was all like "NEIGH I'M A HORSIE" and then died.
13:26 🔗 GLaDOS Rather spectacular
13:27 🔗 winr4r spooky thing: there's a drama on TV *right now* that i'm not following, and someone just euthanised his horse
13:27 🔗 Baljem apropos of nothing, I have just learned of the existence of 'Fizzy Pepsi Cola' flavoured corn snacks. uh, what.
13:27 🔗 winr4r so how about that
13:27 🔗 Baljem is that a euphemism for something, winr4r? ;)
13:27 🔗 SketchCow A horse is a horse, euthanized of course
13:28 🔗 Baljem "excuse me a minute, I'm just off to, uh, euthanise my horse, if you know what I mean"
13:28 🔗 SketchCow And nobody shoots a lame horse of course
13:28 🔗 winr4r morning SketchCow
13:28 🔗 godane hey SketchCow
13:28 🔗 godane i'm starting to upload more linux format dvds
13:29 🔗 SketchCow Bed delivery here in NYC
13:29 🔗 SketchCow Before heading north
13:30 🔗 SketchCow We turned the heat up on the internet archive digitizer upstate
13:33 🔗 ersi Hmmm, wouldn't one say "zed" to the Z character? Not "C" [see]? Heard several say "see"/"C" instead of zed ;o
13:33 🔗 winr4r ersi: "zee" in american english
13:33 🔗 winr4r "zed" is british/commonwealth english
13:33 🔗 ersi Wow, alright
13:34 🔗 ersi 'cause "zee" sounds a lot more like "C"
13:34 🔗 godane SketchCow: i'm going to be uploading giantbombcast podcast
13:34 🔗 godane one of the guys died at 34
14:00 🔗 winr4r hey, does anyone know what format this is in: http://archive.org/details/2013_common_crawl_index_urls
14:00 🔗 winr4r when uncompressed
14:01 🔗 winr4r (can't download 21gb of stuff here)
14:02 🔗 ersi I'm sure ivan` does, since he made it :)
14:02 🔗 ersi I guess it's just a textfile with URL\nURL\n
14:03 🔗 winr4r ah okay
14:04 🔗 winr4r i'm writing a page on teh wiki, i'd like to know for sure
14:05 🔗 Baljem hmm. is there a trick to downloading that file with curl? I can grab it from the browser but I don't want the whole thing, obviously, just want to pipe it into bunzip2 | head...
14:07 🔗 winr4r curl --no-buffer thing | bunzip2 | head ?
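
The same idea as the `curl | bunzip2 | head` pipeline can be sketched in Python: decompress the bzip2 stream incrementally and stop once enough lines have arrived, so the full 21 GB is never read. The function name `head_bz2` and the sample data are hypothetical; for a real download the stream could be any object with a `read()` method, such as the response returned by `urllib.request.urlopen`.

```python
import bz2
import io

def head_bz2(stream, n_lines=10, chunk_size=64 * 1024):
    """Return the first n_lines of a bzip2 stream, reading only what is needed."""
    decompressor = bz2.BZ2Decompressor()
    buffer = b""
    # Keep feeding compressed chunks until enough newlines have been seen.
    while buffer.count(b"\n") < n_lines and not decompressor.eof:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # input ended early
        buffer += decompressor.decompress(chunk)
    lines = buffer.splitlines()[:n_lines]
    return [line.decode("utf-8", errors="replace") for line in lines]

# Made-up sample data standing in for the real index file.
sample = bz2.compress(
    b"com.example/:http\norg.example.www/about:http\nnet.example/:https\n"
)
first_two = head_bz2(io.BytesIO(sample), n_lines=2)
print(first_two)
```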
14:09 🔗 Baljem ah, got it - right-click, copy link gives a redirector link, I think, grabbed the actual file URL from the Web Inspector and that worked
14:09 🔗 winr4r oh! i misunderstood the question
14:09 🔗 Baljem want me to paste the first few lines in query as an example, winr4r, or is that not going to be particularly useful?
14:10 🔗 SmileyG pastebin that stuff babbbby
14:10 🔗 winr4r Baljem: yup, pastebin pl0x
14:12 🔗 Baljem http://pastebin.com/zfqwbPRW
14:12 🔗 Baljem sorry for the delay, brain went dead and I couldn't remember where the ArchiveTeam pastebin thingy was - so used the boring one ;)
14:13 🔗 SmileyG those look weird.
14:13 🔗 GLaDOS ..it's flipped them around
14:13 🔗 SmileyG but just split on :
14:13 🔗 SmileyG GLaDOS: non.
14:13 🔗 winr4r hm!
14:14 🔗 GLaDOS what
14:14 🔗 winr4r maybe the ones at the top are malformed ones
14:14 🔗 SmileyG actually maybe it has
14:14 🔗 winr4r and the rest are all normal
14:14 🔗 SmileyG winr4r: ones at the top are mailto's
14:15 🔗 winr4r SmileyG: right, but if it's *normally* flipped around, then why would there only be two .coms at the top?
14:16 🔗 ersi lool, why does it start with tld and end with protocol
14:16 🔗 ersi shrug
14:17 🔗 Baljem shades of JANET addressing there!
14:17 🔗 winr4r ersi: yeah, that makes me think that those ones are outliers in the set
14:17 🔗 winr4r at least on the count of starting with the tld
14:18 🔗 ersi haha, nice URLs :D #15 -> http://www.meatmembers.com (Yes, it's as NSFW as it sounds)
14:18 🔗 Baljem I was disappointed by the apparent content of http://www.bindher.com ;)
14:18 🔗 SmileyG dear lord :D
14:18 🔗 * SmileyG doesn't click either.
14:19 🔗 GLaDOS That's why the second one sucks.
14:19 🔗 GLaDOS "<meta name="generator" content="Quick 'n Easy Web Builder - http://www.quickandeasywebbuilder.com">"
14:19 🔗 SmileyG Anyone want to help me moving 50kg racks containing 30kg powered on, live servers?
14:19 🔗 winr4r SmileyG: yes
14:19 🔗 GLaDOS SmileyG: yeah sure why not
14:19 🔗 Baljem OK, a handful taken from around line 2,500,000 look the same
14:20 🔗 Baljem e.g. "ar.com.elsurhoy/:http"
14:20 🔗 SmileyG don't ask why the racks weigh so much, no one seems to be able to answer other than "Like their servers, dell racks weigh far more than thought possible".
14:20 🔗 winr4r Baljem: hmm
14:20 🔗 SmileyG someone needs to for x in $(cat ./list); do wget "$x"; done these.
14:20 🔗 * SmileyG looks at underscor
14:20 🔗 Baljem mind you, line 2,500,000 is only 20.2MB into the file
14:21 🔗 * winr4r does magic spell, summons underscor
14:21 🔗 winr4r Baljem: yes, exactly
14:21 🔗 SmileyG yey my S3 upload is 2% done.
14:21 🔗 SmileyG use this to build markov chains. Instantly know all websites ever?
14:21 🔗 Baljem the machine I was doing that from pulls it down at about 1MB/sec so if we wanted to pick a suitably huge number I could try again... just can't save it because it doesn't have enough disk space
14:22 🔗 winr4r Baljem: thanks for helping anyway :)
14:22 🔗 Baljem holy shit, 28MB free on /? I need to shuffle some VMs around. turn the old host crate into a Warrior machine perhaps
14:22 🔗 winr4r i think it may well be backwards as you suspect
14:25 🔗 Baljem would seem a curious format to provide results in though
14:26 🔗 Baljem well, apart from it allowing sorting by TLD, that is
14:26 🔗 Baljem ... whether that's a useful thing to make easy is another matter
14:32 🔗 winr4r Baljem: i'm throwing together a set of tips for finding sites on a soon-to-be-dead host
14:32 🔗 winr4r (related to what i threw up on github earlier)
14:32 🔗 winr4r so, making it easy is not a concern
14:32 🔗 Baljem *nods*
14:33 🔗 Baljem I was just trying to think of motivations for reversing the FQDN, really, and coming up short :)
14:33 🔗 winr4r ohh, gotcha
14:34 🔗 winr4r sorry misunderstood
14:34 🔗 Baljem I really need to stop musing out loud *grin*
14:35 🔗 Baljem some sets of hints and tips certainly sound useful though. when I got involved with you crazy lot a while back, it was a case of pestering people for help and hoping not to piss anyone off with total ineptness
14:35 🔗 ersi Probably for sorting on TLD, like you said earlier - I wouldn't find that super weird, albeit maybe a little
14:36 🔗 ersi Or maybe the common crawl people got Java in their brains and made them look like packages
14:36 🔗 Baljem yeah, it just seems a bit of an edge case to optimise for
14:36 🔗 ersi com.oracle.utils.lols.FinderCakeFactory
14:36 🔗 Baljem heh, yes
14:37 🔗 winr4r ersi: yup
14:37 🔗 ersi or maybe ivan` sorted it that way :) I know he did a top-list per TLD
14:38 🔗 Baljem right, line 50,000,000 is "be.zoover.www/griekenland/lefkas-levkas/hortata/sitemap:http"
14:38 🔗 Baljem I think it's probably safe to assume there aren't that many 'weird' entries, and that is in fact normal
14:38 🔗 Baljem (that corresponds to about 392MB into the file)
14:38 🔗 winr4r Baljem: thanks man :)
14:39 🔗 winr4r Baljem: 392mb into the uncompressed or compressed one?
14:39 🔗 Baljem compressed
14:39 🔗 winr4r Baljem: alright yeah, it probably is backward-ordered then
14:39 🔗 Baljem so approximately the 2% mark
14:39 🔗 winr4r thanks :)
14:39 🔗 Baljem no worries. it was a welcome distraction from the horrors of Other People's Code ;)
14:40 🔗 winr4r haha
14:41 🔗 Baljem seriously. just found a file full of MFC code, with a comment buried in the middle "this is a workaround for an issue with Borland Turbo C++ 1.0"
14:41 🔗 Baljem ... wat.
14:41 🔗 winr4r wat
14:42 🔗 Baljem ... it was supposedly written in 2003!
14:42 🔗 Baljem it doesn't help that this project wasn't under source control until mid-2005 when I took over from the ex-boss :(
14:43 🔗 winr4r :(
14:59 🔗 winr4r http://www.archiveteam.org/index.php?title=Site_exploration
15:04 🔗 SmileyG good job :)
15:06 🔗 winr4r i'll probably clean that bing script up into something worth releasing soon enough
15:11 🔗 ivan` ersi: Baljem: the commoncrawl data is unmodified (using their strings as-is)
15:11 🔗 godane i found a lost episode of CBC Spark: https://archive.org/details/spark_20070919_3346
15:11 🔗 ivan` I have a Python script that turns it into real URLs but note their data is sometimes ambiguous
15:12 🔗 ersi ivan`: ah :)
15:12 🔗 ersi ivan`: See any point in their format?
15:13 🔗 ivan` it is useful for sorting domains properly but they did not do it right (can't tell port apart from :n at the end of a URL)
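
A rough sketch of the reversal discussed above, based only on the sample lines quoted in the channel (`reversed.host/path:scheme`). This is not ivan`'s script; the function name is hypothetical, and it deliberately ignores the port ambiguity he mentions, so a key whose URL ends in `:n` or carries a port may come out wrong.

```python
def key_to_url(key):
    """Turn 'tld.domain.sub/path:scheme' into 'scheme://sub.domain.tld/path'."""
    rest, sep, scheme = key.rpartition(":")
    if not sep:
        raise ValueError(f"no scheme suffix in key: {key!r}")
    # Everything before the first slash is the host, written TLD-first.
    reversed_host, slash, path = rest.partition("/")
    host = ".".join(reversed(reversed_host.split(".")))
    return f"{scheme}://{host}{slash}{path}"

print(key_to_url("ar.com.elsurhoy/:http"))
print(key_to_url("be.zoover.www/griekenland/lefkas-levkas/hortata/sitemap:http"))
```

Keeping the host TLD-first means a plain lexical sort groups all hosts under one domain together, which fits the sorting motive suggested in the channel.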
15:16 🔗 winr4r ivan`: can you shove that script into paste.archivingyoursh.it please? :)
15:16 🔗 winr4r godane: good job :)
15:17 🔗 ivan` I just pasted to https://www.refheap.com/16448/raw
15:17 🔗 ivan` Baljem: ^
15:17 🔗 ivan` hastebin is bad software and I try to avoid it
15:18 🔗 winr4r yeah, but on the other hand, it's run by us
15:18 🔗 winr4r or rather, GLaDOS i think
15:18 🔗 ersi indeed
15:19 🔗 ersi ivan`: Guess they assume everything either runs on :80 or :443 and just care about the end being http:// or https://
15:19 🔗 ivan` https://github.com/trivio/common_crawl_index/issues/12
15:19 🔗 ivan` there are other bugs
15:20 🔗 ivan` https://github.com/trivio/common_crawl_index/issues
15:20 🔗 Baljem ivan: nice - I was just fiddling with a really nasty Perl one-liner to do it, but wasn't liking it at all
15:21 🔗 ivan` no warranty, doesn't handle some of their broken stuff
15:21 🔗 ivan` at least it doesn't crash ;)
15:22 🔗 Baljem well, it saves me making a pillock of myself by suggesting a hack that only handles the test cases I threw at it ;)
15:23 🔗 winr4r ivan`: thanks, updated the page :)
15:35 🔗 godane we may have a problem: http://torrentfreak.com/shut-down-the-pirate-bay-founder-says-130708/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+Torrentfreak+%28Torrentfreak%29
15:37 🔗 DFJustin archive.org does the same backwards thing with domain names if you look at the index files they generate https://archive.org/download/archive.pdp11.org.ru-20130504/archive.pdp11.org.ru-20130504.cdx.idx
15:37 🔗 DFJustin they seem to automatically fold example.com and www.example.com into the same wayback machine entry so it probably facilitates that
15:38 🔗 winr4r oh uh
15:39 🔗 winr4r we've *had* a problem: apparently piczo actually disappeared
15:39 🔗 winr4r piczo.com returns a "we're dead", but subdomains (on which the sites were hosted, and where all the shit was) still work
15:40 🔗 winr4r and there's lots of other pages that still seem to work
15:40 🔗 winr4r (piczo: like geocities but more so)
15:42 🔗 winr4r or "social geocities"
15:42 🔗 Baljem "There's actually been more downtime for the site due to drunk admins, than downtime due to raids." <- heh. my servers always wait until I'm well oiled to demand attention, too
15:42 🔗 Baljem is piczo another one we only hear about on the day it actually dies? :(
15:43 🔗 winr4r Baljem: it was announced a while back as "closing down soon"
15:43 🔗 Baljem ah, OK. first I'd heard of the site was someone linking to that 'closing soon' blog post earlier today
15:44 🔗 winr4r yeah, that was me
15:44 🔗 winr4r it actually happened earlier this year some time
15:45 🔗 winr4r they seem to be keeping the sites themselves up though
15:45 🔗 winr4r for how long, who knows
15:47 🔗 winr4r https://www.facebook.com/Piczo
15:47 🔗 winr4r the posts here are a fascinating mix of "nooooo why did you close piczo" and "please delete my site" (aka "oh shit, this still shows in search results for my name")
15:55 🔗 Baljem I'm not sure whether to be worried that, scrolling down that page, the thing that jumped out at me was "is that a Canonet she's using?"
15:56 🔗 Baljem but I think it isn't because I don't see a photocell at the top of the lens. bah.
15:56 🔗 winr4r https://fbcdn-sphotos-f-a.akamaihd.net/hphotos-ak-ash4/418000_10150556564238479_643127632_n.jpg ?
15:56 🔗 winr4r hm, good question
15:56 🔗 Baljem yep
15:57 🔗 winr4r definitely late-60s/early-70s rangefinder styling
15:57 🔗 Baljem it looks a lot like the Canonet 28 I just refurbed, except that would have a chrome barrel and I think a wider RF glass
15:57 🔗 winr4r take off the lens and it could be an olympus trip 35, but the writing on the lens is too big
15:57 🔗 winr4r oh!
15:58 🔗 winr4r olympus 35 RC
15:58 🔗 winr4r compare: http://kenrockwell.com/olympus/images/35rc/D3S_6137-1200.jpg
15:58 🔗 Baljem yes. nice spot!
15:58 🔗 winr4r fuck, we're nerds :(
15:59 🔗 Baljem haha. I did say it could be a worry *grin*
16:45 🔗 godane uploaded: https://archive.org/details/cdrom-linuxformatmagazine-80
16:47 🔗 Baljem huge piles of <3 to whoever decided the Wayback Machine needed to take a nibble at Adaptec's downloads site -- it's been offline for months, and I only just thought to look at archive.org for the updated drivers...
16:49 🔗 * DFJustin salutes the kindly wayback robots
16:57 🔗 Baljem welp. turns out it wasn't my imagination playing tricks on me - my RAID array is actually degraded, despite the controller BIOS claiming the drives are healthy and the mirror is resyncing
16:57 🔗 winr4r :<
16:57 🔗 Baljem now I've got the Storage Manager installed again, it tells me drive 1 has failed and the array is degraded. *thumbs up*
16:57 🔗 Baljem might be a good excuse to upgrade from a pair of 1TB drives though
16:58 🔗 Baljem aww crap, if I'd realised that a fortnight ago I could have snuck it into the last financial year for general taxation goodness
16:59 🔗 winr4r aw
17:34 🔗 balrog http://www.ibtimes.co.uk/articles/487583/20130708/valve-hardware-ellsworth-management-firing-half-life.htm
17:40 🔗 balrog http://jenesee.com/?p=941
17:41 🔗 winr4r hey, it's jeri ellsworth
17:42 🔗 winr4r jeri is pretty awesome by any measure
18:39 🔗 godane winr4r: Here is Jeri Ellsworth interview on Triangulation: https://archive.org/details/Triangulation_3
18:59 🔗 godane the new GTA V game looks awesome
19:02 🔗 underscor what's a fortnight again?
19:02 🔗 underscor two weeks?
19:02 🔗 godane yes
19:06 🔗 ersi I wouldn't want to stay in a fort, over night
19:07 🔗 winr4r godane: hey thanks!
19:08 🔗 winr4r i really do admire jeri ellsworth
19:08 🔗 winr4r any company culture that ends up getting rid of a jeri ellsworth is really, totally fucked and doomed
19:18 🔗 godane now this is interesting: http://www.ebay.com/itm/Vintage-Apple-Computer-Power-Mac-G4-In-Store-Demo-CD-ROM-v-2-Holiday-1999-Tower-/290713049363?hash=item43afd91913
19:19 🔗 godane its sadly $109.99
19:41 🔗 godane SketchCow: IBM PC Demo Reach For The Skies By Virgin: http://www.ebay.com/itm/IBM-PC-Demo-Reach-For-The-Skies-by-Virgin-/400515188547
19:42 🔗 godane i think you would want to get it
19:46 🔗 DFJustin http://archive.org/details/ReachfortheSkies_1020
21:03 🔗 joepie91 SketchCow: if you ever plan on giving a talk somewhere in the Netherlands (or Belgium), be sure to let me know
21:04 🔗 ersi joepie91: Are you coming to OHM?
21:11 🔗 ersi http://www.youtube.com/watch?v=_nzD-QpmePE
21:19 🔗 arkhive http://torrentfreak.com/shut-down-the-pirate-bay-founder-says-130708/
21:23 🔗 ersi Well, I think I understand what he's saying and I guess I sort of agree with him
21:23 🔗 ersi But it's not shutting down just yet
21:24 🔗 balrog he's saying that it needs to be replaced
21:29 🔗 ersi In order for something better, more secure, to pop up and really get developed and continuously maintained
21:29 🔗 ersi Well, I guess I agree. But he's not involved with the operational part of TPB anymore, so it's just a statement - not a "We're shutting down!"-notice
21:37 🔗 joepie91 ersi: I wasn't planning to, why?
21:38 🔗 joepie91 arkhive: I think he's missing a vital point.
21:39 🔗 joepie91 being that by shutting down TPB, it would just drive traffic to other sites.
21:39 🔗 joepie91 using their audience to promote an alternative would be much much more effective
21:40 🔗 dashcloud I think they want something to grow or promote itself out of the void
21:40 🔗 joepie91 goes to show that ideas are not always compatible with reality
21:41 🔗 joepie91 the chance of something just magically appearing with no backing at all and gaining a critical mass while no flaws in the current system are obvious to the majority of users
21:41 🔗 joepie91 is close to zero
21:41 🔗 joepie91 "why would I use that complicated thing? TPB works fine!"
21:41 🔗 joepie91 if TPB goes down
21:41 🔗 joepie91 "why would I use that complicated thing? kat.ph works fine!"
21:41 🔗 joepie91 etc
21:50 🔗 antomatic Have to agree - new things get created and thrive because they're better than the old thing - not because the old thing has gone
21:51 🔗 antomatic No-one believes, for example, that Google could not have come along until Altavista shut down. And we know the reverse is true.
21:54 🔗 antomatic Could probably have phrased that better.
21:55 🔗 antomatic "Google didn't happen just because AltaVista shut down."
22:05 🔗 ivan` existing supply can crowd out new supply
22:05 🔗 ivan` people rightly perceive switching costs
22:09 🔗 SketchCow Here's a conversation I've gotten sick of.
22:10 🔗 SketchCow 1. "We should scan in _____ because all the online copies suck."
22:10 🔗 SketchCow 2. "But wait! Copyright! We will all go to jail."
22:10 🔗 SketchCow 3. "You're right, which is a shame, I have all the copies right here."
22:10 🔗 SketchCow 4. <5000 lines about how to scan things "right">
22:10 🔗 SketchCow Rather sick of said conversation.
22:14 🔗 joepie91 SketchCow: whenever I end up in such a conversation, I usually just say "hey, I have this nice VPS in Romania where the host doesn't give a shit about what's hosted on it, if you'll just give me a pile of scans I'll make sure they end up somewhere nice"
22:14 🔗 joepie91 it works most of the time.
22:56 🔗 underscor Man, there really is something nice about a file downloading at 27MBps at your desk
22:57 🔗 underscor Makes a sexy spike in the bandwidth graph for the switch, too
22:57 🔗 DFJustin >:(
23:00 🔗 underscor <3
