#archiveteam-bs 2015-09-20,Sun

Time Nickname Message
00:03 arkiver2 has quit IRC (Client Quit)
00:55 JesseW has quit IRC (Read error: Operation timed out)
00:57 superkuh_ has quit IRC (Read error: Operation timed out)
00:57 superkuh_ has joined #archiveteam-bs
01:11 Stiletto has joined #archiveteam-bs
01:12 Stilett0 has quit IRC (Read error: Operation timed out)
01:14 JesseW has joined #archiveteam-bs
01:20 JesseW has quit IRC (Read error: Operation timed out)
01:29 primus105 has quit IRC (Leaving.)
02:18 dashcloud has quit IRC (Read error: Operation timed out)
02:24 dashcloud has joined #archiveteam-bs
03:19 JesseW has joined #archiveteam-bs
04:27 primus104 has joined #archiveteam-bs
04:38 aaaaaaaaa has quit IRC (Leaving)
05:30 primus104 has quit IRC (Leaving.)
06:02 anomie I'm arguing with someone about url shorteners…
06:28 yipdw sounds like a bad argument to get into
06:36 JesseW has quit IRC (Read error: Operation timed out)
06:37 anomie Yeah.
06:37 anomie He thought it was better that he was using his own url shortener… -_-
06:48 wp494 AKA his own pile of dog shit that he'll just kill because at some point it will become too hard to properly maintain
06:49 wp494 or he realizes it was a stupid idea in the first place to even try it
06:50 anomie Yeah. That's what I tried to tell him.
06:50 anomie Oh well. Archiveteam exists because such people are a fact of life, I suppose.
06:52 anomie To be fair, it's a little low on the list of offenses one can commit against archivists.
06:53 bentpins Heard of the QR codes on headstones trend? That's a grave offense
06:55 anomie What are headstones?
06:56 anomie Also, is there an actual list of such offenses? I think that'd be neat.
06:56 bentpins The stone bit in a cemetery that goes above the body with the details on the person that died
06:58 anomie Oh…
06:59 anomie Dear god, why?
07:00 bentpins There's an article on it, but I mainly mentioned it for the pun. http://www.theatlantic.com/technology/archive/2014/05/qr-codes-for-the-dead/370901/
07:01 anomie Dear god… doesn't this kinda defeat the purpose of a headstone?
07:02 bentpins Yeah it just seems like someone totally missed the point
07:02 anomie Yeah.
07:02 Ctrl-S the point of headstone is basically so you know what grave is what
07:02 bentpins wow the one in the article already 404s
07:02 anomie I mean, part of the reason the headstone is made out of what it's made of, is that it will last a very long time with minimal maintenance, right?
07:03 anomie bentpins: … there are no words for the antipathy I feel right now
07:04 anomie I found this old HN thread. Many of the URL shorteners mentioned don't exist anymore. https://news.ycombinator.com/item?id=508132
07:05 PurpleSym has joined #archiveteam-bs
07:06 anomie That being said, is there a "better" way to run a url shortening service?
07:07 Ctrl-S make it open source with a way to download the whole thing
07:07 anomie Despite their abuse, I can still imagine them being used urls on paper.
07:07 anomie *useful for
07:07 Ctrl-S so you can clone it trivially
07:07 anomie Ctrl-S: Fair point.
07:07 anomie I mean, a url database isn't that large.
07:08 Ctrl-S so you can grab the codebase from their repo and a daily dump of the DB
07:08 anomie I'm not a fan of the blockchain buzzword, but maybe that could be used effectively.
07:10 bentpins I'm surprised shorteners haven't been going rogue and running drive-by downloads and MITM attacks
07:11 anomie I am too.
07:11 anomie I'm guessing it might be because very few become really popular.
07:12 bentpins probably
07:15 anomie bentpins: Wow. I know it's just for a cat, but I'd at least expect the qr code to be engraved. http://cdn.theatlantic.com/assets/media/img/posts/2014/05/pet_memorial_qr_code/25b9ea967.jpg
07:16 GLaDOS anomie: easier to update, i guess?
07:17 anomie >updating a headstone
07:17 anomie Uhmm…
07:17 bentpins hah
07:17 GLaDOS ok look, when that page goes down, how else will they rehost it?
07:18 bentpins Or when the domain http://www.foreverheadstone.com/ expires and gets bought...
07:19 GLaDOS >credit mistakes
07:19 GLaDOS just exactly what somebody reminiscing needs to be reminded about
07:19 GLaDOS "HEY REMEMBER THAT TIME YOU BOUGHT THAT EXPENSIVE CAT PLAY TOY ON YOUR CREDIT CARD AND NEVER PAID IT OFF? WE DO TOO"
07:23 anomie There is no distributed url shortener, it seems.
07:23 anomie Maybe I'll make one myself, if I can motivate myself properly.
07:24 GLaDOS really? with all of the blockchains i would've assumed somebody would've made one..
07:26 anomie I know.
07:26 anomie I mean… the need for one seems obvious.
07:26 anomie Distributed social networks are infinitely more complicated than this, yet there are plenty of those.
07:31 GLaDOS i guess getting people to actually use it would be a bit of a hassle, if something needs to be installed
07:31 GLaDOS because the link would be useless without said program
07:33 bentpins The way I was picturing it, it's just a distributed keystore. Then you have site operators that do the 301 bit. That way if one goes down you can just choose another operator to prefix links with
07:36 GLaDOS that could also work
07:39 PurpleSym Without some sort of rewriting plugin the links to operator A would still be dead for most people.
07:40 bentpins True, but it would still put #urlteam out of a job
07:41 GLaDOS we could also store other shortener rewrites in the keystore..
07:41 GLaDOS if a site goes down, someone just has to snatch that domain up and point it at a server with the keystore..
07:41 GLaDOS (configuring it to act like it did before ofc)
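The keystore-plus-operators idea from 07:33 can be sketched roughly as below. This is only an illustration of the shape of the design, not anything the channel built: `KeyStore` is an in-memory stand-in for the distributed store, and `Operator`, the example hostnames, and the first-write-wins rule are all assumptions made up for the sketch.

```python
from typing import Dict, Optional, Tuple


class KeyStore:
    """Hypothetical stand-in for the shared, distributed keystore."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def put(self, code: str, url: str) -> None:
        # Assumption: first write wins, so an existing short link never changes.
        self._entries.setdefault(code, url)

    def get(self, code: str) -> Optional[str]:
        return self._entries.get(code)


class Operator:
    """Hypothetical site operator: owns a URL prefix and does 'the 301 bit'."""

    def __init__(self, prefix: str, store: KeyStore) -> None:
        self.prefix = prefix.rstrip("/")
        self.store = store

    def resolve(self, short_url: str) -> Tuple[int, Optional[str]]:
        # The short code is whatever follows the last slash.
        code = short_url.rsplit("/", 1)[-1]
        target = self.store.get(code)
        if target is None:
            return 404, None
        return 301, target


store = KeyStore()
store.put("abc", "http://example.com/a/very/long/path")
operator_a = Operator("http://a.example", store)
operator_b = Operator("http://b.example", store)

# If operator A disappears, the same code still resolves through operator B.
status, location = operator_b.resolve("http://b.example/abc")
```

Because every operator reads the same store, swapping the prefix is enough to revive a link, which is the property bentpins describes. PurpleSym's objection still applies: links already printed with operator A's prefix need a rewrite step or a takeover of A's domain.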
07:53 PurpleSym I’m diverting the topic a little here, but has anybody looked into backing up Yahoo! Groups before?
07:53 PurpleSym Like, all of it.
07:55 PurpleSym With 5.5 million groups and ~8000 messages per group on average that would be 42.5 billion messages to back up.
07:56 PurpleSym Which would take a single person 511 years and 477 TB of storage.
07:56 xmc sounds like a lot of error 999
07:57 PurpleSym Nah, that’s with appropriate rate-limits.
07:57 xmc yeah, we've looked into it before
07:57 xmc yahoo has a pretty aggressive ratelimiter, which returns 999 when it wants you to go away
07:58 PurpleSym I’ve seen that, but waiting 0.38 seconds between the requests usually gets around that limitation.
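The 511-year figure can be checked with a short back-of-envelope calculation. The sketch below assumes one request per message, one worker, and the 0.38-second delay mentioned at 07:58; those assumptions happen to reproduce the quoted number, but the log does not say how it was actually derived. Note that 5.5 million groups at exactly 8000 messages each would be 44 billion, so the quoted 42.5 billion implies an average nearer 7,700 per group.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def crawl_years(messages: float, delay_s: float, workers: int = 1) -> float:
    """Years needed if each message costs one request and one delay."""
    return messages * delay_s / workers / SECONDS_PER_YEAR


messages = 42.5e9        # total quoted in the log
delay_s = 0.38           # per-request delay quoted in the log
storage_bytes = 477e12   # 477 TB, read here as decimal terabytes (assumption)

years_single = crawl_years(messages, delay_s)          # a little under 512 years
bytes_per_message = storage_bytes / messages           # roughly 11 kB each
implied_average = messages / 5.5e6                     # messages per group

print(f"{years_single:.1f} years for a single worker")
print(f"{bytes_per_message:.0f} bytes per message")
print(f"{implied_average:.0f} messages per group implied")
```

Under the same assumptions, `crawl_years(messages, delay_s, workers=512)` comes out just under one year, which gives a feel for how many independent IPs or accounts a full scrape would need.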
07:59 xmc not meaning to discourage you, if you want to make a yahoo groups scrape happen then i'm all ears
07:59 xmc there's a lot of important shit in there
07:59 bentpins It's reachable over IPV6, surely if you have even a tiny block that would solve things
07:59 xmc unfortunately a lot of it is membership-restricted
07:59 xmc oh really
07:59 xmc hmm
07:59 PurpleSym No IPv6 on my end, unfortunately.
08:00 xmc luckily most reputable vps providers have it these days
08:00 PurpleSym And yes, half of the groups I discovered so far are members only.
08:05 GLaDOS capturing public groups only is better than doing nothing though..
08:06 PurpleSym So, the biggest problem I had so far is: How do I store the data?
08:06 PurpleSym I’m currently using a mongodb, because that’s the only thing that worked reliably so far.
08:07 GLaDOS PurpleSym: i'd just save as HTML and WARC them up
08:07 GLaDOS that way it's ingestable into the wayback
08:07 GLaDOS ..unless we stopped doing that for some reason
08:08 schbirid has joined #archiveteam-bs
08:08 PurpleSym But there’s a nice API with machine-readable data.
08:08 PurpleSym That’s what I’m scraping right now.
08:22 xmc is there a way you can dump out something that looks like an mbox file?
08:22 xmc i.e. an email message
08:23 PurpleSym Sure, that’s easy.
08:24 PurpleSym The API has the raw message.
08:24 PurpleSym (with email addresses censored)
08:24 xmc kool
08:25 xmc an mbox file per group per month would be a good start then
08:25 bentpins You still get usernames though right?
08:25 PurpleSym There’s just one problem with that: https://yahoo.uservoice.com/forums/209451-us-groups/suggestions/9644478-displaying-raw-messages-is-not-8-bit-clean
08:26 PurpleSym Yes, I think yahoo usernames are in there as well, bentpins
08:28 xmc PurpleSym: that sounds like an issue with the person who sent the email
08:28 PurpleSym Hm, but the HTML version is fine.
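xmc's "an mbox file per group per month" could look something like the sketch below, using only Python's standard `mailbox` and `email` modules. The directory layout, the `append_message` helper, and the `undated.mbox` fallback are assumptions for illustration; nothing here touches the Yahoo API, and the raw message is taken as bytes so that messages which are not 8-bit clean are not forced through an ASCII conversion.

```python
import mailbox
from email import message_from_bytes
from email.utils import parsedate_to_datetime
from pathlib import Path


def mbox_path(root: Path, group: str, raw_message: bytes) -> Path:
    """Pick <root>/<group>/<YYYY-MM>.mbox from the message's Date header."""
    msg = message_from_bytes(raw_message)
    try:
        sent = parsedate_to_datetime(msg["Date"])
        name = f"{sent:%Y-%m}.mbox"
    except (TypeError, ValueError):
        # Missing or unparseable Date header: keep the message anyway.
        name = "undated.mbox"
    return root / group / name


def append_message(root: Path, group: str, raw_message: bytes) -> Path:
    """Append one raw message to the right per-group, per-month mbox."""
    path = mbox_path(root, group, raw_message)
    path.parent.mkdir(parents=True, exist_ok=True)
    box = mailbox.mbox(str(path))  # created on first use
    try:
        box.add(message_from_bytes(raw_message))
        box.flush()
    finally:
        box.close()
    return path
```

Calling `append_message(Path("dump"), "somegroup", raw)` for every message fetched would leave one file per group per month, each readable by any mbox-aware mail client.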
08:43 arkiver2 has joined #archiveteam-bs
08:45 yipdw never expected to see "mongodb" and "worked reliably" in association
08:45 yipdw learn something new every day
08:47 HCross yipdw, http://howfuckedismydatabase.com/
08:47 yipdw i've seen that before yes
08:47 PurpleSym Well, I had everything in small files previously. The filesystem did not like that.
08:53 signius has quit IRC (Ping timeout: 306 seconds)
08:53 primus104 has joined #archiveteam-bs
09:06 signius has joined #archiveteam-bs
09:37 godane has quit IRC (Leaving.)
10:13 arkiver2 has quit IRC (Ping timeout: 252 seconds)
10:50 swebb has quit IRC (Read error: Operation timed out)
10:51 Laverne has quit IRC (Read error: Operation timed out)
10:51 lytv has quit IRC (Read error: Operation timed out)
10:52 chazchaz has quit IRC (Read error: Operation timed out)
10:53 Laverne has joined #archiveteam-bs
10:53 aschmitz has quit IRC (Read error: Operation timed out)
10:54 zenguy_pc has quit IRC (Read error: Operation timed out)
10:54 aschmitz has joined #archiveteam-bs
10:54 lytv has joined #archiveteam-bs
10:55 atlogbot has quit IRC (Ping timeout: 369 seconds)
10:58 zenguy_pc has joined #archiveteam-bs
11:00 Laverne has quit IRC (Ping timeout: 369 seconds)
11:03 Laverne has joined #archiveteam-bs
11:04 dashcloud has quit IRC (Read error: Operation timed out)
11:04 swebb has joined #archiveteam-bs
11:04 atlogbot has joined #archiveteam-bs
11:08 dashcloud has joined #archiveteam-bs
11:09 chazchaz has joined #archiveteam-bs
11:45 godane has joined #archiveteam-bs
11:49 Infreq has quit IRC (Read error: Operation timed out)
11:50 Infreq has joined #archiveteam-bs
11:56 robink has quit IRC (Ping timeout: 492 seconds)
11:56 cloudmons has quit IRC (Ping timeout: 492 seconds)
11:57 arkiver2 has joined #archiveteam-bs
12:23 arkiver2 has quit IRC (Ping timeout: 252 seconds)
12:48 cloudmons has joined #archiveteam-bs
12:48 robink has joined #archiveteam-bs
13:24 zenguy_pc has quit IRC (Read error: Connection reset by peer)
13:42 zenguy_pc has joined #archiveteam-bs
14:26 vitzli has joined #archiveteam-bs
14:42 primus104 has quit IRC (Leaving.)
15:25 chfoo has quit IRC (Read error: Operation timed out)
15:28 robink has quit IRC (Read error: Connection reset by peer)
15:28 chfoo has joined #archiveteam-bs
15:34 cloudmons has quit IRC (Ping timeout: 492 seconds)
15:35 primus104 has joined #archiveteam-bs
16:29 cloudmons has joined #archiveteam-bs
16:55 JesseW has joined #archiveteam-bs
17:15 JesseW has quit IRC (Read error: Operation timed out)
17:27 JesseW has joined #archiveteam-bs
17:32 robink has joined #archiveteam-bs
17:43 robink has quit IRC (Read error: Connection reset by peer)
17:44 robink has joined #archiveteam-bs
17:46 vitzli has quit IRC (Quit: Leaving)
17:46 dashcloud has quit IRC (Read error: Operation timed out)
17:50 dashcloud has joined #archiveteam-bs
18:00 arkiver2 has joined #archiveteam-bs
18:13 arkiver2 has quit IRC (Ping timeout: 252 seconds)
18:34 Aranje has quit IRC (Read error: Connection reset by peer)
18:48 schbirid2 has joined #archiveteam-bs
18:48 Aranje has joined #archiveteam-bs
18:52 schbirid has quit IRC (Ping timeout: 306 seconds)
18:54 arkiver2 has joined #archiveteam-bs
18:58 aaaaaaaaa has joined #archiveteam-bs
18:58 Aranje has quit IRC (Ping timeout: 483 seconds)
18:59 arkiver2 has quit IRC (Ping timeout: 252 seconds)
19:07 Aranje has joined #archiveteam-bs
19:09 JesseW has quit IRC (Read error: Operation timed out)
19:18 wyatt874- has joined #archiveteam-bs
19:18 wyatt8740 has quit IRC (Read error: Connection reset by peer)
20:18 JesseW has joined #archiveteam-bs
20:19 Mayonaise has quit IRC (Read error: Operation timed out)
20:22 Mayonaise has joined #archiveteam-bs
20:26 schbirid2 has quit IRC (Quit: Leaving)
20:48 PurpleSym has quit IRC (WeeChat 1.1.1)
20:53 wyatt874- is now known as wyatt8740
21:18 zenguy_pc has quit IRC (Ping timeout: 483 seconds)
21:26 zenguy_pc has joined #archiveteam-bs
21:34 JesseW has quit IRC (Read error: Operation timed out)
21:54 zenguy_pc has quit IRC (Ping timeout: 483 seconds)
22:01 RichardG has quit IRC (Remote host closed the connection)
22:02 RichardG has joined #archiveteam-bs
22:03 zenguy_pc has joined #archiveteam-bs
23:12 dashcloud has quit IRC (Read error: Operation timed out)
23:15 dashcloud has joined #archiveteam-bs
23:15 zenguy_pc has quit IRC (Ping timeout: 483 seconds)
23:20 zenguy_pc has joined #archiveteam-bs
23:40 zenguy_pc has quit IRC (Remote host closed the connection)
23:42 zenguy_pc has joined #archiveteam-bs
23:44 zenguy_pc has quit IRC (Read error: Connection reset by peer)