#archiveteam-bs 2016-05-17,Tue

Time Nickname Message
00:13 🔗 SketchCow So, the amount of support scripts I've written to ingest Hiphop Mixtapes cleanly and dependably from 18,000 torrent files has crossed a line into "I feel bad getting even slightly paid for this."
00:14 🔗 SketchCow Mostly, for those who want the post-mortem, it's because I was a little dumb and I used a second, easier source of mixtapes, and the second source KIND of sucks ass.
00:14 🔗 SketchCow It works when it works and does not when it doesn't.
00:14 🔗 SketchCow So I had to write something more resilient for source 1 (and I did)
00:15 🔗 SketchCow But now it has to go "1. Did I already add this. If so, delete. 2. Is it uploaded from the other place? Set aside to merge them. 3. Send off to the outbox to be uploaded."
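The three checks SketchCow lists could be sketched roughly as below. This is only an illustration of the decision order described in the message, not his actual scripts; the names `Action`, `triage`, `sort_batch`, and the two identifier sets are hypothetical.

```python
from enum import Enum


class Action(Enum):
    DELETE = "delete"   # step 1: already added, so drop the duplicate
    MERGE = "merge"     # step 2: uploaded from the other source, set aside
    UPLOAD = "upload"   # step 3: new item, send to the outbox


def triage(item_id, already_added, uploaded_elsewhere):
    """Decide what to do with one item, checking in the order described."""
    if item_id in already_added:
        return Action.DELETE
    if item_id in uploaded_elsewhere:
        return Action.MERGE
    return Action.UPLOAD


def sort_batch(item_ids, already_added, uploaded_elsewhere):
    """Group a batch of identifiers by the action each one needs."""
    buckets = {action: [] for action in Action}
    for item_id in item_ids:
        action = triage(item_id, already_added, uploaded_elsewhere)
        buckets[action].append(item_id)
    return buckets
```

Checking "already added" first means an item present in both sets is deleted rather than queued for merging, which matches the order given in the message.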
00:53 🔗 ndizzle has quit IRC (Read error: Connection reset by peer)
01:15 🔗 ndiddy has joined #archiveteam-bs
01:35 🔗 VADemon has joined #archiveteam-bs
02:03 🔗 atrocity has quit IRC (Read error: Operation timed out)
03:23 🔗 VADemon has quit IRC (Quit: left4dead)
03:32 🔗 JesseW has joined #archiveteam-bs
04:36 🔗 bsmith093 has quit IRC (Ping timeout: 244 seconds)
04:40 🔗 Sk1d has quit IRC (Ping timeout: 194 seconds)
04:46 🔗 BlueMaxim has joined #archiveteam-bs
04:46 🔗 Sk1d has joined #archiveteam-bs
04:52 🔗 BlueMaxim has quit IRC (Quit: Leaving)
04:55 🔗 BlueMaxim has joined #archiveteam-bs
05:17 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
05:25 🔗 JesseW has joined #archiveteam-bs
05:40 🔗 bsmith093 has joined #archiveteam-bs
05:42 🔗 Honno has joined #archiveteam-bs
06:12 🔗 vitzli has joined #archiveteam-bs
06:42 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
07:20 🔗 metalcamp has joined #archiveteam-bs
08:01 🔗 schbirid has joined #archiveteam-bs
08:08 🔗 ndiddy has quit IRC (Read error: Operation timed out)
08:47 🔗 bsmith093 has quit IRC (Ping timeout: 499 seconds)
08:59 🔗 bsmith093 has joined #archiveteam-bs
09:08 🔗 dashcloud has quit IRC (Read error: Operation timed out)
09:11 🔗 dashcloud has joined #archiveteam-bs
09:47 🔗 bwn has quit IRC (Read error: Operation timed out)
09:56 🔗 marvinw has quit IRC (Quit: Leaving)
10:22 🔗 marvinw has joined #archiveteam-bs
10:43 🔗 marvinw has quit IRC (Quit: Leaving)
10:46 🔗 marvinw has joined #archiveteam-bs
11:44 🔗 bwn has joined #archiveteam-bs
12:42 🔗 toad2 has quit IRC (Read error: Operation timed out)
12:43 🔗 toad1 has joined #archiveteam-bs
12:44 🔗 BlueMaxim has quit IRC (Quit: Leaving)
12:49 🔗 HCross "Stop running local news index web pages, offering instead an open stream on the rolling Local Live service" fucking hell. What the heck BBC
12:54 🔗 Frogging Wha
12:55 🔗 Frogging What does that mean
12:57 🔗 midas it means that bbc said, well no you guys dont need your own website, we can do that for you and make sure WE decide what kind of news you see and delete it after X days
12:57 🔗 midas because storage.
12:57 🔗 Frogging I see
13:15 🔗 HCross we should start a lot of BBC crawls
13:22 🔗 atrocity has joined #archiveteam-bs
13:27 🔗 dashcloud has quit IRC (Read error: Operation timed out)
13:30 🔗 dashcloud has joined #archiveteam-bs
14:44 🔗 GLaDOS just have one long continuous BBC crawl
14:47 🔗 VADemon has joined #archiveteam-bs
14:47 🔗 Rotab https://www.backblaze.com/blog/hard-drive-reliability-stats-q1-2016/
15:09 🔗 JesseW has joined #archiveteam-bs
15:33 🔗 Medowar Sorry to say, but these Backblaze drive stats are nice to have but produce very little actual information, since their approach is using consumer-grade drives for datacenter operation. These drives were never developed for running 24/7 and are not designed to run at the temperatures and vibrations that occur in that specific usage.
15:35 🔗 dashcloud has quit IRC (Read error: Operation timed out)
15:36 🔗 Medowar As an ex-datacenter op, these rates are way higher than what we observed. We had many WDs with an annual failure rate of <0.5%, and we had way more load on them than Backblaze (write once, read once in a while)
15:37 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
15:39 🔗 dashcloud has joined #archiveteam-bs
16:24 🔗 schbirid has quit IRC (Ping timeout: 258 seconds)
16:50 🔗 phillipsj SketchCow, not every experiment pans out.
17:01 🔗 Kazzy Medowar: were you running enterprise drives?
17:14 🔗 yipdw oh interesting http://archive.org/download/liveweb-20160509032438
17:14 🔗 yipdw er wait never mind
17:15 🔗 yipdw that's been listable for a long time, I thought the download status changed
17:16 🔗 anomie_ has quit IRC (Read error: Operation timed out)
17:24 🔗 vitzli will, "I'm not Edge, I promise!"
17:24 🔗 will haha
17:24 🔗 will re: BBC recipes, can't simply iterate over the IDs
17:30 🔗 phillipsj I sent the website an e-mail. They at least list that on their contact page.
17:31 🔗 yipdw user-agents form so little of an HTTP request that at this point you might as well use it as a place to have some fun and acknowledge lineage
17:31 🔗 phillipsj The verboten string was "Lynx/2.8.8dev.12 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/2.12.18"
17:33 🔗 * phillipsj tries the Edge string
17:35 🔗 phillipsj ...it worked! I made sure to accept the cookie so they know I changed user-agent strings.
17:39 🔗 yipdw www.amazon.com/Avoid-Huge-Ships-John-Trimmer/dp/0870334336/ is incredible on several levels
17:39 🔗 yipdw I don't know if "machine learning" is an artistic medium but oh my god it should be
17:46 🔗 JW_work is that page archived? (I presume, but good to check). The reviews are delightful.
17:47 🔗 GLaDOS it is now
17:47 🔗 GLaDOS hopefully
17:48 🔗 JW_work has quit IRC (Quit: Leaving.)
17:49 🔗 phillipsj "There is one major oversight in this generally well-written book, and that is that it addresses animate readers exclusively. As a large rock in the Tyrrhenian Sea off the coast of Giglio Island, I have recently been confronted with instances in which avoiding huge ships was of fundamental interest to my personal well-being. However, the methods presented in Capt. Trimmer's book were none too useful in my efforts to avoid huge
17:49 🔗 phillipsj ships, as I was recently struck by a very large ship indeed, a cruise vessel called the 'Costa Concordia'. ..."
17:50 🔗 yipdw heh yeah
17:58 🔗 vitzli has quit IRC (Quit: Leaving)
18:05 🔗 JW_work has joined #archiveteam-bs
18:06 🔗 phillipsj The problem with recommendations as an art-form is that they are personalized. No idea why I get "Amazon.com Gift Card in a Greeting Card (Various Designs), Images You Should Not Masturbate To, Hutzler 571 Banana Slicer, Amazon.com Gift Card in Gift Box Reveal (Classic Black Card Design)"
18:07 🔗 yipdw it's not too bad; there are many media that are individualized to the viewer
18:08 🔗 yipdw there is also the opportunity to establish a single point of reference with a private browsing session at a known location etc
18:08 🔗 yipdw FWIW, I get the same recommendations
18:08 🔗 phillipsj I used lynx with no cookies set.
18:08 🔗 yipdw I guess this is an outgrowth of those multimedia installations
18:29 🔗 phillipsj User-agent strings should not be used for detection anyway. Browsers advertise what they can accept. The User-Agent string is for troubleshooting if a client is acting weird.
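As an illustration of phillipsj's point, a server can choose a response format from what the client says it accepts (the HTTP `Accept` header) without looking at `User-Agent` at all. The sketch below is a simplified take on that idea: the function names `parse_accept` and `pick_format` are hypothetical, and it ignores the finer precedence rules of real content negotiation.

```python
def parse_accept(header):
    """Return media types from an Accept header, most preferred first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        entries.append((quality, media_type))
    # list.sort is stable, so equal weights keep their header order
    entries.sort(key=lambda entry: -entry[0])
    return [media_type for quality, media_type in entries if quality > 0]


def pick_format(accept_header, available):
    """Pick the first available media type the client says it accepts."""
    for media_type in parse_accept(accept_header):
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "image/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
    return None
```

With this approach a Lynx client and an Edge client sending the same `Accept` header get the same response, so there is no "verboten string" to trip over.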
19:05 🔗 atrocity http://www.bbc.com/news/uk-36308976
19:05 🔗 atrocity not sure if anybody saw that yet
19:22 🔗 closure has joined #archiveteam-bs
19:49 🔗 Medowar Kazzy: Yes, everything from blue, red, black and gold. Depending on the use case
19:57 🔗 ndiddy has joined #archiveteam-bs
20:27 🔗 w0rp I'd definitely like to see a BBC Food backup.
20:27 🔗 w0rp The archive might be potentially tasty.
20:42 🔗 ItsYoda has joined #archiveteam-bs
20:47 🔗 Stiletto has quit IRC ()
21:00 🔗 Honno has quit IRC (Read error: Operation timed out)
21:05 🔗 yipdw every day an adventure with python https://gitlab.peach-bun.com/snippets/13
21:12 🔗 joepie91 Medowar: Backblaze's use of consumer drives is intentional
21:12 🔗 joepie91 they produce a lot of information, just about a different class of drives than you are using
21:13 🔗 joepie91 specifically, their reason for using consumer drives is that the TCO is lower - higher failure rate but lower drive cost and higher availability of fresh drives
21:13 🔗 joepie91 so they build a fault-tolerant infrastructure instead
21:14 🔗 joepie91 and just replace much cheaper drives a little more often
21:14 🔗 joepie91 (this is documented in one of their earlier blog posts, iirc)
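joepie91's TCO argument can be made concrete with a rough cost model. The sketch below is an assumption-laden simplification, not Backblaze's own accounting: the function name, its parameters, and the idea of charging each expected failure as one replacement drive plus a flat labor cost are all hypothetical.

```python
def annual_cost_per_drive(price, service_years, annual_failure_rate, swap_labor=0.0):
    """Rough expected yearly cost of keeping one drive slot filled.

    price: purchase price of one drive
    service_years: years a surviving drive is kept in service
    annual_failure_rate: expected failures per drive-year (0.05 means 5%)
    swap_labor: cost of physically replacing one failed drive
    """
    amortized_purchase = price / service_years
    # Each expected failure costs a replacement drive plus the swap itself.
    expected_replacement = annual_failure_rate * (price + swap_labor)
    return amortized_purchase + expected_replacement
```

Under this model a cheaper drive with a higher failure rate can still come out ahead, as long as the extra expected replacements cost less than the price difference spread over the service life. It leaves out warranty returns, rebuild traffic, and the fault-tolerant infrastructure that makes a failed drive harmless in the first place.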
21:14 🔗 yipdw that's my reason for using consumer drives
21:14 🔗 yipdw also so I can occasionally act indignant over getting a DOA drive in a batch of four
21:14 🔗 yipdw maybe that's just newegg
21:16 🔗 Medowar joepie91: I know, still consumer drives aren't made for the physical tasks of datacenter operation and running 24/7.
21:17 🔗 joepie91 Medowar: I'm sure. does it matter?
21:17 🔗 Medowar So Backblaze is not actually producing any valuable information.
21:17 🔗 joepie91 ... yes, they are.
21:17 🔗 joepie91 they just aren't producing the specific information that -you- want.
21:17 🔗 joepie91 their data does not become any less correct or valuable.
21:18 🔗 Medowar then for who exactly is this valuable?
21:18 🔗 joepie91 for people who want to know the reliability of consumer drives in high-stress environments like datacenters...?
21:22 🔗 Medowar These use cases are still unrealistic. Normal operation of a consumer-grade drive involves many spin-ups and spin-downs, and Backblaze has basically none of them. So they are just testing a subset of hard drive life, and not the portion that produces the most stress on drives.
21:23 🔗 godane so i lost my cat to a coyote
21:23 🔗 godane here is a picture of him: https://scontent-lga3-1.xx.fbcdn.net/t31.0-8/q81/s960x960/13217022_10204588349268713_1457902763586552308_o.jpg
21:24 🔗 dashcloud has quit IRC (Read error: Operation timed out)
21:24 🔗 godane we called him Romeo
21:25 🔗 luckcolor godane: poor kitty
21:25 🔗 Frogging Medowar: The data is useless to you then, but that doesn't mean it's useless to everyone
21:26 🔗 JW_work godane: sorry to hear that. https://archive.is/20160517212505/https://scontent-lga3-1.xx.fbcdn.net/t31.0-8/q81/s960x960/13217022_10204588349268713_1457902763586552308_o.jpg
21:26 🔗 Frogging godane: aw :/
21:27 🔗 HCross **man hugs** godane - we're here for you
21:27 🔗 dashcloud has joined #archiveteam-bs
21:28 🔗 luckcolor yeah :)
21:29 🔗 godane based on what we can tell it was quick
21:30 🔗 luckcolor how old was he?
21:30 🔗 xmc godane: :'(
21:30 🔗 JW_work also copied to http://imgur.com/toKufwb
21:31 🔗 godane 15 years old
21:31 🔗 godane https://scontent-lga3-1.xx.fbcdn.net/v/t1.0-9/13263951_10204588442711049_8140892414389226821_n.jpg?oh=8a8ce4dd363c59e012a47bb14340077f&oe=579F4A16
21:31 🔗 godane another picture
21:31 🔗 JW_work awww :-/
21:36 🔗 luckcolor that's sooo cute
21:40 🔗 Stiletto has joined #archiveteam-bs
21:51 🔗 midas :< godane that makes me sad
22:33 🔗 bwn has quit IRC (Read error: Operation timed out)
22:34 🔗 metalcamp has quit IRC (Ping timeout: 244 seconds)
22:43 🔗 bwn has joined #archiveteam-bs
22:48 🔗 tomwsmf-a has joined #archiveteam-bs
22:59 🔗 w0rp has quit IRC (Read error: Operation timed out)
23:00 🔗 w0rp has joined #archiveteam-bs
23:21 🔗 JW_work https://github.com/lintool/warcbase and https://github.com/helgeho/ArchiveSpark should be added to http://archiveteam.org/index.php?title=The_WARC_Ecosystem
23:23 🔗 BlueMaxim has joined #archiveteam-bs
23:55 🔗 JordanJ2 has joined #archiveteam-bs
