#archiveteam-bs 2014-06-01,Sun

Time Nickname Message
00:05 🔗 chfoo yipdw (or someone else): could you clean up projects.json since some projects are finished/on hiatus, and add terroroftinytown-client-grab to it? make sure to state that it's a work in progress.
00:26 🔗 Famicoman Anyone know what exactly you need to do to get IA to create a pdf, etc when you upload a zip of images?
00:26 🔗 Famicoman is it just xxxx_images.zip loaded with tiffs?
00:54 🔗 exmic yup
00:54 🔗 exmic tiff, jpeg, png, whatever sort of image
01:01 🔗 frogor Is there anything that anyone can do to help at the moment? Not sure where current progress is, available tools, things being worked on, etc. re: justin.tv
01:09 🔗 Famicoman go to #justouttv
01:09 🔗 Famicoman thanks exmic
01:10 🔗 frogor Yup. In there, dead silent.
01:10 🔗 Famicoman guess you're a pioneer
01:13 🔗 DFJustin gif images don't work fwiw
03:00 🔗 yipdw chfoo: yeah
03:00 🔗 yipdw chfoo: one sec
06:30 🔗 closure here's 400gb storage on a vps for $7/month. http://lowendbox.com/blog/xenpower-15-99quarter-1gb200gb-and-20-05quarter-2gb400gb-xen-vps-in-milan-italy/
06:30 🔗 closure I think OVH sometimes has better deals?
06:42 🔗 exmic damnit
06:42 🔗 exmic came to the coffeeshop to get work done
06:42 🔗 exmic wound up setting up an archivebot worker instead
06:44 🔗 voltagex hey, anyone here helping out with justin.tv?
06:44 🔗 closure yeah, I'm hoping to use above 400 gb for that
06:44 🔗 closure though it only has 3 tb/mon
06:45 🔗 voltagex closure: I'm trying to grab all channel pages, how are your curl skills?
06:48 🔗 voltagex I need someone to run ~30*2000 curls in parallel :P
06:51 🔗 exmic Iiii could probably swing that
06:52 🔗 voltagex I'm scraping with curl "http://www.justin.tv/search?q=a&only=archives&sort-by=count&only=users&page=[1-2974]" -o "#1.html" but I'm not getting real errors on failure so I need a hand
06:52 🔗 voltagex sorry for all the requests
06:52 🔗 voltagex just want to jump in and help
06:54 🔗 exmic no worries friend
06:56 🔗 closure so you want to run 30 wgets at a time, dividing up that url space?
06:57 🔗 voltagex just narrowing the search space now
06:57 🔗 voltagex trying to work out if the search is case sensitive
06:58 🔗 voltagex and I'd only grab those pages if it's going to be useful for someone
06:59 🔗 voltagex basically grabbing those will give you a list of all channels with archives ever
06:59 🔗 voltagex (in theory)
06:59 🔗 voltagex okay, so searches are NOT case sensitive
06:59 🔗 voltagex which is awesome
07:00 🔗 exmic great, a list of all channels with archives?
07:00 🔗 exmic that's handy
07:00 🔗 voltagex yes, but unparsed right now
07:00 🔗 voltagex trying and failing to do it one step at a time
07:00 🔗 exmic baby steps very quickly
07:00 🔗 exmic is how you get places
07:06 🔗 voltagex I have the number of pages for each letter/number searched... doesn't seem like enough total channels
07:06 🔗 exmic how many?
07:08 🔗 voltagex http://pastebin.com/cK1P3dhw
07:09 🔗 voltagex oh, nevermind
07:09 🔗 voltagex those are *pages* per channel
07:09 🔗 voltagex blah
07:09 🔗 voltagex try again
07:09 🔗 voltagex pages per search result.
07:09 🔗 exmic that file sums to 43,162
07:09 🔗 exmic so 43k pages of search result, ok
07:10 🔗 exmic how many per page appx?
07:10 🔗 voltagex 10 exactly
07:10 🔗 voltagex except for last page
07:10 🔗 voltagex so 10 average :P
07:10 🔗 exmic so slightly < 430k channels to look at
07:10 🔗 exmic that's reasonable
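The estimate above is pages times results per page. A minimal Python sketch of that arithmetic, assuming every search page holds 10 results except possibly the last page of each query (which holds at least one). `channel_bounds` and its parameters are hypothetical names for illustration, not anything used in the channel:

```python
def channel_bounds(pages_per_query, per_page=10):
    """Return (lower, upper) bounds on the number of search results.

    pages_per_query: page counts, one entry per search query (a, b, c, ...).
    Assumes every page is full except possibly each query's last page,
    which is assumed to hold at least one result.
    """
    total_pages = sum(pages_per_query)
    queries = sum(1 for pages in pages_per_query if pages > 0)
    upper = total_pages * per_page
    # Each query's last page may hold as few as 1 result instead of per_page.
    lower = upper - queries * (per_page - 1)
    return lower, upper


# Toy input: one query with 3 pages, another with 2 pages.
print(channel_bounds([3, 2]))
```

With the 43,162 pages summed from the pastebin, the upper bound is 43,162 x 10 = 431,620 results. These are counts of search results, not unique channels: the same channel can match several single-character queries.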
07:11 🔗 voltagex that was done by literally searching for a, b, c, d etc.
07:11 🔗 voltagex so I'm not sure how good that is
07:11 🔗 exmic ahhh
07:12 🔗 voltagex I couldn't find another way to do it
07:12 🔗 exmic probably going to be significant overlap then
07:12 🔗 voltagex ...that's good, right?
07:12 🔗 exmic yes, reduces the amount of things we have to look at
07:12 🔗 exmic once we grab all the search pages
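The overlap between single-character searches can be collapsed once channel names have been extracted from the search pages. A minimal Python sketch, assuming the names are already parsed into one list per query and that channel names compare case-insensitively (the log only confirms that the search is case-insensitive; treating names that way is an assumption). `merge_channels` is a hypothetical helper:

```python
def merge_channels(results_by_query):
    """Merge per-query channel lists into one de-duplicated list.

    results_by_query: dict mapping a search query ("a", "b", ...) to the
    list of channel names found for it. Names are compared in lowercase
    (an assumption), and the first spelling seen is kept.
    """
    seen = set()
    merged = []
    for query in sorted(results_by_query):
        for name in results_by_query[query]:
            key = name.lower()
            if key not in seen:
                seen.add(key)
                merged.append(name)
    return merged


# Toy data: "Bravo" matches both the "a" and the "b" search.
channels = merge_channels({"a": ["alpha", "Bravo"], "b": ["bravo", "bob"]})
print(len(channels), channels)
```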
07:12 🔗 exmic oh god now I have like 6 tabs playing video
07:13 🔗 exmic this is awful
07:13 🔗 exmic voltagex: we can move this to #archiveteam, this is ontopic
07:13 🔗 voltagex exmic: ah, just that this channel was awake
07:13 🔗 exmic yeah they're right next to each other but I spend more time looking at this one because it tends to have more talk
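The scrape discussed above (many page URLs, about 30 requests at a time, with failures reported instead of silently saved) could be sketched in Python roughly as follows. The URL pattern is copied from the curl command in the log; `page_urls`, `fetch`, `fetch_all`, and the `fetcher` parameter are hypothetical names, and treating an empty body as a failure is an assumption about what a bad response looks like, not something confirmed in the channel:

```python
from concurrent.futures import ThreadPoolExecutor
from http.client import HTTPException
from urllib.request import urlopen

# Query string as used in the curl command from the log.
SEARCH_URL = (
    "http://www.justin.tv/search?q={query}"
    "&only=archives&sort-by=count&only=users&page={page}"
)


def page_urls(query, last_page):
    """Build the search URLs for pages 1..last_page of one query."""
    return [SEARCH_URL.format(query=query, page=n)
            for n in range(1, last_page + 1)]


def fetch(url, timeout=30):
    """Return (url, body, error); error is None on success."""
    try:
        # urlopen raises HTTPError (an OSError subclass) on 4xx/5xx.
        with urlopen(url, timeout=timeout) as resp:
            body = resp.read()
    except (OSError, HTTPException) as exc:
        return url, None, str(exc)
    if not body:
        return url, None, "empty body"
    return url, body, None


def fetch_all(urls, workers=30, fetcher=fetch):
    """Fetch URLs with a thread pool; return (succeeded, failed) lists."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetcher, urls))
    ok = [(u, body) for u, body, err in results if err is None]
    failed = [(u, err) for u, body, err in results if err is not None]
    return ok, failed
```

Calling `fetch_all(page_urls("a", 2974))` would cover the page range from the curl command; the `failed` list can then be retried separately rather than being discovered later as broken HTML files.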
08:34 🔗 curi what's the -bs in the channel name mean?
08:35 🔗 exmic bullshit
08:36 🔗 exmic hey look, somebody is whining on the internet https://news.ycombinator.com/item?id=7828542
08:36 🔗 curi why are there two channels, one for bs?
08:36 🔗 curi i came here cuz of link on YC btw, just curious what's going on
08:37 🔗 exmic we try to separate signal from noise
08:37 🔗 curi i use the ~2 week archives on twitch a lot, seems awful to remove archiving entirely on jtv
08:37 🔗 ivan` I wonder what the real number is "If you do the stats you'll notice that over 99.99% of the content in archive.org is never accessed. Nobody cares."
08:37 🔗 curi like today 2 ppl were streaming at once so i'm watching the archive video of one of them after...
08:38 🔗 exmic upon initially reading that I assumed they meant "most of it isn't looked at, and that's ok"
08:39 🔗 curi > If you do the stats you'll notice that over 99.99% of the content in archive.org is never accessed. Nobody cares.
08:40 🔗 curi man this guy. i've looked up super obscure stuff on archive.org before
08:40 🔗 curi it's really nice
08:40 🔗 exmic indeed
08:40 🔗 exmic and we try to fill in the gaps between *those* things
08:43 🔗 curi he meant most of it isn't looked at, and popularity contests should rule archiving too not just the schoolyard and hollywood
08:52 🔗 DFJustin ivan`: your reply is well put
08:53 🔗 ivan` thanks
09:37 🔗 godane uploaded: https://archive.org/details/dvdrom-lki-72
09:43 🔗 voltagex https://gist.github.com/voltagex/6067ee19df87dac7072c
11:56 🔗 voltagex choo choo
11:56 🔗 voltagex all aboard the archive train
12:19 🔗 voltagex http://carina.whatbox.ca:12500/justin.tar.gz for useful HTML
13:42 🔗 voltagex thanks to everyone who helped me out today
18:26 🔗 yipdw so for any home cooks here, you should give the Beyond Meat stuff a try
18:26 🔗 yipdw I just tried the chicken out for a stir-fry and it's actually really, really good, if you don't burn it
18:26 🔗 yipdw (if you do it is very obvious that what you cooked is not what you remember)
18:27 🔗 midas im a proper home cook, i buy stuff, order food online and throw away the stuff i bought.
18:27 🔗 yipdw like, it starts to take on a texture less like chicken and more like fried tofu
18:30 🔗 schbirid yipdw: link?
18:30 🔗 midas http://beyondmeat.com/
18:31 🔗 ersi Ok?
18:33 🔗 yipdw ersi: hey, it's -bs, I figured why not
18:34 🔗 schbirid thanks!
18:34 🔗 ersi sure, I just didn't read your lines - so when I opened up that link I had no context :)
18:34 🔗 yipdw schbirid: the chicken does not behave like real chicken in one very important aspect, which is that there is very little fat in the strips
18:34 🔗 schbirid oh i thought it was some recipes, this is some product?
18:35 🔗 yipdw so you won't get the crackle, and you do lose the ability to use the fat to flavor
18:35 🔗 yipdw meaning that you'll probably want to compensate with additional oil/butter etc
18:35 🔗 yipdw but yeah
18:35 🔗 yipdw it's a product
18:35 🔗 schbirid ersi: be happy it wasnt http://beyondmeatspin.com/
18:35 🔗 schbirid ah ok :(
18:35 🔗 antomatic is meatspin like leekspin? :)
18:36 🔗 schbirid oooh
18:36 🔗 yipdw on the other hand, you don't have nearly as much cleanup to do if you for some reason don't have a splatter guard
18:36 🔗 schbirid do not visit meatspin if you have to ask
18:36 🔗 antomatic I am sure leekspin is better, then. :)
18:37 🔗 ersi :D
18:37 🔗 schbirid it was inspired by meatspin, i uhm prefer meatspin in a totally humorous way :P
18:53 🔗 godane uploaded: https://archive.org/details/dvdrom-lki-73
18:53 🔗 godane so all of lki dvds from 2007 are finally uploaded
23:25 🔗 nico i hate when i get disconnected
23:29 🔗 balrog justin.tv deserves to be hassled a lot about this.
23:29 🔗 balrog SketchCow: ^
