Time |
Nickname |
Message |
00:05
🔗
|
chfoo |
yipdw (or someone else): could you clean up projects.json since they finished/on hiatus and add terroroftinytown-client-grab to it? make sure to state that it's a work in progress. |
00:26
🔗
|
Famicoman |
Anyone know what exactly you need to do to get IA to create a pdf, etc when you upload a zip of images? |
00:26
🔗
|
Famicoman |
is it just xxxx_images.zip loaded with tiffs? |
00:54
🔗
|
exmic |
yup |
00:54
🔗
|
exmic |
tiff, jpeg, png, whatever sort of image |
01:01
🔗
|
frogor |
Is there anything that anyone can do to help at the moment? Not sure where current progress is, available tools, things being worked on, etc. re: justin.tv |
01:09
🔗
|
Famicoman |
go to #justouttv |
01:09
🔗
|
Famicoman |
thanks exmic |
01:10
🔗
|
frogor |
Yup. In there, dead silent. |
01:10
🔗
|
Famicoman |
guess you're a pioneer |
01:13
🔗
|
DFJustin |
gif images don't work fwiw |
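A minimal sketch of the packaging step discussed above, assuming only what was said in the channel: the archive is named `xxxx_images.zip`, TIFF/JPEG/PNG pages are accepted, and GIF is not. The function name, the extension list, and the sorted page order are illustrative assumptions, not documented Internet Archive behaviour.

```python
import zipfile
from pathlib import Path

# Extensions taken from the discussion (tiff, jpeg, png); gif is left out
# because it was reported not to work. The exact list is an assumption.
ACCEPTED = {".tif", ".tiff", ".jpg", ".jpeg", ".png"}


def build_images_zip(image_dir, identifier, out_dir):
    """Pack page images into <identifier>_images.zip and return its path."""
    image_dir = Path(image_dir)
    zip_path = Path(out_dir) / f"{identifier}_images.zip"
    # Sort by filename so pages are stored in a predictable order.
    pages = sorted(
        p for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in ACCEPTED
    )
    # ZIP_STORED: the images are already compressed, so skip recompression.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as zf:
        for page in pages:
            zf.write(page, arcname=page.name)
    return zip_path
```

Calling `build_images_zip("scans", "xxxx", ".")` would write `xxxx_images.zip` in the current directory; the directory name and identifier here are placeholders.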
03:00
🔗
|
yipdw |
chfoo: yeah |
03:00
🔗
|
yipdw |
chfoo: one sec |
06:30
🔗
|
closure |
here's 400gb storage on a vps for $7/month. http://lowendbox.com/blog/xenpower-15-99quarter-1gb200gb-and-20-05quarter-2gb400gb-xen-vps-in-milan-italy/ |
06:30
🔗
|
closure |
I think OVH sometimes has better deals? |
06:42
🔗
|
exmic |
damnit |
06:42
🔗
|
exmic |
came to the coffeeshop to get work done |
06:42
🔗
|
exmic |
wound up setting up an archivebot worker instead |
06:44
🔗
|
voltagex |
hey, anyone here helping out with justin.tv? |
06:44
🔗
|
closure |
yeah, I'm hoping to use the above 400 GB for that |
06:44
🔗
|
closure |
though it only has 3 TB/month |
06:45
🔗
|
voltagex |
closure: I'm trying to grab all channel pages, how are your curl skills? |
06:48
🔗
|
voltagex |
I need someone to run ~30*2000 curls in parallel :P |
06:51
🔗
|
exmic |
Iiii could probably swing that |
06:52
🔗
|
voltagex |
I'm scraping with curl "http://www.justin.tv/search?q=a&only=archives&sort-by=count&only=users&page=[1-2974]" -o "#1.html" but I'm not getting real errors on failure so I need a hand |
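One possible reason for not seeing "real errors": by default curl exits with status 0 even when the server answers with an HTTP error page. The sketch below builds one curl command per page with `--fail`, so HTTP responses of 400 and above produce a non-zero exit code. It assumes the failures are HTTP-level errors; if the site returns an error page with status 200, this will not catch it. The helper names are hypothetical.

```python
import subprocess

# URL pattern copied from the command above, with the page number templated.
SEARCH_URL = (
    "http://www.justin.tv/search"
    "?q=a&only=archives&sort-by=count&only=users&page={page}"
)


def curl_command(page):
    """Return the argv list for fetching one search page into <page>.html."""
    return [
        "curl",
        "--fail",        # exit non-zero on HTTP errors (status >= 400)
        "--silent",      # no progress meter
        "--show-error",  # still print the error message on failure
        "--retry", "3",  # retry transient failures a few times
        "-o", f"{page}.html",
        SEARCH_URL.format(page=page),
    ]


def fetch_pages(pages):
    """Run curl for each page and return the pages whose fetch failed."""
    failed = []
    for page in pages:
        result = subprocess.run(curl_command(page))
        if result.returncode != 0:
            failed.append(page)
    return failed
```

`fetch_pages(range(1, 2975))` would cover the same 1-2974 range and return the page numbers worth retrying.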
06:52
🔗
|
voltagex |
sorry for all the requests |
06:52
🔗
|
voltagex |
just want to jump in and help |
06:54
🔗
|
exmic |
no worries friend |
06:56
🔗
|
closure |
so you want to run 30 wgets at a time, dividing up that url space? |
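Dividing the URL space as suggested here could look like the sketch below: it splits pages 1-2974 into 30 contiguous ranges of near-equal size, one per worker. The function names are hypothetical, and the bracketed output only mirrors the `[1-2974]` range syntax used in the curl command earlier.

```python
def split_pages(first, last, workers):
    """Split the inclusive range first..last into contiguous chunks."""
    total = last - first + 1
    base, extra = divmod(total, workers)
    ranges = []
    start = first
    for i in range(workers):
        # The first `extra` workers take one additional page each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more workers than pages
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def as_curl_glob(page_range):
    """Format a (start, end) pair in curl's numeric range syntax."""
    start, end = page_range
    return f"[{start}-{end}]"


if __name__ == "__main__":
    for chunk in split_pages(1, 2974, 30):
        print(as_curl_glob(chunk))
```

With 2974 pages and 30 workers, four workers get 100 pages and the other 26 get 99.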
06:57
🔗
|
voltagex |
just narrowing the search space now |
06:57
🔗
|
voltagex |
trying to work out if the search is case sensitive |
06:58
🔗
|
voltagex |
and I'd only grab those pages if it's going to be useful for someone |
06:59
🔗
|
voltagex |
basically grabbing those will give you a list of all channels with archives ever |
06:59
🔗
|
voltagex |
(in theory) |
06:59
🔗
|
voltagex |
okay, so searches are NOT case sensitive |
06:59
🔗
|
voltagex |
which is awesome |
07:00
🔗
|
exmic |
great, a list of all channels with archives? |
07:00
🔗
|
exmic |
that's handy |
07:00
🔗
|
voltagex |
yes, but unparsed right now |
07:00
🔗
|
voltagex |
trying and failing to do it one step at a time |
07:00
🔗
|
exmic |
baby steps very quickly |
07:00
🔗
|
exmic |
is how you get places |
07:06
🔗
|
voltagex |
I have the number of pages for each letter/number searched... doesn't seem like enough total channels |
07:06
🔗
|
exmic |
how many? |
07:08
🔗
|
voltagex |
http://pastebin.com/cK1P3dhw |
07:09
🔗
|
voltagex |
oh, nevermind |
07:09
🔗
|
voltagex |
those are *pages* per channel |
07:09
🔗
|
voltagex |
blah |
07:09
🔗
|
voltagex |
try again |
07:09
🔗
|
voltagex |
pages per search result. |
07:09
🔗
|
exmic |
that file sums to 43,162 |
07:09
🔗
|
exmic |
so 43k pages of search result, ok |
07:10
🔗
|
exmic |
how many per page appx? |
07:10
🔗
|
voltagex |
10 exactly |
07:10
🔗
|
voltagex |
except for last page |
07:10
🔗
|
voltagex |
so 10 average :P |
07:10
🔗
|
exmic |
so slightly < 430k channels to look at |
07:10
🔗
|
exmic |
that's reasonable |
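The estimate above can be sketched as a pair of bounds on the number of search results. 43,162 pages at 10 results per page gives at most 431,620 results; the last page of each query may be short, which lowers the floor slightly. The query count of 36 (a-z plus 0-9) is an assumption, since the log only says "each letter/number", and these are counts of listed results rather than unique channels.

```python
def result_bounds(total_pages, per_page, queries):
    """Return (lower, upper) bounds on the number of listed results.

    Assumes every query returned at least one page and that only the
    final page of each query can hold fewer than `per_page` results.
    """
    upper = total_pages * per_page
    # Worst case: each query's last page holds a single result.
    lower = (total_pages - queries) * per_page + queries
    return lower, upper


if __name__ == "__main__":
    low, high = result_bounds(total_pages=43162, per_page=10, queries=36)
    print(f"between {low} and {high} listed results")
```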
07:11
🔗
|
voltagex |
that was done by literally searching for a, b, c, d etc. |
07:11
🔗
|
voltagex |
so I'm not sure how good that is |
07:11
🔗
|
exmic |
ahhh |
07:12
🔗
|
voltagex |
I couldn't find another way to do it |
07:12
🔗
|
exmic |
probably going to be significant overlap then |
07:12
🔗
|
voltagex |
...that's good, right? |
07:12
🔗
|
exmic |
yes, reduces the amount of things we have to look at |
07:12
🔗
|
exmic |
once we grab all the search pages |
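The overlap mentioned here means a channel whose name contains several of the searched characters shows up under several queries, so the unique list should be shorter than the raw result count. A minimal sketch of the merge step follows, with made-up channel names; lowercasing the names is an assumption based on the searches being case-insensitive.

```python
def unique_channels(results_by_query):
    """Merge per-query result lists into one sorted, de-duplicated list."""
    seen = set()
    for names in results_by_query.values():
        # Assumption: channel names differing only in case are the same channel.
        seen.update(name.lower() for name in names)
    return sorted(seen)


if __name__ == "__main__":
    # Hypothetical sample data, not real channels.
    sample = {
        "a": ["alpha", "Bravo", "charlie"],
        "b": ["bravo", "bob"],
        "c": ["charlie"],
    }
    merged = unique_channels(sample)
    print(len(merged), merged)
```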
07:12
🔗
|
exmic |
oh god now I have like 6 tabs playing video |
07:13
🔗
|
exmic |
this is awful |
07:13
🔗
|
exmic |
voltagex: we can move this to #archiveteam, this is ontopic |
07:13
🔗
|
voltagex |
exmic: ah, just that this channel was awake |
07:13
🔗
|
exmic |
yeah they're right next to each other but I spend more time looking at this one because it tends to have more talk |
08:34
🔗
|
curi |
what's the -bs in the channel name mean? |
08:35
🔗
|
exmic |
bullshit |
08:36
🔗
|
exmic |
hey look, somebody is whining on the internet https://news.ycombinator.com/item?id=7828542 |
08:36
🔗
|
curi |
why are there two channels, one for bs? |
08:36
🔗
|
curi |
i came here cuz of link on YC btw, just curious what's going on |
08:37
🔗
|
exmic |
we try to separate signal from noise |
08:37
🔗
|
curi |
i use the ~2 week archives on twitch a lot, seems awful to remove archiving entirely on jtv |
08:37
🔗
|
ivan` |
I wonder what the real number is "If you do the stats you'll notice that over 99.99% of the content in archive.org is never accessed. Nobody cares." |
08:37
🔗
|
curi |
like today 2 ppl were streaming at once so i'm watching the archive video of one of them after... |
08:38
🔗
|
exmic |
upon initially reading that I assumed they meant "most of it isn't looked at, and that's ok" |
08:39
🔗
|
curi |
> If you do the stats you'll notice that over 99.99% of the content in archive.org is never accessed. Nobody cares. |
08:40
🔗
|
curi |
man this guy. i've looked up super obscure stuff on archive.org before |
08:40
🔗
|
curi |
it's really nice |
08:40
🔗
|
exmic |
indeed |
08:40
🔗
|
exmic |
and we try to fill in the gaps between *those* things |
08:43
🔗
|
curi |
he meant most of it isn't looked at, and popularity contests should rule archiving too not just the schoolyard and hollywood |
08:52
🔗
|
DFJustin |
ivan`: your reply is well put |
08:53
🔗
|
ivan` |
thanks |
09:37
🔗
|
godane |
uploaded: https://archive.org/details/dvdrom-lki-72 |
09:43
🔗
|
voltagex |
https://gist.github.com/voltagex/6067ee19df87dac7072c |
11:56
🔗
|
voltagex |
choo choo |
11:56
🔗
|
voltagex |
all aboard the archive train |
12:19
🔗
|
voltagex |
http://carina.whatbox.ca:12500/justin.tar.gz for useful HTML |
13:42
🔗
|
voltagex |
thanks to everyone who helped me out today |
18:26
🔗
|
yipdw |
so for any home cooks here, you should give the Beyond Meat stuff a try |
18:26
🔗
|
yipdw |
I just tried the chicken out for a stir-fry and it's actually really, really good, if you don't burn it |
18:26
🔗
|
yipdw |
(if you do it is very obvious that what you cooked is not what you remember) |
18:27
🔗
|
midas |
im a proper home cook, i buy stuff, order food online and throw away the stuff i bought. |
18:27
🔗
|
yipdw |
like, it starts to take on a texture less like chicken and more like fried tofu |
18:30
🔗
|
schbirid |
yipdw: link? |
18:30
🔗
|
midas |
http://beyondmeat.com/ |
18:31
🔗
|
ersi |
Ok? |
18:33
🔗
|
yipdw |
ersi: hey, it's -bs, I figured why not |
18:34
🔗
|
schbirid |
thanks! |
18:34
🔗
|
ersi |
sure, I just didn't read your lines - so when I opened up that link I had no context :) |
18:34
🔗
|
yipdw |
schbirid: the chicken does not behave like real chicken in one very important aspect, which is that there is very little fat in the strips |
18:34
🔗
|
schbirid |
oh i thought it was some recipes, this is some product? |
18:35
🔗
|
yipdw |
so you won't get the crackle, and you do lose the ability to use the fat to flavor |
18:35
🔗
|
yipdw |
meaning that you'll probably want to compensate with additional oil/butter etc |
18:35
🔗
|
yipdw |
but yeah |
18:35
🔗
|
yipdw |
it's a product |
18:35
🔗
|
schbirid |
ersi: be happy it wasnt http://beyondmeatspin.com/ |
18:35
🔗
|
schbirid |
ah ok :( |
18:35
🔗
|
antomatic |
is meatspin like leekspin? :) |
18:36
🔗
|
schbirid |
oooh |
18:36
🔗
|
yipdw |
on the other hand, you don't have nearly as much cleanup to do if you for some reason don't have a splatter guard |
18:36
🔗
|
schbirid |
do not visit meatspin if you have to ask |
18:36
🔗
|
antomatic |
I am sure leekspin is better, then. :) |
18:37
🔗
|
ersi |
:D |
18:37
🔗
|
schbirid |
it was inspired by meatspin, i uhm prefer meatspin in a totally humorous way :P |
18:53
🔗
|
godane |
uploaded: https://archive.org/details/dvdrom-lki-73 |
18:53
🔗
|
godane |
so all of lki dvds from 2007 are finally uploaded |
23:25
🔗
|
nico |
i hate when i get disconnected |
23:29
🔗
|
balrog |
justin.tv deserves to be hassled a lot about this. |
23:29
🔗
|
balrog |
SketchCow: ^ |