Time |
Nickname |
Message |
00:15
|
lemonkey |
ah ok |
00:57
|
Warchell |
Good day. Can I crawl a WARC archive served via warc-proxy with httrack? I'm having trouble forming a scan URL. |
07:53
|
spiralofh |
I'm trying to sign up to the wiki. What's the secret word? |
07:56
|
Nemo_bis |
spiralofh: yahoosucks |
07:58
|
spiralofh |
Nemo_bis: Indeed it does. Signup complete, thanks. |
07:59
|
Nemo_bis |
yw |
09:49
|
REiN^ |
WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD |
09:51
|
Nemo_bis |
yahoosucks |
09:54
|
REiN^ |
hi how i can receive a secret word to complete reg? |
09:55
|
Nemo_bis |
REiN^: you just did, use it |
09:59
|
REiN^ |
thx |
18:33
|
joepie91 |
still on IRC break |
18:33
|
joepie91 |
but briefly popping in here |
18:33
|
joepie91 |
Justin.tv is baleeting all archived broadcasts in a week (!) |
18:34
|
joepie91 |
http://techcrunch.com/2014/06/01/justin-tv-to-kill-off-its-built-in-video-archiving-system/ |
18:34
|
joepie91 |
according to friend, youtube-dl has code for downloading said broadcasts |
18:34
|
joepie91 |
but it's quite a lot |
18:34
|
exmic |
joepie91: yep, we're on it |
18:34
|
exmic |
:) |
18:37
|
oink |
WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD |
18:37
|
antomatic |
oink: yahoosucks |
18:37
|
ersi |
;-) |
18:37
|
oink |
ty |
18:38
|
joepie91 |
exmic: awesome |
18:38
|
joepie91 |
:) |
18:39
|
antomatic |
Gents... permission to tantrum? |
18:40
|
antomatic |
Thanks. Let me get right to it.. |
18:40
|
antomatic |
There are so many people turning up wanting to help, and finding that there's nothing to do. |
18:40
|
antomatic |
Even before Justin there was a steady stream of people saying "why does nothing in the warrior work" |
18:41
|
joepie91 |
oh, yes, about that, I have some good news |
18:41
|
antomatic |
Even 6 days away from the Justin deletions, all the work is pretty much happening behind closed doors. |
18:41
|
antomatic |
People want to help and we're giving them nothing to do. |
18:41
|
* |
antomatic listens. |
18:41
|
joepie91 |
I've quit my job, so assuming my fundraiser turns out well, I'd have a year long where I can help on archiveteam stuff where necessary |
18:41
|
joepie91 |
without stupid time constraints |
18:41
|
joepie91 |
dev-wise |
18:41
|
joepie91 |
:) |
18:41
|
exmic |
antomatic: sure, join #justouttv and help write the warrior job |
18:42
|
exmic |
but you're already there, ok |
18:42
|
antomatic |
I wish I could write warrior code but alas, my kungfu is not quite that good. |
18:42
|
exmic |
yes, archiveteam doesn't feel quite as gangbusters as it did last year |
18:42
|
antomatic |
I appreciate it's rich to bitch about something that I can't actually execute myself, |
18:42
|
exmic |
heh, that's for sure |
18:42
|
antomatic |
but I think the observation is hopefully at least partially valid |
18:42
|
joepie91 |
antomatic: I do this on a frequent basis, it still helps a lot |
18:43
|
joepie91 |
anyway, I'll probably have the time soon to really learn the specifics of warrior things |
18:43
|
joepie91 |
and write stuff for archiveteam |
18:43
|
joepie91 |
:D |
18:43
|
* |
joepie91 skirts the -bs line |
18:43
|
antomatic |
Hail Joepie, our saviour. :) |
18:44
|
joepie91 |
relatedly on-topic, my pastebin scraper has magically started working again |
18:44
|
joepie91 |
??? |
18:44
|
joepie91 |
it was blocked for like 2 weeks, but apparently unblocked now, or something |
18:44
|
joepie91 |
idk the specifics, but paste dumps are showing up again |
18:44
|
exmic |
weird, but ok |
18:44
|
exmic |
where do you put them? |
18:44
|
joepie91 |
archive.org/details/pastebinpastes |
18:45
|
joepie91 |
it's a daily cron |
18:45
|
exmic |
good stuff |
18:45
|
joepie91 |
also relatedly, I've set up a PDF host that auto-mirrors all public docs to archive.org :D |
18:45
|
exmic |
I saw that, good stuff |
18:45
|
joepie91 |
http://pdf.yt/ / https://archive.org/details/pdfymirrors |
18:45
|
joepie91 |
it appears to be a pretty efficient sinkhole so far |
18:46
|
exmic |
any particular reason you're not also pastebin-scraping to warc? |
18:46
|
joepie91 |
because I'm running a custom scraping script that doesn't do warc |
18:46
|
exmic |
far be it from me to fault you for doing it |
18:46
|
exmic |
aye |
18:46
|
joepie91 |
it just grabs the paste contents and metadata |
18:47
|
joepie91 |
there's not much point to saving a WARC |
18:47
|
exmic |
https://github.com/odie5533/WarcProxy |
18:47
|
joepie91 |
it literally saves the /raw/ paste |
18:47
|
joepie91 |
not the paste page |
18:47
|
exmic |
sure |
18:47
|
exmic |
ah |
18:47
|
joepie91 |
I don't like relying on local proxies :) |
18:47
|
exmic |
fair enough |
18:47
|
joepie91 |
I still need to look into the Python WARC ecosystem |
18:48
|
joepie91 |
as well as the Node.js WARC ecosystem, since I've been dabbling in Node.js lately |
18:48
|
exmic |
hm, I haven't |
18:48
|
joepie91 |
currently busy porting the internetarchive module to Node.js |
18:48
|
ersi |
holy moly, what a #firehose |
18:48
|
joepie91 |
ersi: firehose? |
18:48
|
exmic |
THINGS & STUFF GOING ON IN #ARCHIVETEAM OMG |
18:49
|
ersi |
cool that your pastebinscraper started working again |
18:49
|
joepie91 |
lol |
18:49
|
monod |
You're alive yeeeeeeeeeeee |
18:49
|
joepie91 |
yeah, idk what's up with that, either there's just a 1 in a million chance that it gets hit with a temp ban, or the pastebin guy is okay with it existing |
18:49
|
joepie91 |
monod: I am, just taking a prolonged break from IRC :) |
18:49
|
monod |
I feel much better now :D |
18:49
|
monod |
You're doing great! |
18:49
|
monod |
Good luck and cya soon! ;) |
18:50
|
joepie91 |
monod: PM :P |
18:50
|
exmic |
you look good, joepie91! |
20:12
|
SketchCow |
http://pcasts.in/xT7Z has many great things to say about Archive Team after minute 20. |