[00:21] *** Jasjar has quit IRC (se.hub irc.efnet.nl)
[00:21] *** ats has quit IRC (se.hub irc.efnet.nl)
[00:21] *** pl0nk has quit IRC (se.hub irc.efnet.nl)
[00:21] *** Stiletto has quit IRC (se.hub irc.efnet.nl)
[00:21] *** VoynichCr has quit IRC (se.hub irc.efnet.nl)
[00:23] *** Jasjar has joined #internetarchive
[00:23] *** ats has joined #internetarchive
[00:23] *** pl0nk has joined #internetarchive
[00:23] *** Stiletto has joined #internetarchive
[00:23] *** VoynichCr has joined #internetarchive
[01:19] *** X-Scale has quit IRC (Ping timeout: 492 seconds)
[01:59] *** X-Scale has joined #internetarchive
[02:24] *** balrog has quit IRC (Quit: Bye)
[02:29] *** balrog has joined #internetarchive
[04:38] *** qw3rty112 has joined #internetarchive
[04:45] *** qw3rty111 has quit IRC (Read error: Operation timed out)
[04:45] *** sivoais has quit IRC (Read error: Operation timed out)
[04:45] *** sivoais has joined #internetarchive
[04:49] *** odemg has quit IRC (Ping timeout: 265 seconds)
[05:01] *** odemg has joined #internetarchive
[06:08] https://bugs.chromium.org/robots.txt wayback thinks this blocks e.g. https://web.archive.org/save/https://bugs.chromium.org/p/chromium/issues/detail?id=896897
[08:17] *** atomotic has joined #internetarchive
[09:21] *** qw3rty113 has joined #internetarchive
[09:28] *** qw3rty112 has quit IRC (Read error: Operation timed out)
[10:16] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[10:37] Yeah, I've seen that before. IA's robots.txt parser doesn't seem to handle Allow directives very well.
[12:52] it's been like this for over 2 years. I talked to Jeff on IA forums and he basically said "hm, thats the way it is with the crawler"
[14:15] 300 PB of Twitter... only one order of magnitude bigger than IA :) https://cloud.google.com/twitter/
[17:50] hmm did they stop generating .djvu on new items
[18:02] https://archive.org/post/1053214/djvu-files-for-new-uploads
[18:09] boo
[18:48] *** picklefac has joined #internetarchive
[20:23] *** jrwr has quit IRC (Read error: Connection reset by peer)
[20:24] *** jrwr has joined #internetarchive
[20:24] *** svchfoo1 sets mode: +o jrwr
[20:24] *** svchfoo3 sets mode: +o jrwr
[22:49] Hrm, anyone else seeing S3 failures? Specifically, "Broken pipe".
[22:57] *** kiska1 has quit IRC (Ping timeout (120 seconds))
[22:58] *** kiska1 has joined #internetarchive