[00:17] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[00:43] *** MR9K4 has joined #archiveteam-bs
[01:30] *** drcd has quit IRC (Read error: Connection reset by peer)
[01:32] *** d5f4a3622 has quit IRC (Quit: WeeChat 2.4)
[01:34] *** d5f4a3622 has joined #archiveteam-bs
[02:03] *** bitBaron has joined #archiveteam-bs
[02:15] *** ayanami_ has quit IRC (Quit: Leaving)
[02:19] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[02:21] *** enowaldo has joined #archiveteam-bs
[02:26] *** enowaldo has quit IRC (Ping timeout: 268 seconds)
[02:42] *** benjins has joined #archiveteam-bs
[02:42] *** BlueMax has joined #archiveteam-bs
[03:20] *** odemgi has joined #archiveteam-bs
[03:23] *** odemgi_ has quit IRC (Ping timeout: 252 seconds)
[03:32] *** GuysFree has quit IRC (Quit: Connection closed for inactivity)
[03:33] i'm now archiving The Mike Rosen Show
[03:34] he retired in 2015 then i think maybe doing a at the movies radio show up to 2017
[03:35] what bothers me is that there are mp3s going back to 2008 but i'm only able to grab from 2011-09-12 on
[03:36] that's cause older urls look like this : http://a1135.g.akamai.net/f/1135/18227/1h/cchannel.download.akamai.com/18227/podcast/DENVER-CO/KOA-AM/Rosen08-3-09-11AM.mp3
[03:40] later urls are like this : http://media.ccomrcdn.com/media/station_content/668/Rosen11-15-11-09AM_1321905631_28012.mp3
[04:43] *** TC01 has joined #archiveteam-bs
[04:49] *** TC01_ has quit IRC (Ping timeout: 615 seconds)
[04:52] *** ndiddy has quit IRC ()
[05:09] is https://archive.org/details/github_narabot_mirror part of archiveteam?
couldn't find anything about it
[05:26] *** BlueMax has quit IRC (Quit: Leaving)
[05:50] *** BlueMax has joined #archiveteam-bs
[06:13] *** Zerote has quit IRC (Ping timeout: 600 seconds)
[07:18] *** MrRadar2 has quit IRC (Read error: Operation timed out)
[07:18] *** BnAboyZ has quit IRC (Read error: Operation timed out)
[07:22] *** colona has quit IRC (Ping timeout: 265 seconds)
[07:24] *** colona has joined #archiveteam-bs
[07:27] *** BnAboyZ has joined #archiveteam-bs
[07:28] *** MrRadar2 has joined #archiveteam-bs
[07:29] *** svchfoo3 sets mode: +o MrRadar2
[07:35] *** Zerote has joined #archiveteam-bs
[08:53] Wrt Sony’s sketch: There seem to be ~200M sketches and it seems I can retrieve ids (UUID) for all of them easily, but it takes some time.
[08:54] *** Zerote has quit IRC (Read error: Operation timed out)
[08:54] Sweet. Fortunately, we have 5 months.
[08:57] *** Zerote has joined #archiveteam-bs
[08:58] *** benjinsmi has joined #archiveteam-bs
[09:01] *** benjins has quit IRC (Read error: Operation timed out)
[09:02] *** Odd0002_ has joined #archiveteam-bs
[09:07] *** Odd0002 has quit IRC (Ping timeout: 615 seconds)
[09:07] *** Odd0002_ is now known as Odd0002
[10:26] *** enowaldo has joined #archiveteam-bs
[10:27] *** Verified_ has quit IRC (Remote host closed the connection)
[11:01] *** enowaldo has quit IRC (Ping timeout: 265 seconds)
[11:32] *** enowaldo has joined #archiveteam-bs
[11:46] *** enowaldo has quit IRC (Ping timeout: 268 seconds)
[11:51] *** kiska1 has quit IRC (Read error: Connection reset by peer)
[11:52] *** kiska1 has joined #archiveteam-bs
[11:52] *** svchfoo3 sets mode: +o kiska1
[12:00] *** deathy has quit IRC (Read error: Connection reset by peer)
[12:01] *** diggan has quit IRC (Read error: Connection reset by peer)
[12:03] *** diggan has joined #archiveteam-bs
[12:04] *** deathy has joined #archiveteam-bs
[12:08] *** bitBaron has joined #archiveteam-bs
[12:19] *** enowaldo has joined #archiveteam-bs
[12:25] *** BlueMax has quit IRC (Quit: Leaving)
[12:31] *** icedice has joined #archiveteam-bs
[12:45] *** enowaldo has quit IRC (Ping timeout: 492 seconds)
[12:47] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[12:59] *** enowaldo has joined #archiveteam-bs
[13:05] *** cfarquhar has quit IRC (Read error: Operation timed out)
[13:10] *** Odd0002_ has joined #archiveteam-bs
[13:13] *** cfarquhar has joined #archiveteam-bs
[13:16] *** Odd0002 has quit IRC (Read error: Operation timed out)
[13:16] *** Odd0002_ is now known as Odd0002
[13:17] *** VerifiedJ has joined #archiveteam-bs
[13:21] *** enowaldo has quit IRC (Read error: Operation timed out)
[13:43] *** cfarquhar has quit IRC (Read error: Operation timed out)
[13:48] *** enowaldo has joined #archiveteam-bs
[13:51] *** cfarquhar has joined #archiveteam-bs
[13:58] I have no idea what my bot will do with that edit, but we'll see in a minute or so.
[14:11] JAA: disallowing ia_archiver no longer prevents sites from being viewed
[14:12] hook54321: Is that definite now? I know it happened in the past due to bugs in the robots parser or something. But please feel free to correct that (I didn't write it, btw).
[14:19] ah ok, I assumed you did, my bad. I haven't seen any official documentation saying that it's like that for all sites now (however lots of IA's policies aren't documented), but I haven't seen the robots.txt error message in a long time, even on sites that specifically disallow it.
[14:19] Save Page Now ignores robots.txt as well
[14:20] https://twitter.com/MarkGraham/status/1113503847228395521
[14:26] *** enowaldo has quit IRC (Read error: Operation timed out)
[14:27] That's great news.
[14:37] Yeah.
[14:40] I kinda understand why their other crawlers (and Alexa's) still pay attention to it in most cases. I'm guessing it keeps crawls more sane since they might not be monitored all the time. For stuff like grabbing pages linked to on Wikipedia though I hope they ignore it.
[14:41] And they would likely get widely blocked if they didn't.
[14:48] "List of unreliable URL shorteners" - "Avoid using them. Use TinyURL.com instead." Someone clearly doesn't know about URLTeam.
[14:49] hook54321: Yeah, probably makes sense there. Although their web-wide crawls are only recursing to a limited depth usually. Bans could definitely be an issue though.
[14:50] *** ATrescue has joined #archiveteam-bs
[14:50] Got invited by JAA.
[14:50] ATrescue: Regarding your "List of unreliable URL shorteners", I guess you haven't yet read about URLTeam, have you?
[14:51] JAA: Not yet.
[14:51] I suggest you do then. That list would be a duplicate basically.
[14:51] Also, all URL shorteners are bad, not just the ones you consider unreliable (by whatever measure).
[14:52] The only exceptions are service-internal shorteners, like git.io for GitHub. Chances are that those will survive as long as the service exists, and they'd be useless if the service collapses.
[14:54] @JAA I took a look. That's a very extensive, impressive list. git.io is like t.co (t.co shortens all URLs posted in tweets, even after the original tweet is unavailable).
[14:55] eh
[14:56] They link to external sites though, and if twitter ever shut down those links would then be useless.
[14:58] hook54321: Twitter isn't likely to shut down anytime soon, but who knows? I archived many t.co links yesterday.
[14:58] *** enowaldo has joined #archiveteam-bs
[15:00] TinyURL *seems* reliable. I also don't think they are going to shut down anytime soon, but better safe than sorry.
[15:08] *** enowaldo has quit IRC (Ping timeout: 252 seconds)
[15:13] ATrescue: This is far from surprising. TinyCC lets you edit and delete links if you use an account. Anyway, I'm not sure we need a separate page for each URL shortener. Not sure what others think on this though.
[15:14] *** icedice2 has joined #archiveteam-bs
[15:15] JAA: Surprisingly many URLs in the past (probably when they redesigned their website at some point) became unusable.
[15:15] *** Zerote has quit IRC (Read error: Operation timed out)
[15:16] *** svchfoo1 has joined #archiveteam-bs
[15:16] *** PurpleSym sets mode: +o svchfoo1
[15:22] *** icedice has quit IRC (Read error: Operation timed out)
[15:39] Yeah, please don't make separate pages for URL shorteners.
[15:46] *** bitBaron has joined #archiveteam-bs
[15:51] *** icedice2 has quit IRC (Ping timeout: 252 seconds)
[15:52] *** Zerote has joined #archiveteam-bs
[15:57] *** icedice has joined #archiveteam-bs
[16:27] *** enowaldo has joined #archiveteam-bs
[16:46] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[16:49] *** bitBaron has joined #archiveteam-bs
[16:55] *** Dj-Wawa has joined #archiveteam-bs
[16:56] *** enowaldo has quit IRC (Ping timeout: 265 seconds)
[17:14] *** enowaldo has joined #archiveteam-bs
[17:27] *** deathy has quit IRC ()
[17:27] *** deathy has joined #archiveteam-bs
[17:50] *** diggan has quit IRC ()
[17:50] *** diggan has joined #archiveteam-bs
[18:04] *** Terbium has quit IRC (Quit: Terbium)
[18:11] *** Terbium has joined #archiveteam-bs
[18:31] *** enowaldo has quit IRC (Ping timeout: 252 seconds)
[18:43] *** enowaldo has joined #archiveteam-bs
[18:46] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[18:55] *** godane has quit IRC (Quit: Leaving.)
[19:03] *** godane has joined #archiveteam-bs
[19:05] *** ndiddy has joined #archiveteam-bs
[19:19] *** bitBaron has joined #archiveteam-bs
[19:53] *** wyatt8740 has quit IRC (Read error: Operation timed out)
[20:07] *** killsushi has joined #archiveteam-bs
[20:08] *** enowaldo has quit IRC (Read error: Operation timed out)
[20:12] *** tsp__ has quit IRC (Remote host closed the connection)
[20:26] *** godane has quit IRC (Ping timeout: 615 seconds)
[20:26] *** tsp__ has joined #archiveteam-bs
[20:33] *** godane has joined #archiveteam-bs
[20:35] *** enowaldo has joined #archiveteam-bs
[20:39] *** ndiddy has quit IRC (Ping timeout: 615 seconds)
[20:48] so i got another post by that crazy guy : https://archive.org/details/the-mike-rosen-show-2011-12-30
[20:48] *** revi has quit IRC ()
[20:48] *** revi has joined #archiveteam-bs
[20:50] *** enowaldo has quit IRC (Read error: Operation timed out)
[21:04] *** enowaldo has joined #archiveteam-bs
[21:28] *** icedice has quit IRC (Read error: Operation timed out)
[21:38] *** wyatt8740 has joined #archiveteam-bs
[21:51] *** ndiddy has joined #archiveteam-bs
[22:07] *** enowaldo has quit IRC (Read error: Operation timed out)
[22:08] so i'm uploading 3 things at once today
[22:09] The Mike Rosen Show, libuow issuu cbz files, and The Joe Piscopo Show
[22:34] *** BlueMax has joined #archiveteam-bs
[23:07] *** VerifiedJ has quit IRC (Quit: Leaving)
[23:52] *** Rome_Silv has joined #archiveteam-bs
[23:58] *** Rome has quit IRC (Read error: Operation timed out)