[00:03] *** sims has quit IRC (Ping timeout: 268 seconds)
[00:36] *** odemg has quit IRC (Remote host closed the connection)
[00:41] *** bsmith093 has quit IRC (Quit: Leaving.)
[00:45] *** FalconK has joined #archiveteam
[00:55] *** odemg has joined #archiveteam
[00:57] *** BlueMaxim has quit IRC (Quit: Leaving)
[01:18] *** icedice has quit IRC (Ping timeout: 250 seconds)
[01:35] *** odemg has quit IRC (Remote host closed the connection)
[01:53] *** Ravenloft has joined #archiveteam
[02:05] *** alfie has quit IRC (Ping timeout: 260 seconds)
[02:06] *** BlueMaxim has joined #archiveteam
[02:27] *** alfie has joined #archiveteam
[02:29] *** ndiddy has joined #archiveteam
[02:34] *** alfie has quit IRC (Ping timeout: 244 seconds)
[02:39] i guess they've been saying this for a while but Dropbox is disabling the public folder and any shared links made using it on March 15th
[02:39] honestly a pretty bad way of handling it since it causes a lot of link rot, as the shared links have to be recreated
[02:39] that probably is a majority of dropbox shared links which may not always be recreated
[02:45] *** MMovie has joined #archiveteam
[02:45] *** alfie has joined #archiveteam
[02:46] *** MMovie2 has quit IRC (Read error: Operation timed out)
[02:50] *** VADemon has quit IRC (Quit: left4dead)
[02:52] *** kyounko has joined #archiveteam
[02:56] *** alfie has quit IRC (Ping timeout: 244 seconds)
[02:57] *** schbirid has quit IRC (Ping timeout: 255 seconds)
[02:58] *** Ravenloft has quit IRC (Ping timeout: 260 seconds)
[02:58] *** dserodio has quit IRC (Read error: Connection reset by peer)
[02:58] *** dserodio has joined #archiveteam
[03:05] *** dserodio has quit IRC (Read error: Connection reset by peer)
[03:07] *** dserodio has joined #archiveteam
[03:08] *** QBcrusher has quit IRC (Ping timeout: 244 seconds)
[03:09] *** schbirid has joined #archiveteam
[03:09] *** alfie has joined #archiveteam
[03:27] *** alfie has quit IRC (Ping timeout: 244 seconds)
[03:29] *** yyzfp has joined #archiveteam
[03:36] *** alfie has joined #archiveteam
[04:08] *** Ravenloft has joined #archiveteam
[04:11] *** alfie has quit IRC (Ping timeout: 260 seconds)
[04:13] *** alfie has joined #archiveteam
[04:22] *** alfie has quit IRC (Ping timeout: 244 seconds)
[04:25] *** maelstrom has quit IRC (Quit: Leaving)
[04:25] *** alfie has joined #archiveteam
[04:34] *** alfie has quit IRC (Ping timeout: 260 seconds)
[04:35] *** alfie has joined #archiveteam
[04:45] *** alfie has quit IRC (Ping timeout: 244 seconds)
[04:48] *** alfie has joined #archiveteam
[04:55] *** Rondom has quit IRC (Remote host closed the connection)
[04:55] *** Rondom has joined #archiveteam
[05:04] *** ndiddy has quit IRC (Read error: Connection reset by peer)
[05:07] *** Sk1d has joined #archiveteam
[05:11] *** ravetcofx has joined #archiveteam
[05:21] *** alfie has quit IRC (Ping timeout: 260 seconds)
[05:26] *** alfie has joined #archiveteam
[05:32] *** ravetcofx has quit IRC (Read error: Operation timed out)
[05:35] *** alfie has quit IRC (Ping timeout: 244 seconds)
[05:35] *** ravetcofx has joined #archiveteam
[05:42] *** alfie has joined #archiveteam
[06:22] *** alfie has quit IRC (Ping timeout: 260 seconds)
[06:24] *** alfie has joined #archiveteam
[06:34] *** alfie has quit IRC (Ping timeout: 244 seconds)
[06:47] *** alfie has joined #archiveteam
[07:04] *** ZexaronS- has quit IRC (Read error: Connection reset by peer)
[07:05] *** nrp3c has quit IRC (Read error: Operation timed out)
[07:07] *** ZexaronS has joined #archiveteam
[07:11] *** alfie has quit IRC (Ping timeout: 260 seconds)
[07:13] *** vitzli has joined #archiveteam
[07:14] *** alfie has joined #archiveteam
[07:19] *** nrp3c has joined #archiveteam
[07:36] *** alfie has quit IRC (Ping timeout: 244 seconds)
[08:08] *** QBcrusher has joined #archiveteam
[08:21] *** alfie has joined #archiveteam
[08:32] *** alfie has quit IRC (Ping timeout: 244 seconds)
[08:38] *** alfie has joined #archiveteam
[08:49] *** alfie has quit IRC (Ping timeout: 260 seconds)
[08:49] *** alfie has joined #archiveteam
[08:54] *** alfie has quit IRC (Ping timeout: 244 seconds)
[09:01] *** alfie has joined #archiveteam
[09:24] *** vitzli has quit IRC (Leaving)
[09:26] *** ravetcofx has quit IRC (Read error: Operation timed out)
[09:28] *** Asparagir has quit IRC (Read error: Operation timed out)
[09:35] *** Asparagir has joined #archiveteam
[09:36] *** kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
[10:08] *** vcxuoi has joined #archiveteam
[10:08] *** HCross2 has quit IRC (Quit: Connection closed for inactivity)
[10:08] *** vcxuoi has quit IRC (Client Quit)
[11:13] *** pizzaiolo has joined #archiveteam
[11:22] *** HCross2 has joined #archiveteam
[11:28] *** BlueMaxim has quit IRC (Quit: Leaving)
[11:51] *** gibigiana has quit IRC (leaving)
[11:52] *** gibigiana has joined #archiveteam
[11:55] *** gibigiana has quit IRC (Remote host closed the connection)
[11:56] *** gibigiana has joined #archiveteam
[11:57] *** gibigiana has quit IRC (Remote host closed the connection)
[11:58] *** gibigiana has joined #archiveteam
[12:09] *** Silvan has joined #archiveteam
[12:12] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
[12:12] *** SilSte has quit IRC (Read error: Operation timed out)
[12:13] *** dashcloud has joined #archiveteam
[12:54] *** VADemon has joined #archiveteam
[13:12] *** JSharp___ has quit IRC (Read error: Connection reset by peer)
[13:12] *** alembic has quit IRC (Read error: Connection reset by peer)
[13:13] *** JSharp___ has joined #archiveteam
[13:13] *** alembic has joined #archiveteam
[13:23] *** passerby has quit IRC ()
[13:44] *** passerby has joined #archiveteam
[14:00] *** eightfold has joined #archiveteam
[14:28] *** mls has quit IRC (Read error: Connection reset by peer)
[14:29] *** mls has joined #archiveteam
[14:36] *** eightfold has quit IRC (Ping timeout: 260 seconds)
[15:36] *** DopefishJ has joined #archiveteam
[15:36] *** swebb sets mode: +o DopefishJ
[15:37] *** DFJustin has quit IRC (Ping timeout: 260 seconds)
[15:39] *** Stilett0 has quit IRC (Read error: Connection reset by peer)
[15:46] *** Stilett0 has joined #archiveteam
[16:16] *** eightfold has joined #archiveteam
[16:18] *** mls has quit IRC (Quit: leaving)
[16:26] *** DopefishJ is now known as DFJustin
[16:54] *** eightfold has quit IRC (Ping timeout: 260 seconds)
[16:54] *** atomotic has joined #archiveteam
[16:58] *** Asparagir has quit IRC (Read error: Connection reset by peer)
[16:59] *** Asparagir has joined #archiveteam
[17:06] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[17:07] *** BlueMaxim has joined #archiveteam
[17:32] *** icedice has joined #archiveteam
[17:36] *** ravetcofx has joined #archiveteam
[17:57] *** kyounko has quit IRC (Read error: Connection reset by peer)
[18:03] *** kyounko has joined #archiveteam
[18:37] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[18:37] I want to make a wiki page for the impending deletion of UC Berkeley course recordings.
[18:37] http://news.berkeley.edu/2017/03/01/course-capture/
[18:37] http://webcast.berkeley.edu/
[18:38] :o
[18:38] in your PM
[18:38] yyzfp: ^
[18:39] Thanks xmc.
[19:40] *** tobbez has joined #archiveteam
[19:43] *** Ravenloft has quit IRC (Ping timeout: 633 seconds)
[19:51] OK, we need to make a channel for this
[19:51] I've got several people who want to help who are helping
[19:51] The current belief is they're not going to delete, just kill the links to the youtube channel
[19:53] *** j08nY has joined #archiveteam
[20:01] joepie91: That ia tool is awesome. As soon as I finish grabbing the channel I'll upload it all to https://archive.org/download/UCBerkely-YouTube
[20:01] As well as a json of all associated metadata and a zip of all the thumbnails
[20:04] ThisAsYou: one item per thing please!
[20:04] don't dump a ton of stuff into a single item
[20:05] Ok
[20:05] Can I make a collection of all the videos?
[20:05] I want to group them together
[20:05] ThisAsYou: yes, but let's move to #archiveteam-bs
[20:05] I really super don't want you doing this.
[20:06] I in fact super don't want you doing it at all - I'd like coordination so none of the metadata is there before IA backs it up. People should of course back it up, but just dumping it into the archive straight up, don't.
[20:06] Oh okay
[20:06] I'll not then
[20:07] ThisAsYou: see -bs please
[20:07] :P
[20:14] *** bakabernd has joined #archiveteam
[20:17] UC Berkeley will remove their course videos from youtube in a few days: http://news.berkeley.edu/2017/03/01/course-capture/
[20:18] https://www.youtube.com/user/UCBerkeley/videos
[20:19] Did Cloudflare block The Wayback Machine? I can't save any site behind CF
[20:21] *** atomotic has joined #archiveteam
[20:27] Some websites enforce strict "antibot" bs protection
[20:31] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[20:42] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[20:45] bakabernd: #berklost
[20:51] VADemon, I don't block any IPs or bots on my site and I still can't save it.
[20:51] I get "This url is not available on the live web or can not be archived."
[20:53] *** bsmith093 has joined #archiveteam
[20:54] *** Marcelo has joined #archiveteam
[21:10] *** maelstrom has joined #archiveteam
[21:10] *** Stilett0 has quit IRC (Ping timeout: 246 seconds)
[21:12] *** namespace has quit IRC (Read error: Operation timed out)
[21:12] *** pnJay has joined #archiveteam
[21:16] *** Marcelo has left
[21:24] *** bakabernd has quit IRC (Leaving)
[21:36] *** pnJay has quit IRC (Quit: Page closed)
[21:51] *** tpw_rules has quit IRC (Read error: Operation timed out)
[21:53] *** odemg has joined #archiveteam
[21:53] *** odemg has quit IRC (Connection closed)
[21:54] *** odemg has joined #archiveteam
[22:01] *** Lord_Nigh has quit IRC (Read error: Operation timed out)
[22:10] *** tpw_rules has joined #archiveteam
[22:11] *** Lord_Nigh has joined #archiveteam
[22:14] *** tpw_rules has quit IRC (Read error: Operation timed out)
[22:26] *** schbirid has quit IRC (Quit: Leaving)
[22:33] *** tpw_rules has joined #archiveteam
[22:38] *** icedice has quit IRC (Ping timeout: 244 seconds)
[22:43] *** icedice has joined #archiveteam
[22:50] Just a reminder that the LEGO message boards are closing down pretty soon!
[22:51] 4 days remaining
[22:51] https://community.lego.com/t5/COMMUNITY-CHAT/Message-Boards-Community-Retirement/m-p/14513705#U14513705
[22:51] A lot of good content
[22:52] scyther: run an archivebot job
[22:52] if no one ran one yet
[22:53] i really need to head to go right now, so if anybody who knows what parameters for archiving large forums, could you give it a go?
[23:00] http://archive.fart.website/archivebot/viewer/job/20hyd there was a job in 2014
[23:00] and 2017 http://archive.fart.website/archivebot/viewer/job/4xzj9
[23:20] *** TMM has joined #archiveteam
[23:20] hello
[23:21] I run a small code-sharing site called 'notabug.org' and your bot is causing me some trouble
[23:21] I have caches for requests for diffs of recent changesets but not of older ones
[23:21] the bot is requesting all changesets and multiple at once it seems
[23:21] Since your bot advertises it ignores robots.txt I have no way of telling it to slow down
[23:23] I'm all for archiving the internet, so, is there anything I can do to make it play nice since the normal mechanisms for this appear to not be implemented?
[23:27] *** odemg has quit IRC (Remote host closed the connection)
[23:29] TMM: dunno if it has reached you or not, so just so you know, the job was aborted.
[23:29] Sanqui, I'd also like the job to not be able to be restarted, or at least have some url patterns the bot just won't ever crawl
[23:30] Sanqui, there's no point in downloading all the changesets, as those are in git already anyway (feel free to archive all the git repositories themselves btw)
[23:30] Sanqui, but generating the html diffs for the old commits is just hugely taxing for me
[23:31] TMM: yeah, I understand and agree
[23:32] I just can't make you any promises myself
[23:32] I can hang around, for now I've taken some steps so that the archivebot gets 403s
[23:33] whoever put a git hosting website into archivebot made a mistake, tbh
[23:33] they should be more careful.
[23:34] sorry for the trouble.
[23:34] *** topher has joined #archiveteam
[23:36] has anyone seen http://news.berkeley.edu/2017/03/01/course-capture/
[23:36] Sanqui, it's alright, but it'd be nice to get it fixed. my current 'sledgehammer' approach is a little much
[23:36] I don't want to have to make changeset viewing a logged-in user only activity
[23:37] TMM: archivebot works on an on-demand basis. so, best case scenario, nobody will put notabug.org in again, or they will do so with a sane ignore set
[23:38] i'd say it's likely it won't be put in again.
[23:38] (unless you announce a shutdown or anything)
[23:38] that's not currently planned :)
[23:39] well, I guess if nobody is going to put it back in I'll just leave the blanked block in place
[23:39] *** LastNinja has quit IRC (Ping timeout: 245 seconds)
[23:40] *** topher has quit IRC (Quit: Page closed)
[23:42] yeah, should be alright. sorry again and thanks for stopping by!
[23:42] thanks for replying :)
[23:42] TMM: do you know bill-auge? They were here a few days ago and said they helped with notabug, and got here via pizzaiolo. Maybe one of them knows about the archivebot job?
[23:43] jtn2, yeah, it was pizzaiolo
[23:43] jtn2: yep, I started the job without realizing the full consequences :)
[23:44] It'd sure be a lot better if archivebot would at *least* implement some way of telling it how long it has to wait between requests
[23:44] I think it's pretty dickish to just ignore robots.txt to begin with, but I can kind of understand you don't want to limit what you crawl
[23:44] but at least limit how fast you crawl if a robots.txt asks for it
[23:50] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[23:56] *** Aranje has joined #archiveteam
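
The request made near the end of the log, that a crawler should at least limit how fast it crawls when a robots.txt asks for it, could look roughly like the sketch below. It is only an illustration and not ArchiveBot's code: the robots.txt content, the "ArchiveBot" user-agent string, the one-second fallback, and the helper names crawl_delay_for and polite_fetch_all are all assumptions made up for the example. It reads the Crawl-delay value with Python's standard urllib.robotparser and fetches URLs one at a time with that pause in between, so the crawl rate is limited without restricting what gets crawled.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site operator might publish (not taken from any real site).
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
"""

# Assumed fallback pause when the site does not specify a Crawl-delay.
DEFAULT_DELAY_SECONDS = 1.0


def crawl_delay_for(robots_txt: str, user_agent: str) -> float:
    """Return the pause in seconds that robots.txt requests for this user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)  # None when no Crawl-delay applies
    return float(delay) if delay is not None else DEFAULT_DELAY_SECONDS


def polite_fetch_all(urls, fetch, delay_seconds, sleep=time.sleep):
    """Fetch URLs sequentially (never in parallel), pausing between requests.

    `fetch` is any callable taking a URL; `sleep` is injectable so the
    pacing can be checked without actually waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        results.append(fetch(url))
    return results
```

A crawler using this would call crawl_delay_for once per site and pass the result to polite_fetch_all. Only the request rate is affected here; Disallow rules are deliberately not consulted, matching the compromise suggested in the log.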