#archiveteam 2017-03-02,Thu

Time Nickname Message
00:03 🔗 sims has quit IRC (Ping timeout: 268 seconds)
00:36 🔗 odemg has quit IRC (Remote host closed the connection)
00:41 🔗 bsmith093 has quit IRC (Quit: Leaving.)
00:45 🔗 FalconK has joined #archiveteam
00:55 🔗 odemg has joined #archiveteam
00:57 🔗 BlueMaxim has quit IRC (Quit: Leaving)
01:18 🔗 icedice has quit IRC (Ping timeout: 250 seconds)
01:35 🔗 odemg has quit IRC (Remote host closed the connection)
01:53 🔗 Ravenloft has joined #archiveteam
02:05 🔗 alfie has quit IRC (Ping timeout: 260 seconds)
02:06 🔗 BlueMaxim has joined #archiveteam
02:27 🔗 alfie has joined #archiveteam
02:29 🔗 ndiddy has joined #archiveteam
02:34 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
02:39 🔗 antonizoo i guess they've been saying this for a while but Dropbox is disabling the public folder and any shared links made using it on March 15th
02:39 🔗 antonizoo honestly a pretty bad way of handling it since it causes a lot of link rot, as the shared links have to be recreated
02:39 🔗 antonizoo that probably is a majority of dropbox shared links which may not always be recreated
02:45 🔗 MMovie has joined #archiveteam
02:45 🔗 alfie has joined #archiveteam
02:46 🔗 MMovie2 has quit IRC (Read error: Operation timed out)
02:50 🔗 VADemon has quit IRC (Quit: left4dead)
02:52 🔗 kyounko has joined #archiveteam
02:56 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
02:57 🔗 schbirid has quit IRC (Ping timeout: 255 seconds)
02:58 🔗 Ravenloft has quit IRC (Ping timeout: 260 seconds)
02:58 🔗 dserodio has quit IRC (Read error: Connection reset by peer)
02:58 🔗 dserodio has joined #archiveteam
03:05 🔗 dserodio has quit IRC (Read error: Connection reset by peer)
03:07 🔗 dserodio has joined #archiveteam
03:08 🔗 QBcrusher has quit IRC (Ping timeout: 244 seconds)
03:09 🔗 schbirid has joined #archiveteam
03:09 🔗 alfie has joined #archiveteam
03:27 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
03:29 🔗 yyzfp has joined #archiveteam
03:36 🔗 alfie has joined #archiveteam
04:08 🔗 Ravenloft has joined #archiveteam
04:11 🔗 alfie has quit IRC (Ping timeout: 260 seconds)
04:13 🔗 alfie has joined #archiveteam
04:22 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
04:25 🔗 maelstrom has quit IRC (Quit: Leaving)
04:25 🔗 alfie has joined #archiveteam
04:34 🔗 alfie has quit IRC (Ping timeout: 260 seconds)
04:35 🔗 alfie has joined #archiveteam
04:45 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
04:48 🔗 alfie has joined #archiveteam
04:55 🔗 Rondom has quit IRC (Remote host closed the connection)
04:55 🔗 Rondom has joined #archiveteam
05:04 🔗 ndiddy has quit IRC (Read error: Connection reset by peer)
05:07 🔗 Sk1d has joined #archiveteam
05:11 🔗 ravetcofx has joined #archiveteam
05:21 🔗 alfie has quit IRC (Ping timeout: 260 seconds)
05:26 🔗 alfie has joined #archiveteam
05:32 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
05:35 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
05:35 🔗 ravetcofx has joined #archiveteam
05:42 🔗 alfie has joined #archiveteam
06:22 🔗 alfie has quit IRC (Ping timeout: 260 seconds)
06:24 🔗 alfie has joined #archiveteam
06:34 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
06:47 🔗 alfie has joined #archiveteam
07:04 🔗 ZexaronS- has quit IRC (Read error: Connection reset by peer)
07:05 🔗 nrp3c has quit IRC (Read error: Operation timed out)
07:07 🔗 ZexaronS has joined #archiveteam
07:11 🔗 alfie has quit IRC (Ping timeout: 260 seconds)
07:13 🔗 vitzli has joined #archiveteam
07:14 🔗 alfie has joined #archiveteam
07:19 🔗 nrp3c has joined #archiveteam
07:36 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
08:08 🔗 QBcrusher has joined #archiveteam
08:21 🔗 alfie has joined #archiveteam
08:32 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
08:38 🔗 alfie has joined #archiveteam
08:49 🔗 alfie has quit IRC (Ping timeout: 260 seconds)
08:49 🔗 alfie has joined #archiveteam
08:54 🔗 alfie has quit IRC (Ping timeout: 244 seconds)
09:01 🔗 alfie has joined #archiveteam
09:24 🔗 vitzli has quit IRC (Leaving)
09:26 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
09:28 🔗 Asparagir has quit IRC (Read error: Operation timed out)
09:35 🔗 Asparagir has joined #archiveteam
09:36 🔗 kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
10:08 🔗 vcxuoi has joined #archiveteam
10:08 🔗 HCross2 has quit IRC (Quit: Connection closed for inactivity)
10:08 🔗 vcxuoi has quit IRC (Client Quit)
11:13 🔗 pizzaiolo has joined #archiveteam
11:22 🔗 HCross2 has joined #archiveteam
11:28 🔗 BlueMaxim has quit IRC (Quit: Leaving)
11:51 🔗 gibigiana has quit IRC (leaving)
11:52 🔗 gibigiana has joined #archiveteam
11:55 🔗 gibigiana has quit IRC (Remote host closed the connection)
11:56 🔗 gibigiana has joined #archiveteam
11:57 🔗 gibigiana has quit IRC (Remote host closed the connection)
11:58 🔗 gibigiana has joined #archiveteam
12:09 🔗 Silvan has joined #archiveteam
12:12 🔗 dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
12:12 🔗 SilSte has quit IRC (Read error: Operation timed out)
12:13 🔗 dashcloud has joined #archiveteam
12:54 🔗 VADemon has joined #archiveteam
13:12 🔗 JSharp___ has quit IRC (Read error: Connection reset by peer)
13:12 🔗 alembic has quit IRC (Read error: Connection reset by peer)
13:13 🔗 JSharp___ has joined #archiveteam
13:13 🔗 alembic has joined #archiveteam
13:23 🔗 passerby has quit IRC ()
13:44 🔗 passerby has joined #archiveteam
14:00 🔗 eightfold has joined #archiveteam
14:28 🔗 mls has quit IRC (Read error: Connection reset by peer)
14:29 🔗 mls has joined #archiveteam
14:36 🔗 eightfold has quit IRC (Ping timeout: 260 seconds)
15:36 🔗 DopefishJ has joined #archiveteam
15:36 🔗 swebb sets mode: +o DopefishJ
15:37 🔗 DFJustin has quit IRC (Ping timeout: 260 seconds)
15:39 🔗 Stilett0 has quit IRC (Read error: Connection reset by peer)
15:46 🔗 Stilett0 has joined #archiveteam
16:16 🔗 eightfold has joined #archiveteam
16:18 🔗 mls has quit IRC (Quit: leaving)
16:26 🔗 DopefishJ is now known as DFJustin
16:54 🔗 eightfold has quit IRC (Ping timeout: 260 seconds)
16:54 🔗 atomotic has joined #archiveteam
16:58 🔗 Asparagir has quit IRC (Read error: Connection reset by peer)
16:59 🔗 Asparagir has joined #archiveteam
17:06 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
17:07 🔗 BlueMaxim has joined #archiveteam
17:32 🔗 icedice has joined #archiveteam
17:36 🔗 ravetcofx has joined #archiveteam
17:57 🔗 kyounko has quit IRC (Read error: Connection reset by peer)
18:03 🔗 kyounko has joined #archiveteam
18:37 🔗 yyzfp WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
18:37 🔗 yyzfp I want to make a wiki page for the impending deletion of UC Berkeley course recordings.
18:37 🔗 yyzfp http://news.berkeley.edu/2017/03/01/course-capture/
18:37 🔗 yyzfp http://webcast.berkeley.edu/
18:38 🔗 pizzaiolo :o
18:38 🔗 xmc in your PM
18:38 🔗 xmc yyzfp: ^
18:39 🔗 yyzfp Thanks xmc.
19:40 🔗 tobbez has joined #archiveteam
19:43 🔗 Ravenloft has quit IRC (Ping timeout: 633 seconds)
19:51 🔗 SketchCow OK, we need to make a channel for this
19:51 🔗 SketchCow I've got several people who want to help who are helping
19:51 🔗 SketchCow The current belief is they're not going to delete, just kill the links to the youtube channel
19:53 🔗 j08nY has joined #archiveteam
20:01 🔗 ThisAsYou joepie91: That ia tool is awesome. As soon as I finish grabbing the channel I'll upload it all to https://archive.org/download/UCBerkely-YouTube
20:01 🔗 ThisAsYou As well as a json of all associated metadata and a zip of all the thumbnails
20:04 🔗 joepie91 ThisAsYou: one item per thing please!
20:04 🔗 joepie91 don't dump a ton of stuff into a single item
20:05 🔗 ThisAsYou Ok
20:05 🔗 ThisAsYou Can I make a collection of all the videos?
20:05 🔗 ThisAsYou I want to group them together
20:05 🔗 joepie91 ThisAsYou: yes, but let's move to #archiveteam-bs
20:05 🔗 SketchCow I really super don't want you doing this.
20:06 🔗 SketchCow I in fact super don't want you doing it at all - I'd like coordination so none of the metadata is there before IA backs it up. People should of course back it up, but just dumping it into the archive straight up, don't.
20:06 🔗 ThisAsYou Oh okay
20:06 🔗 ThisAsYou I'll not then
20:07 🔗 joepie91 ThisAsYou: see -bs please
20:07 🔗 joepie91 :P
20:14 🔗 bakabernd has joined #archiveteam
20:17 🔗 bakabernd UC Berkeley will remove their course videos from youtube in a few days: http://news.berkeley.edu/2017/03/01/course-capture/
20:18 🔗 bakabernd https://www.youtube.com/user/UCBerkeley/videos
20:19 🔗 Svekla Did Cloudflare block The Wayback Machine? I can't save any site behind CF
20:21 🔗 atomotic has joined #archiveteam
20:27 🔗 VADemon Some websites enforce strict "antibot" bs protection
20:31 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
20:42 🔗 bakabernd WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
20:45 🔗 SketchCow bakabernd: #berklost
20:51 🔗 Svekla VADemon, I don't block any IPs or bots on my site and I still can't save it.
20:51 🔗 Svekla I get "This url is not available on the live web or can not be archived."
20:53 🔗 bsmith093 has joined #archiveteam
20:54 🔗 Marcelo has joined #archiveteam
21:10 🔗 maelstrom has joined #archiveteam
21:10 🔗 Stilett0 has quit IRC (Ping timeout: 246 seconds)
21:12 🔗 namespace has quit IRC (Read error: Operation timed out)
21:12 🔗 pnJay has joined #archiveteam
21:16 🔗 Marcelo has left
21:24 🔗 bakabernd has quit IRC (Leaving)
21:36 🔗 pnJay has quit IRC (Quit: Page closed)
21:51 🔗 tpw_rules has quit IRC (Read error: Operation timed out)
21:53 🔗 odemg has joined #archiveteam
21:53 🔗 odemg has quit IRC (Connection closed)
21:54 🔗 odemg has joined #archiveteam
22:01 🔗 Lord_Nigh has quit IRC (Read error: Operation timed out)
22:10 🔗 tpw_rules has joined #archiveteam
22:11 🔗 Lord_Nigh has joined #archiveteam
22:14 🔗 tpw_rules has quit IRC (Read error: Operation timed out)
22:26 🔗 schbirid has quit IRC (Quit: Leaving)
22:33 🔗 tpw_rules has joined #archiveteam
22:38 🔗 icedice has quit IRC (Ping timeout: 244 seconds)
22:43 🔗 icedice has joined #archiveteam
22:50 🔗 scyther Just a reminder that the LEGO message boards are closing down pretty soon!
22:51 🔗 scyther 4 days remaining
22:51 🔗 scyther https://community.lego.com/t5/COMMUNITY-CHAT/Message-Boards-Community-Retirement/m-p/14513705#U14513705
22:51 🔗 scyther A lot of good content
22:52 🔗 pizzaiolo scyther: run an archivebot job
22:52 🔗 pizzaiolo if no one ran one yet
22:53 🔗 scyther i really need to head out right now, so if anybody knows what parameters to use for archiving large forums, could you give it a go?
23:00 🔗 yipdw http://archive.fart.website/archivebot/viewer/job/20hyd there was a job in 2014
23:00 🔗 yipdw and 2017 http://archive.fart.website/archivebot/viewer/job/4xzj9
23:20 🔗 TMM has joined #archiveteam
23:20 🔗 TMM hello
23:21 🔗 TMM I run a small code-sharing site called 'notabug.org' and your bot is causing me some trouble
23:21 🔗 TMM I have caches for requests for diffs of recent changesets but not of older ones
23:21 🔗 TMM the bot is requesting all changesets and multiple at once it seems
23:21 🔗 TMM Since your bot advertises it ignores robots.txt I have no way of telling it to slow down
23:23 🔗 TMM I'm all for archiving the internet, so, is there anything I can do to make it play nice since the normal mechanisms for this appear to not be implemented?
23:27 🔗 odemg has quit IRC (Remote host closed the connection)
23:29 🔗 Sanqui TMM: dunno if it has reached you or not, so just so you know, the job was aborted.
23:29 🔗 TMM Sanqui, I'd also like the job to not be able to be restarted, or at least have some url patterns the bot just won't ever crawl
23:30 🔗 TMM Sanqui, there's no point in downloading all the changesets, as those are in git already anyway (feel free to archive all the git repositories themselves btw)
23:30 🔗 TMM Sanqui, but generating the html diffs for the old commits is just hugely taxing for me
23:31 🔗 Sanqui TMM: yeah, I understand and agree
23:32 🔗 Sanqui I just can't make you any promises myself
23:32 🔗 TMM I can hang around, for now I've taken some steps so that the archivebot gets 403s
23:33 🔗 Sanqui whoever put a git hosting website into archivebot made a mistake, tbh
23:33 🔗 Sanqui they should be more careful.
23:34 🔗 Sanqui sorry for the trouble.
23:34 🔗 topher has joined #archiveteam
23:36 🔗 topher has anyone seen http://news.berkeley.edu/2017/03/01/course-capture/
23:36 🔗 TMM Sanqui, it's alright, but it'd be nice to get it fixed. my current 'sledgehammer' approach is a little much
23:36 🔗 TMM I don't want to have to make changeset viewing a logged-in user only activity
23:37 🔗 Sanqui TMM: archivebot works on an on-demand basis. so, best case scenario, nobody will put notabug.org in again, or they will do so with a sane ignore set
23:38 🔗 Sanqui i'd say it's likely it won't be put in again.
23:38 🔗 Sanqui (unless you announce a shutdown or anything)
23:38 🔗 TMM that's not currently planned :)
23:39 🔗 TMM well, I guess if nobody is going to put it back in I'll just leave the blanket block in place
23:39 🔗 LastNinja has quit IRC (Ping timeout: 245 seconds)
23:40 🔗 topher has quit IRC (Quit: Page closed)
23:42 🔗 Sanqui yeah, should be alright. sorry again and thanks for stopping by!
23:42 🔗 TMM thanks for replying :)
23:42 🔗 jtn2 TMM: do you know bill-auge? They were here a few days ago and said they helped with notabug, and got here via pizzaiolo. Maybe one of them knows about the archivebot job?
23:43 🔗 TMM jtn2, yeah, it was pizzaiolo
23:43 🔗 pizzaiolo jtn2: yep, I started the job without realizing the full consequences :)
23:44 🔗 TMM It'd sure be a lot better if archivebot would at *least* implement some way of telling it how long it has to wait between requests
23:44 🔗 TMM I think it's pretty dickish to just ignore robots.txt to begin with, but I can kind of understand you don't want to limit what you crawl
23:44 🔗 TMM but at least limit how fast you crawl if a robots.txt asks for it
23:50 🔗 Aranje has quit IRC (Quit: Three sheets to the wind)
23:56 🔗 Aranje has joined #archiveteam
