[00:09] more digitize tapes: https://www.patreon.com/posts/digitize-tapes-17385718
[00:10] *** Mayonaise has joined #archiveteam-bs
[00:22] *** alex___ has quit IRC (take care ye all. Have fun!)
[00:41] *** tuluu has quit IRC (Read error: Operation timed out)
[00:43] *** tuluu has joined #archiveteam-bs
[00:52] this is educational as a survey of dead image hosts
[00:52] imageftw.com, majhost.com, myfrogbag...
[00:57] I'll never get why people don't just use imgur
[00:58] Imgur sucks.
[00:58] powerKitt: myfrogbag is notable in that it accepted Flash uploads
[00:59] There was also swfbin
[00:59] oh right does anyone know how to rewrite old imageshack URLs to working ones
[00:59] take http://img823.imageshack.us/img823/8941/fa101.gif as an example
[00:59] it died and took several SWFs for the Homestuck ARG "SBARG" with it. The creator has disclosed to me that he no longer has the source FLAs or indeed any copies.
[01:01] riking: you can't.
[01:01] Imageshack have already deleted that image.
[01:01] ok, I'll just let the 'pull from wayback' do its thing then
[01:02] Imageshack is an incredibly awful image host, since they just delete old images without warning to free up space.
[01:02] it's getting a 60-80% hit rate
[01:02] https://meta.stackexchange.com/questions/263771/ban-imageshack-images-because-they-are-reusing-old-urls-for-advertising in fact, if this stackexchange report is to be believed, they actually used to replace old images with advertising.
[01:03] hoooo weee
[01:03] have you tried contacting the original authors of stories with missing images, by the way
[01:03] not yet, there's a couple I definitely know will be possible
[01:04] I've been keeping a notes file with completely failing stories https://github.com/riking/mspfa-archiver/blob/master/wip
[01:05] oh right I was thinking about contact as primary recovery for dropbox URLs
[01:06] the files almost certainly still exist! just not at public urls
[01:11] I do feel a little guilty for copying the metadata over straight https://archive.org/search.php?query=creator%3A%22a+crapload+of+people%22
[01:12] but I'm not sure what else I'm supposed to do when the "author" field is free text
[01:14] *** powerKitt has quit IRC (Remote host closed the connection)
[02:00] *** Pixi has quit IRC (Ping timeout: 255 seconds)
[02:01] *** Pixi has joined #archiveteam-bs
[02:49] Kaz: what kind of archivable site would #archivebot be good/bad for?
[02:49] Kaz: (sorry for not responding earlier, i got busy)
[03:22] *** ld1 has quit IRC (Quit: ld1)
[03:25] *** ld1 has joined #archiveteam-bs
[03:26] *** jacketcha has quit IRC (Read error: Operation timed out)
[03:36] *** dashcloud has joined #archiveteam-bs
[03:53] *** fie has quit IRC (Ping timeout: 252 seconds)
[04:07] *** fie has joined #archiveteam-bs
[04:10] *** qw3rty114 has joined #archiveteam-bs
[04:16] *** qw3rty113 has quit IRC (Read error: Operation timed out)
[04:44] *** zyphlar_ has joined #archiveteam-bs
[05:23] *** odemg has quit IRC (Read error: Connection reset by peer)
[05:35] *** odemg has joined #archiveteam-bs
[06:20] *** tuluu_ has joined #archiveteam-bs
[06:22] *** wacky has quit IRC (Read error: Operation timed out)
[06:22] *** ppsym has joined #archiveteam-bs
[06:25] *** decay_ has joined #archiveteam-bs
[06:27] *** HCross2_ has joined #archiveteam-bs
[06:28] *** tuluu has quit IRC (se.hub irc.underworld.no)
[06:28] *** i0npulse has quit IRC (se.hub irc.underworld.no)
[06:28] *** Jens has quit IRC (se.hub irc.underworld.no)
[06:28] *** purplebot has quit IRC (se.hub irc.underworld.no)
[06:28] *** PurpleSym has quit IRC (se.hub irc.underworld.no)
[06:28] *** medowar has quit IRC (se.hub irc.underworld.no)
[06:28] *** HCross2 has quit IRC (se.hub irc.underworld.no)
[06:28] *** decay has quit IRC (se.hub irc.underworld.no)
[06:30] *** HCross2_ is now known as HCross
[06:44] *** ppsym is now known as PurpleSym
[07:24] *** zyphlar_ has quit IRC (Quit: Connection closed for inactivity)
[07:45] *** wp494_ has joined #archiveteam-bs
[07:52] *** wp494 has quit IRC (Read error: Operation timed out)
[07:54] *** wp494 has joined #archiveteam-bs
[07:54] *** wp494_ has quit IRC (Ping timeout: 492 seconds)
[08:33] *** Mateon1 has quit IRC (Read error: Operation timed out)
[08:33] *** Mateon1 has joined #archiveteam-bs
[08:58] *** dashcloud has quit IRC (Read error: Operation timed out)
[09:01] *** dashcloud has joined #archiveteam-bs
[09:34] *** schbirid has joined #archiveteam-bs
[09:56] *** MrRadar has quit IRC (Read error: Operation timed out)
[10:55] we have hbo first look at major league 2
[11:04] *** BlueMax has quit IRC (Leaving)
[11:06] *** MrRadar has joined #archiveteam-bs
[11:33] *** i0npulse has joined #archiveteam-bs
[11:33] *** purplebot has joined #archiveteam-bs
[12:34] *** odemg has quit IRC (Read error: Operation timed out)
[12:44] *** odemg has joined #archiveteam-bs
[13:49] *** odemg has quit IRC (Read error: Operation timed out)
[14:16] *** odemg has joined #archiveteam-bs
[15:43] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[16:11] An update on Charlie Rose: about 1000 of the 5400 remaining videos are done in 41 hours (plus some idling time because the disk was full). So I expect it to finish approximately on Pi Day.
[16:31] *** tsr has quit IRC (Ping timeout: 244 seconds)
[16:33] *** BnARobin has quit IRC (Read error: Operation timed out)
[16:34] *** BnARobin has joined #archiveteam-bs
[16:36] blogged to seek advice on running an archive.org-a-like https://jmtd.net/log/self_hosted_archive.org/
[16:40] Jon: Ah, right, you asked about that some days ago. I came across https://github.com/Kickball/awesome-selfhosted yesterday, which has an "Archiving and Digital Preservation" section of FOSS for self-hosting. I didn't look at it in detail, but maybe there's something in there for you.
[16:41] *** tsr has joined #archiveteam-bs
[16:42] Jon: which aspects of archive.org are the important ones for you there?
[16:42] there is some software for museums/libraries but i cannot remember its name right now
[16:44] "artifacts" might have been a related term, dunno
[17:05] *** odemg has quit IRC (Read error: Operation timed out)
[17:06] *** schbirid has quit IRC (Quit: Leaving)
[17:13] *** odemg has joined #archiveteam-bs
[17:20] @jon: Have you seen https://github.com/internetarchive/wayback?
[17:22] All depends on if you want to store items (and then have those items derived), or store WARCs and represent them with a viewer.
[17:22] Blob storage is the easy part!
[17:24] bithippo: I believe he's looking for something similar to IA itself (items, collections, etc.), not the Wayback Machine.
[17:24] Sorry for the noise then!
[17:38] hmm. should I be using something that's not the wiki to track to-dos
[17:38] previously I was using a text file checked into git and that's not going to Scale ™️
[17:38] to, uh, "more than just me"
[17:40] I've been wondering for a while whether it would make sense to set up an AT issue tracker.
[17:41] But there would be a significant overlap with the wiki, and we probably want to avoid splitting the information across multiple platforms.
[17:42] eeeeuh
[17:52] Are there any issue trackers with date-based deadlines? I.e. “website X is going down next monday”.
[17:54] Yeah, many of them support a "due date" setting, I believe.
[17:55] I really like Github Issues for these sorts of things, but I totally understand if that's not the AT "way"
[17:55] I would be amenable to running a hosted Gitlab issue tracker, or paying annually for one if the amount was reasonable
[18:15] i switched to a spreadsheet for failed-pull tracking and it's definitely better than repeatedly editing the wiki page
[18:58] *** medowar has joined #archiveteam-bs
[19:12] *** BnARobin_ has joined #archiveteam-bs
[19:13] *** BnARobin has quit IRC (Read error: Operation timed out)
[19:17] *** balrog has quit IRC (Read error: Operation timed out)
[19:33] *** balrog has joined #archiveteam-bs
[19:33] *** svchfoo1 sets mode: +o balrog
[19:45] *** fie has quit IRC (Ping timeout: 360 seconds)
[19:56] *** fie has joined #archiveteam-bs
[20:25] *** jschwart has joined #archiveteam-bs
[20:36] *** Jens has joined #archiveteam-bs
[21:31] *** BlueMax has joined #archiveteam-bs
[21:34] *** Jusque has quit IRC (Ping timeout: 492 seconds)
[21:36] *** Jusque has joined #archiveteam-bs
[21:42] *** jschwart has quit IRC (Quit: Konversation terminated!)
[21:59] *** dashcloud has quit IRC (Read error: Operation timed out)
[22:02] *** dashcloud has joined #archiveteam-bs
[22:44] *** rsznik has quit IRC (Read error: Connection reset by peer)
[22:48] *** rsznik has joined #archiveteam-bs
[22:49] *** ndiddy has joined #archiveteam-bs
[22:50] *** ndiddy has quit IRC (Client Quit)
[22:50] *** ndiddy has joined #archiveteam-bs
[23:56] *** RichardG has quit IRC (Read error: Connection reset by peer)
[23:59] *** RichardG has joined #archiveteam-bs