[00:29] *** BlueMax has joined #archiveteam-bs
[01:35] *** Muad-Dib has quit IRC (Ping timeout: 260 seconds)
[01:41] *** Muad-Dib has joined #archiveteam-bs
[01:42] *** svchfoo1 sets mode: +o Muad-Dib
[02:02] *** Flashfire has quit IRC (Quit: Connection closed for inactivity)
[03:01] *** Flashfire has joined #archiveteam-bs
[03:18] *** wp494 has quit IRC (Ping timeout: 255 seconds)
[03:18] *** wp494 has joined #archiveteam-bs
[03:28] *** DFJustin has quit IRC (Quit: IMHOSTFU)
[03:29] *** Arrhenius has joined #archiveteam-bs
[03:29] evening
[03:33] is there any word on the Tindeck progress?
[03:33] *** DFJustin has joined #archiveteam-bs
[03:33] *** swebb sets mode: +o DFJustin
[03:52] *** odemg has quit IRC (Ping timeout: 268 seconds)
[03:54] *** Arrhenius has quit IRC (Ping timeout: 633 seconds)
[04:03] *** odemg has joined #archiveteam-bs
[04:56] *** Mateon1 has quit IRC (Ping timeout: 252 seconds)
[04:56] *** Mateon1 has joined #archiveteam-bs
[05:11] *** RichardG has quit IRC (Read error: Connection reset by peer)
[05:12] *** RichardG has joined #archiveteam-bs
[06:55] latest digitize tapes: https://www.patreon.com/posts/digitize-tapes-20019351
[07:04] SketchCow: i may have hit gold
[07:04] there is an episode of Computer Chronicles talking about copyright law from 1989
[07:04] IA has the episode from 1985
[07:05] so this is a different episode that i think is from june 1990
[07:07] *** kiska has quit IRC (Read error: Operation timed out)
[07:08] so based on one of the guys that re-encoding of season 7 of computer chronicles and tvdb
[07:08] it was S07E10 and it was not available
[07:26] *** Atom-- has joined #archiveteam-bs
[07:29] *** ta9le has joined #archiveteam-bs
[07:30] *** Atom has quit IRC (Read error: Operation timed out)
[07:55] *** Flashfir_ has joined #archiveteam-bs
[07:55] *** Flashfir_ has quit IRC (Client Quit)
[07:56] *** Flashfire has quit IRC ()
[07:56] *** Flashfire has joined #archiveteam-bs
[08:03] *** kiska has joined #archiveteam-bs
[08:10] *** Stiletto has joined #archiveteam-bs
[08:10] *** Flashfire has quit IRC (Quit: Bye)
[08:12] *** Flashfire has joined #archiveteam-bs
[08:12] *** Stilett0 has quit IRC (Read error: Operation timed out)
[08:34] *** ReimuHaku has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.)
[08:46] *** ReimuHaku has joined #archiveteam-bs
[09:10] *** Dimtree has quit IRC (Read error: Operation timed out)
[09:33] *** Dimtree has joined #archiveteam-bs
[10:00] *** Sk1d has joined #archiveteam-bs
[10:02] *** BlueMax has quit IRC (Leaving)
[10:29] *** SilSte has quit IRC (Read error: Operation timed out)
[10:31] *** SilSte has joined #archiveteam-bs
[10:33] Holy hell do I have a lot of coverdiscs and warez discs I'd like to upload
[10:33] including a game that's literally nowhere to be found on the internets, as far as I'm aware
[10:36] What game?
[10:37] I am going to ask on Future of Forums discord if there are any more icyboards forum urls for us to ingest into archivebot
[10:38] Unless someone else has done so
[10:40] eientei95, Kotobuki Panel DE Multi
[10:40] Google only spews out, like, three mentions of that game, as far as I'm aware
[10:40] I'll upload it on IA soon-ish, I just need to know whether the executable works with Windows 3.1
[10:44] *** Sk1d has quit IRC (Ping timeout: 260 seconds)
[10:46] Just upload it lol
[10:47] I will find a workaround
[10:48] Except I'm busy uploading the compilation as of now
[10:48] Ok
[10:48] it's 2 CDs, so it's gonna take a looooooong while
[10:48] plus I'd need to add a bunch of scans soon enough
[10:49] *** Sk1d has joined #archiveteam-bs
[10:56] *** Flashfire has quit IRC (Quit: Bye)
[10:58] ...okay, so it's just Win95 minimum
[10:58] coast clear
[11:35] So the Runes of Magic forums are shutting down as someone mentioned a few days ago. It turns out there are several of these.
So far, I've come across http://board.eu.runesofmagic.gameforge.com/, http://board.pl.runesofmagic.gameforge.com/, and http://board.us.runesofmagic.gameforge.com/. Can someone check if there are more of these please?
[11:41] https://archive.org/details/PanelDeMulti
[11:44] Oh yeah, and I also have a bunch of cover discs from UK magazines, with a few DVDs and a few audio CDs
[11:57] ta9le: what do you have for cover discs?
[12:00] They're mostly Russian and Ukrainian, but I've also got a few from The Sun, The Star and The Mirror newspapers
[12:00] one Czech coverdisc, too
[12:01] also, a couple of coverdiscs from a special edition of the magazine that otherwise has full coverage on IA
[12:05] so I'm pretty sure I've got *some* exclusive stuff to share
[12:10] JAA: unlikely that there are any more boards. the new forum lists EN, DE, FR, PL and ES
[12:17] *** jut has quit IRC (Leaving)
[12:19] *** jut has joined #archiveteam-bs
[12:26] JAA: There's http://board.it.runesofmagic.gameforge.com/ which is just a redirect
[12:28] ...and the eu board crashed
[12:29] Yup
[12:29] Interesting, the ArchiveBot job is still getting 200s.
[12:30] Yeah, I came across the Italian link somewhere as well I think.
[12:30] Yeah, because the IA job has a valid session in their MySQL db it seems
[12:30] Any new connections fail
[12:30] au and de subdomains also exist, but they don't have a board subdomain.
[12:31] Ah yeah, "The table 'wcf1_session' is full"
[12:31] au is included in us forum, de in eu forum
[12:31] Yep
[12:32] I wonder if the ArchiveBot job filled the session table.
[12:33] JAA: Checked both current ISO 3166-1 alpha-2 codes as well as ISO 639-1 codes and no new board links
[12:33] eientei95: Sweet, thanks.
[12:33] np
[12:33] Aite, I'll fill it in a bit later https://archive.org/details/Legan400Classics\
[12:33] Found out that Network Manager added an invalid IPv6 entry into resolv.conf so that dig commands don't work >.>
[13:03] ...oh, okay, I've just realized I put my uploads in Community Texts and apparently I can't change the category at all
[13:04] what do
[13:04] ta9le: Email Jason or info@archive.org to have it moved to the right collection.
[13:04] Oof, alright
[13:04] thanks
[13:07] Yeah, it's a known bug/missing feature that uploaders can't move items between collections they have upload rights to.
[13:45] *** robogoat_ has quit IRC (Read error: Operation timed out)
[13:47] *** robogoat has joined #archiveteam-bs
[14:33] *** Silasqwer has joined #archiveteam-bs
[14:38] *** Silasqwer has quit IRC (Quit: http://chat.efnet.org (Ping timeout))
[14:49] so the current tape i'm doing is another 8 hour tape
[14:50] i think from 1992-01 cause there was an ad for hot shots coming out jan 30th
[14:50] since hot shots came out in july 1991 i figure home release came out in jan 1992
[15:29] *** sep332 has joined #archiveteam-bs
[16:07] *** djsundog has joined #archiveteam-bs
[16:30] The ArchiveBot jobs for the EU and US/AU Runes of Magic forums seem to have been banned.
[16:31] Presumably related to the session issue we saw four hours ago.
[16:33] *** SilSte has quit IRC (Read error: Operation timed out)
[16:34] *** SilSte has joined #archiveteam-bs
[16:55] JAA so are all the pipelines banned?
[16:56] kiska: Haven't checked.
[16:57] We could juggle pipelines until they are all banned
[16:57] Well, restarting the job over and over isn't going to get us much additional content though.
[16:58] We can't migrate a job between pipelines.
[16:58] Hrm interesting
[16:58] make it a warrior job?
[16:58] That would probably work for the EU forums, but not for the US/AU ones.
[16:59] The former uses simple canonical URLs index.php?page=Thread&threadID=1 etc.
[16:59] The US/AU forums (and the PL ones, which we grabbed entirely already) use some /board1-subforum/board3-subsubforum/1234-threadslug/ URL scheme which isn't really predictable.
[17:00] Also, we'll likely trigger the same issues again if we use the warrior. Their forum software just sucks.
[17:00] We have <24 hrs I presume before they disappear
[17:00] Yeah
[17:01] We can always run a discovery for a few hours before switching to grabbing them?
[17:02] I just had a look and oh my........
[17:22] *** schbirid has joined #archiveteam-bs
[17:28] *** Igloo has quit IRC (west.us.hub irc.Prison.NET)
[17:28] *** achip has quit IRC (west.us.hub irc.Prison.NET)
[17:30] *** Igloo_ has joined #archiveteam-bs
[17:59] *** achip has joined #archiveteam-bs
[18:28] *** jschwart has joined #archiveteam-bs
[19:00] *** jdude104 has joined #archiveteam-bs
[19:05] *** jdude104 has quit IRC (Client Quit)
[19:24] *** K4k_ has joined #archiveteam-bs
[19:25] *** JonimusP has joined #archiveteam-bs
[19:25] *** swebb sets mode: +o JonimusP
[19:27] *** antomati_ has joined #archiveteam-bs
[19:27] *** swebb sets mode: +o antomati_
[19:27] *** wacky_ has joined #archiveteam-bs
[19:28] *** decay_ has joined #archiveteam-bs
[19:28] *** tuluu_ has joined #archiveteam-bs
[19:29] *** espes___ has joined #archiveteam-bs
[19:29] *** espes__ has quit IRC (Ping timeout: 252 seconds)
[19:30] *** RichardG_ has joined #archiveteam-bs
[19:30] *** RichardG has quit IRC (se.hub irc.underworld.no)
[19:30] *** Mateon1 has quit IRC (se.hub irc.underworld.no)
[19:30] *** wacky has quit IRC (se.hub irc.underworld.no)
[19:30] *** SketchCow has quit IRC (se.hub irc.underworld.no)
[19:30] *** fenn has quit IRC (se.hub irc.underworld.no)
[19:30] *** Jens has quit IRC (se.hub irc.underworld.no)
[19:30] *** PurpleSym has quit IRC (se.hub irc.underworld.no)
[19:30] *** Jonimoose has quit IRC (se.hub irc.underworld.no)
[19:30] *** tuluu has quit IRC (se.hub irc.underworld.no)
[19:30] *** Kenshin has quit IRC (se.hub
irc.underworld.no)
[19:30] *** K4k has quit IRC (se.hub irc.underworld.no)
[19:30] *** medowar has quit IRC (se.hub irc.underworld.no)
[19:30] *** i0npulse has quit IRC (se.hub irc.underworld.no)
[19:30] *** hook54321 has quit IRC (se.hub irc.underworld.no)
[19:30] *** antomatic has quit IRC (se.hub irc.underworld.no)
[19:30] *** Aoede has quit IRC (se.hub irc.underworld.no)
[19:30] *** Rai-chan has quit IRC (se.hub irc.underworld.no)
[19:30] *** decay__ has quit IRC (se.hub irc.underworld.no)
[19:30] *** RKenshin has joined #archiveteam-bs
[19:36] *** SketchCo1 has joined #archiveteam-bs
[19:36] *** swebb sets mode: +o SketchCo1
[19:39] *** ppsym has joined #archiveteam-bs
[19:45] *** Mateon1 has joined #archiveteam-bs
[19:45] *** ppsym is now known as PurpleSym
[19:45] *** RKenshin is now known as Kenshin
[19:50] So I've been looking at the 500px WARCs to see what was grabbed.
[19:51] Looking at this one (https://archive.org/details/archiveteam_500px_20180630213732), there were 401950 HTML files and 58762 images.
[19:53] https://usercontent.irccloud-cdn.com/file/D4ejbjXx/20180630213732_image_urls.txt
[19:55] *** fenn has joined #archiveteam-bs
[19:55] *** Aoede has joined #archiveteam-bs
[19:55] *** Rai-chan has joined #archiveteam-bs
[19:55] These URLs do appear in the wayback machine. However, they represent around ~5300 pictures, as each URL can be the same picture with different resolutions.
[19:55] *** hook54321 has joined #archiveteam-bs
[19:56] And there were ~402000 photo html pages grabbed (https://500px.com/photo/#)
[19:57] So there are tons of photos that were not grabbed and are not on wayback.
[19:58] *** Jens has joined #archiveteam-bs
[19:58] Now, I've parsed those HTML pages to grab the JSON payload that's in them, which contains all of the information on the photo that would show on the website, including the links. And the images that were missed are still up on the 500px site.
[19:59] *** i0npulse has joined #archiveteam-bs
[20:00] However, doing a HEAD on the regular 500px url (https://500px.com/photo/99944723/rushing-by-jon-hawton) it doesn't seem like any of them have been taken down.
[20:00] So
[20:01] Somewhere during the project we went through all possible IDs for a short time. For these IDs the photos would only be archived if the webpage showed it is a CC photo.
[20:01] Later on we stopped this when we had a complete list of CC photo webpage URLs, with which we continues and for which all photos have been archived.
[20:01] continued*
[20:01] Ahhh, okay.
[20:02] So the HTML pages without archived photos would be pages that didn't contain a CC photo.
[20:02] That's what I was gonna ask, because i was gonna say, that's what they were.
[20:02] So I wasn't sure if that was intended.
[20:03] Can we move this to #500pieces btw, the channel for 500px
[20:04] Sure
[20:35] *** phuzion has quit IRC (Read error: Operation timed out)
[20:37] *** phuzion has joined #archiveteam-bs
[20:44] *** Lord_Nigh has quit IRC (Read error: Operation timed out)
[20:59] *** schbirid has quit IRC (Remote host closed the connection)
[21:04] *** Lord_Nigh has joined #archiveteam-bs
[21:41] *** BlueMax has joined #archiveteam-bs
[22:04] I've started a manual wpull for the EU and US/AU Runes of Magic forums. Let's hope I don't get banned.
[22:04] I'm only grabbing the forums and threads, no user profile pages or other stuff.
[22:11] *** jschwart has quit IRC (Quit: Konversation terminated!)
[22:13] *** Lord_Nigh has quit IRC (Read error: Operation timed out)
[22:15] *** Lord_Nigh has joined #archiveteam-bs
[22:18] *** Lord_Nigh has quit IRC (Read error: Operation timed out)
[22:33] *** BlueMax has quit IRC (Leaving)
[22:54] *** Lord_Nigh has joined #archiveteam-bs
[23:20] *** godane has quit IRC (Read error: Operation timed out)
[23:43] *** godane has joined #archiveteam-bs
[23:43] *** godane has quit IRC (Client Quit)
[23:45] *** godane has joined #archiveteam-bs
[23:49] SketchCo1: i'm starting to upload vhs captures again
[23:49] i don't know if you got my last message before pidgin disconnected
[23:49] anyways just wait a week before uploading again