#archiveteam-bs 2018-07-12,Thu

Time Nickname Message
00:29 🔗 BlueMax has joined #archiveteam-bs
01:35 🔗 Muad-Dib has quit IRC (Ping timeout: 260 seconds)
01:41 🔗 Muad-Dib has joined #archiveteam-bs
01:42 🔗 svchfoo1 sets mode: +o Muad-Dib
02:02 🔗 Flashfire has quit IRC (Quit: Connection closed for inactivity)
03:01 🔗 Flashfire has joined #archiveteam-bs
03:18 🔗 wp494 has quit IRC (Ping timeout: 255 seconds)
03:18 🔗 wp494 has joined #archiveteam-bs
03:28 🔗 DFJustin has quit IRC (Quit: IMHOSTFU)
03:29 🔗 Arrhenius has joined #archiveteam-bs
03:29 🔗 Arrhenius evening
03:33 🔗 Arrhenius is there any word on the Tindeck progress?
03:33 🔗 DFJustin has joined #archiveteam-bs
03:33 🔗 swebb sets mode: +o DFJustin
03:52 🔗 odemg has quit IRC (Ping timeout: 268 seconds)
03:54 🔗 Arrhenius has quit IRC (Ping timeout: 633 seconds)
04:03 🔗 odemg has joined #archiveteam-bs
04:56 🔗 Mateon1 has quit IRC (Ping timeout: 252 seconds)
04:56 🔗 Mateon1 has joined #archiveteam-bs
05:11 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
05:12 🔗 RichardG has joined #archiveteam-bs
06:55 🔗 godane latest digitize tapes: https://www.patreon.com/posts/digitize-tapes-20019351
07:04 🔗 godane SketchCow: i may have hit gold
07:04 🔗 godane there is an episode of Computer Chronicles talking about copyright law from 1989
07:04 🔗 godane IA has the episode from 1985
07:05 🔗 godane so this is a different episode that i think is from june 1990
07:07 🔗 kiska has quit IRC (Read error: Operation timed out)
07:08 🔗 godane so based on one of the guys that re-encoded season 7 of computer chronicles, and on tvdb
07:08 🔗 godane it was S07E10 and it was not available
07:26 🔗 Atom-- has joined #archiveteam-bs
07:29 🔗 ta9le has joined #archiveteam-bs
07:30 🔗 Atom has quit IRC (Read error: Operation timed out)
07:55 🔗 Flashfir_ has joined #archiveteam-bs
07:55 🔗 Flashfir_ has quit IRC (Client Quit)
07:56 🔗 Flashfire has quit IRC ()
07:56 🔗 Flashfire has joined #archiveteam-bs
08:03 🔗 kiska has joined #archiveteam-bs
08:10 🔗 Stiletto has joined #archiveteam-bs
08:10 🔗 Flashfire has quit IRC (Quit: Bye)
08:12 🔗 Flashfire has joined #archiveteam-bs
08:12 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
08:34 🔗 ReimuHaku has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.)
08:46 🔗 ReimuHaku has joined #archiveteam-bs
09:10 🔗 Dimtree has quit IRC (Read error: Operation timed out)
09:33 🔗 Dimtree has joined #archiveteam-bs
10:00 🔗 Sk1d has joined #archiveteam-bs
10:02 🔗 BlueMax has quit IRC (Leaving)
10:29 🔗 SilSte has quit IRC (Read error: Operation timed out)
10:31 🔗 SilSte has joined #archiveteam-bs
10:33 🔗 ta9le Holy hell do I have a lot of coverdiscs and warez discs I'd like to upload
10:33 🔗 ta9le including a game that's literally nowhere to be found on the internets, as far as I'm aware
10:36 🔗 eientei95 What game?
10:37 🔗 kiska I am going to ask on Future of Forums discord if there are any more icyboards forum urls for us to ingest into archivebot
10:38 🔗 kiska Unless someone else has done so
10:40 🔗 ta9le eientei95, Kotobuki Panel DE Multi
10:40 🔗 ta9le Google only spews out, like, three mentions of that game, as far as I'm aware
10:40 🔗 ta9le I'll upload it on IA soon-ish, I just need to know whether the executable works with Windows 3.1
10:44 🔗 Sk1d has quit IRC (Ping timeout: 260 seconds)
10:46 🔗 Flashfire Just upload it lol
10:47 🔗 Flashfire I will find a work around
10:48 🔗 ta9le Except I'm busy uploading the compilation as of now
10:48 🔗 Flashfire Ok
10:48 🔗 ta9le it's 2 CDs, so it's gonna take a looooooong while
10:48 🔗 ta9le plus I'd need to add a bunch of scans soon enough
10:49 🔗 Sk1d has joined #archiveteam-bs
10:56 🔗 Flashfire has quit IRC (Quit: Bye)
10:58 🔗 ta9le ...okay, so it's just Win95 minimum
10:58 🔗 ta9le coast clear
11:35 🔗 JAA So the Runes of Magic forums are shutting down as someone mentioned a few days ago. It turns out there are several of these. So far, I've come across http://board.eu.runesofmagic.gameforge.com/, http://board.pl.runesofmagic.gameforge.com/, and http://board.us.runesofmagic.gameforge.com/. Can someone check if there are more of these please?
11:41 🔗 ta9le https://archive.org/details/PanelDeMulti
11:44 🔗 ta9le Oh yeah, and I also have a bunch of cover discs from UK magazines, with a few DVDs and a few audio CDs
11:57 🔗 godane ta9le: what do you have for cover discs?
12:00 🔗 ta9le They're mostly Russian and Ukrainian, but I've also got a few from The Sun, The Star and The Mirror newspapers
12:00 🔗 ta9le one Czech coverdisc, too
12:01 🔗 ta9le also, a couple of coverdiscs from a special edition of the magazine that otherwise has full coverage on IA
12:05 🔗 ta9le so I'm pretty sure I've got *some* exclusive stuff to share
12:10 🔗 Aoede JAA: unlikely that there are any more boards. the new forum lists EN, DE, FR, PL and ES
12:17 🔗 jut has quit IRC (Leaving)
12:19 🔗 jut has joined #archiveteam-bs
12:26 🔗 eientei95 JAA: There's http://board.it.runesofmagic.gameforge.com/ which is just a redirect
12:28 🔗 Aoede ...and the eu board crashed
12:29 🔗 eientei95 Yup
12:29 🔗 JAA Interesting, the ArchiveBot job is still getting 200s.
12:30 🔗 JAA Yeah, I came across the Italian link somewhere as well I think.
12:30 🔗 eientei95 Yeah, because the IA job has a valid session in their MySQL db it seems
12:30 🔗 eientei95 Any new connections fail
12:30 🔗 JAA au and de subdomains also exist, but they don't have a board subdomain.
12:31 🔗 JAA Ah yeah, "The table 'wcf1_session' is full"
12:31 🔗 Aoede au is included in us forum, de in eu forum
12:31 🔗 JAA Yep
12:32 🔗 JAA I wonder if the ArchiveBot job filled the session table.
12:33 🔗 eientei95 JAA: Checked both current ISO 3166-1 alpha-2 codes as well as ISO 639-1 codes and no new board links
12:33 🔗 JAA eientei95: Sweet, thanks.
12:33 🔗 eientei95 np
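The subdomain check described above can be sketched roughly as follows. Both ISO 3166-1 alpha-2 and ISO 639-1 codes are two lowercase letters, so enumerating every two-letter string covers both sets. This is only an illustration, not the script that was actually run; the function names are made up here, and each candidate would still need a DNS or HTTP check.

```python
import string
from itertools import product


def all_two_letter_codes():
    """Every lowercase two-letter string: a superset of ISO 3166-1 alpha-2
    country codes and ISO 639-1 language codes."""
    return ["".join(pair) for pair in product(string.ascii_lowercase, repeat=2)]


def candidate_board_urls(codes):
    """Build board URLs following the pattern seen in the channel
    (board.<code>.runesofmagic.gameforge.com). Hypothetical helper."""
    return [f"http://board.{code}.runesofmagic.gameforge.com/" for code in codes]


if __name__ == "__main__":
    urls = candidate_board_urls(all_two_letter_codes())
    print(len(urls), "candidate URLs, e.g.", urls[0])
```

Trying all 676 combinations instead of the official code lists costs a few hundred extra lookups but avoids depending on an up-to-date copy of either standard.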
12:33 🔗 ta9le Aite, I'll fill it in a bit later https://archive.org/details/Legan400Classics
12:33 🔗 eientei95 Found out that Network Manager added an invalid IPv6 entry into resolv.conf so that dig commands don't work >.>
13:03 🔗 ta9le ...oh, okay, I've just realized I put my uploads in Community Texts and apparently I can't change the category at all
13:04 🔗 ta9le what do
13:04 🔗 JAA ta9le: Email Jason or info@archive.org to have it moved to the right collection.
13:04 🔗 ta9le Oof, alright
13:04 🔗 ta9le thanks
13:07 🔗 JAA Yeah, it's a known bug/missing feature that uploaders can't move items between collections they have upload rights to.
13:45 🔗 robogoat_ has quit IRC (Read error: Operation timed out)
13:47 🔗 robogoat has joined #archiveteam-bs
14:33 🔗 Silasqwer has joined #archiveteam-bs
14:38 🔗 Silasqwer has quit IRC (Quit: http://chat.efnet.org (Ping timeout))
14:49 🔗 godane so the current tape i'm doing is another 8 hour tape
14:50 🔗 godane i think from 1992-01 cause there was an ad for hot shots coming out jan 30th
14:50 🔗 godane since hot shots came out in july 1991 i figure home release came out in jan 1992
15:29 🔗 sep332 has joined #archiveteam-bs
16:07 🔗 djsundog has joined #archiveteam-bs
16:30 🔗 JAA The ArchiveBot jobs for the EU and US/AU Runes of Magic forums seem to have been banned.
16:31 🔗 JAA Presumably related to the session issue we saw four hours ago.
16:33 🔗 SilSte has quit IRC (Read error: Operation timed out)
16:34 🔗 SilSte has joined #archiveteam-bs
16:55 🔗 kiska JAA so are all the pipelines banned?
16:56 🔗 JAA kiska: Haven't checked.
16:57 🔗 kiska We could juggle pipelines until they are all banned
16:57 🔗 JAA Well, restarting the job over and over isn't going to get us much additional content though.
16:58 🔗 JAA We can't migrate a job between pipelines.
16:58 🔗 kiska Hrm interesting
16:58 🔗 jut make it a warrior job?
16:58 🔗 JAA That would probably work for the EU forums, but not for the US/AU ones.
16:59 🔗 JAA The former uses simple canonical URLs index.php?page=Thread&threadID=1 etc.
16:59 🔗 JAA The US/AU forums (and the PL ones, which we grabbed entirely already) use some /board1-subforum/board3-subsubforum/1234-threadslug/ URL which isn't really predictable.
17:00 🔗 JAA Also, we'll likely trigger the same issues again if we use the warrior. Their forum software just sucks.
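Why the EU forum's URLs suit a distributed job can be shown with a minimal sketch: with the index.php?page=Thread&threadID=N pattern quoted above, a work list is just a range of integers. The base URL comes from the log; the function name and ID range are illustrative assumptions, and the real upper bound is not stated anywhere in this channel.

```python
def eu_thread_urls(base, first_id, last_id):
    """Yield thread URLs for every ID in [first_id, last_id], using the
    canonical pattern mentioned in the channel. Hypothetical helper."""
    for thread_id in range(first_id, last_id + 1):
        yield f"{base}index.php?page=Thread&threadID={thread_id}"


if __name__ == "__main__":
    base = "http://board.eu.runesofmagic.gameforge.com/"
    # The range 1..5 is arbitrary and only for demonstration.
    for url in eu_thread_urls(base, 1, 5):
        print(url)
```

The slug-based US/AU and PL URLs cannot be generated this way, because the board and thread slugs are not derivable from the numeric ID alone.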
17:00 🔗 kiska We have <24 hrs I presume before they disappear
17:00 🔗 JAA Yeah
17:01 🔗 kiska We can always run a discovery for a few hours before switching to grabbing them?
17:02 🔗 kiska I just had a look and oh my........
17:22 🔗 schbirid has joined #archiveteam-bs
17:28 🔗 Igloo has quit IRC (west.us.hub irc.Prison.NET)
17:28 🔗 achip has quit IRC (west.us.hub irc.Prison.NET)
17:30 🔗 Igloo_ has joined #archiveteam-bs
17:59 🔗 achip has joined #archiveteam-bs
18:28 🔗 jschwart has joined #archiveteam-bs
19:00 🔗 jdude104 has joined #archiveteam-bs
19:05 🔗 jdude104 has quit IRC (Client Quit)
19:24 🔗 K4k_ has joined #archiveteam-bs
19:25 🔗 JonimusP has joined #archiveteam-bs
19:25 🔗 swebb sets mode: +o JonimusP
19:27 🔗 antomati_ has joined #archiveteam-bs
19:27 🔗 swebb sets mode: +o antomati_
19:27 🔗 wacky_ has joined #archiveteam-bs
19:28 🔗 decay_ has joined #archiveteam-bs
19:28 🔗 tuluu_ has joined #archiveteam-bs
19:29 🔗 espes___ has joined #archiveteam-bs
19:29 🔗 espes__ has quit IRC (Ping timeout: 252 seconds)
19:30 🔗 RichardG_ has joined #archiveteam-bs
19:30 🔗 RichardG has quit IRC (se.hub irc.underworld.no)
19:30 🔗 Mateon1 has quit IRC (se.hub irc.underworld.no)
19:30 🔗 wacky has quit IRC (se.hub irc.underworld.no)
19:30 🔗 SketchCow has quit IRC (se.hub irc.underworld.no)
19:30 🔗 fenn has quit IRC (se.hub irc.underworld.no)
19:30 🔗 Jens has quit IRC (se.hub irc.underworld.no)
19:30 🔗 PurpleSym has quit IRC (se.hub irc.underworld.no)
19:30 🔗 Jonimoose has quit IRC (se.hub irc.underworld.no)
19:30 🔗 tuluu has quit IRC (se.hub irc.underworld.no)
19:30 🔗 Kenshin has quit IRC (se.hub irc.underworld.no)
19:30 🔗 K4k has quit IRC (se.hub irc.underworld.no)
19:30 🔗 medowar has quit IRC (se.hub irc.underworld.no)
19:30 🔗 i0npulse has quit IRC (se.hub irc.underworld.no)
19:30 🔗 hook54321 has quit IRC (se.hub irc.underworld.no)
19:30 🔗 antomatic has quit IRC (se.hub irc.underworld.no)
19:30 🔗 Aoede has quit IRC (se.hub irc.underworld.no)
19:30 🔗 Rai-chan has quit IRC (se.hub irc.underworld.no)
19:30 🔗 decay__ has quit IRC (se.hub irc.underworld.no)
19:30 🔗 RKenshin has joined #archiveteam-bs
19:36 🔗 SketchCo1 has joined #archiveteam-bs
19:36 🔗 swebb sets mode: +o SketchCo1
19:39 🔗 ppsym has joined #archiveteam-bs
19:45 🔗 Mateon1 has joined #archiveteam-bs
19:45 🔗 ppsym is now known as PurpleSym
19:45 🔗 RKenshin is now known as Kenshin
19:50 🔗 DrasticAc So I've been looking at the 500px WARCs to see what was grabbed.
19:51 🔗 DrasticAc Looking at this one (https://archive.org/details/archiveteam_500px_20180630213732), there were 401950 HTML files and 58762 images.
19:53 🔗 DrasticAc https://usercontent.irccloud-cdn.com/file/D4ejbjXx/20180630213732_image_urls.txt
19:55 🔗 fenn has joined #archiveteam-bs
19:55 🔗 Aoede has joined #archiveteam-bs
19:55 🔗 Rai-chan has joined #archiveteam-bs
19:55 🔗 DrasticAc These URLs do appear in the wayback machine. However, they represent around 5300~ Pictures, as each URL can be the same picture with different resolutions.
19:55 🔗 hook54321 has joined #archiveteam-bs
19:56 🔗 DrasticAc And there were 402000~ photo html pages grabbed (https://500px.com/photo/#)
19:57 🔗 DrasticAc So there are tons of photos that were not grabbed and are not on wayback.
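A rough sketch of how a list of image URLs could be collapsed into distinct photos, as in the count above. It assumes each image URL contains a /photo/<numeric id>/ path segment, which is not confirmed in the log; the regex, function name, and sample URLs are illustrative only.

```python
import re
from collections import defaultdict

# Assumption: the photo ID is the numeric path segment right after "/photo/".
PHOTO_ID = re.compile(r"/photo/(\d+)/")


def group_by_photo(urls):
    """Map photo ID -> list of URLs (one per resolution) for that photo.
    URLs without a recognisable ID are skipped."""
    groups = defaultdict(list)
    for url in urls:
        match = PHOTO_ID.search(url)
        if match:
            groups[match.group(1)].append(url)
    return dict(groups)


if __name__ == "__main__":
    # Made-up URLs on a reserved test domain, not real 500px links.
    sample = [
        "https://cdn.example.invalid/photo/111/small/v2",
        "https://cdn.example.invalid/photo/111/large/v2",
        "https://cdn.example.invalid/photo/222/small/v2",
    ]
    groups = group_by_photo(sample)
    print(len(sample), "URLs ->", len(groups), "distinct photos")
```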
19:58 🔗 Jens has joined #archiveteam-bs
19:58 🔗 DrasticAc Now, I've parsed those HTML pages to grab the JSON payload that's in them, which contains all of the information on the photo that would show on the website, including the links. And the images that were missed are still up on the 500px site.
19:59 🔗 i0npulse has joined #archiveteam-bs
20:00 🔗 DrasticAc However, doing a HEAD on the regular 500px url (https://500px.com/photo/99944723/rushing-by-jon-hawton) it doesn't seem like any of them have been taken down.
20:00 🔗 arkiver So
20:01 🔗 arkiver Somewhere during the project we went through all possible IDs for a short time. For these IDs the photos would only be archived if the webpage showed it is a CC photo.
20:01 🔗 arkiver Later on we stopped this when we had a complete list of CC photo webpage URLs, with which we continues and for which all photos have been archived.
20:01 🔗 arkiver continued*
20:01 🔗 DrasticAc Ahhh, okay.
20:02 🔗 arkiver So the HTML pages without archived photos would be pages that didn't contain a CC photo.
20:02 🔗 DrasticAc That's what I was gonna ask, because i was gonna say, that's what they were.
20:02 🔗 DrasticAc So I wasn't sure if that was intended.
20:03 🔗 arkiver Can we move this to #500pieces btw, the channel for 500px
20:04 🔗 DrasticAc Sure
20:35 🔗 phuzion has quit IRC (Read error: Operation timed out)
20:37 🔗 phuzion has joined #archiveteam-bs
20:44 🔗 Lord_Nigh has quit IRC (Read error: Operation timed out)
20:59 🔗 schbirid has quit IRC (Remote host closed the connection)
21:04 🔗 Lord_Nigh has joined #archiveteam-bs
21:41 🔗 BlueMax has joined #archiveteam-bs
22:04 🔗 JAA I've started a manual wpull for the EU and US/AU Runes of Magic forums. Let's hope I don't get banned.
22:04 🔗 JAA I'm only grabbing the forums and threads, no user profile pages or other stuff.
22:11 🔗 jschwart has quit IRC (Quit: Konversation terminated!)
22:13 🔗 Lord_Nigh has quit IRC (Read error: Operation timed out)
22:15 🔗 Lord_Nigh has joined #archiveteam-bs
22:18 🔗 Lord_Nigh has quit IRC (Read error: Operation timed out)
22:33 🔗 BlueMax has quit IRC (Leaving)
22:54 🔗 Lord_Nigh has joined #archiveteam-bs
23:20 🔗 godane has quit IRC (Read error: Operation timed out)
23:43 🔗 godane has joined #archiveteam-bs
23:43 🔗 godane has quit IRC (Client Quit)
23:45 🔗 godane has joined #archiveteam-bs
23:49 🔗 godane SketchCo1: i'm starting to upload vhs captures again
23:49 🔗 godane i don't know if you got my last message before pidgin disconnected
23:49 🔗 godane anyways just wait a week before uploading again