#archiveteam-bs 2020-08-05,Wed

Time Nickname Message
00:12 mgrandi But yeah @JAA, some of those files are updated on a regular basis; one example I found via Twitter is the Azure IP ranges: https://www.microsoft.com/en-us/download/details.aspx?id=56519
00:12 mgrandi I assume the file that is associated with that is updated every so often
00:13 JAA mgrandi: Yep, I found a bunch of files that appear to be updated weekly.
00:15 JAA Sucks that they don't keep the old files around.
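
Since the old files are replaced in place, one way to keep every version is to store each fetch under a name that includes the fetch time and a content hash, and to skip fetches whose content has not changed. The following is only a minimal sketch of that idea: the function names and the example filename are hypothetical, and the actual download step is left out.

```python
import hashlib
from datetime import datetime, timezone


def snapshot_name(filename: str, payload: bytes, fetched_at: datetime) -> str:
    """Build a storage name from the UTC fetch time, a short SHA-256 prefix, and the filename."""
    digest = hashlib.sha256(payload).hexdigest()[:12]
    stamp = fetched_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{stamp}-{digest}-{filename}"


def is_new_version(payload: bytes, seen_digests: set) -> bool:
    """Return True (and remember the digest) only if this content was not seen before."""
    digest = hashlib.sha256(payload).hexdigest()
    if digest in seen_digests:
        return False
    seen_digests.add(digest)
    return True


if __name__ == "__main__":
    seen = set()
    body = b"example payload"  # stand-in for the downloaded file contents
    if is_new_version(body, seen):
        # "ranges.json" is a made-up filename used only for illustration.
        print(snapshot_name("ranges.json", body, datetime.now(timezone.utc)))
```

Run on a schedule that is shorter than the update interval (for weekly files, daily would do), this keeps one copy per distinct version.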
00:38 step has quit IRC (Remote host closed the connection)
01:14 BlueMax has joined #archiveteam-bs
01:33 synm0nger has quit IRC (Read error: Connection reset by peer)
01:33 SynMonger has joined #archiveteam-bs
02:33 Clefable has quit IRC (Quit: ZNC: the superior metal to CBLT)
03:29 Wingy has quit IRC (Read error: Operation timed out)
03:36 qw3rty__ has joined #archiveteam-bs
03:39 Wingy has joined #archiveteam-bs
03:44 qw3rty_ has quit IRC (Read error: Operation timed out)
04:02 Wingy has quit IRC (Read error: Operation timed out)
04:41 bsmith093 has quit IRC (Read error: Operation timed out)
04:44 bsmith093 has joined #archiveteam-bs
04:56 bsmith093 has quit IRC (Ping timeout: 745 seconds)
05:00 Wingy has joined #archiveteam-bs
05:03 bsmith093 has joined #archiveteam-bs
05:51 atphoenix has quit IRC (Ping timeout: 265 seconds)
08:26 fuzzy802 has joined #archiveteam-bs
08:32 fuzzy8021 has quit IRC (Read error: Operation timed out)
08:36 atphoenix has joined #archiveteam-bs
08:36 fuzzy802 is now known as fuzzy8021
09:01 legoktm has quit IRC (Ping timeout: 610 seconds)
09:04 legoktm has joined #archiveteam-bs
10:37 OrIdow6^2 has joined #archiveteam-bs
10:39 OrIdow6 has quit IRC (Ping timeout: 265 seconds)
11:22 HP_Archiv has joined #archiveteam-bs
11:33 OrIdow6^2 is now known as OrIdow6
11:56 HP_Archiv has quit IRC (Quit: Leaving)
12:01 HP_Archiv has joined #archiveteam-bs
12:08 HP_Archiv has quit IRC (Quit: Leaving)
12:42 coderobe has quit IRC (Quit: Ping timeout (120 seconds))
12:42 coderobe has joined #archiveteam-bs
14:30 BlueMax has quit IRC (Quit: Leaving)
14:31 JAA Gearogs, Discogs's audio equipment database, is shutting down on 2020-08-31. They will upload a data dump to IA themselves.
14:31 JAA https://support.discogslabs.com/hc/en-us/articles/360011681538-Gearogs-Closing-On-August-31-2020
14:34 JAA I'll take this as an opportunity to archive all of their dumps.
14:34 JAA They'll apparently only upload the last one anyway.
14:38 JAA The Discogs data archives total a bit under 500 GB. The other *ogs dumps total just over 2 GB. Cute.
14:40 JAA While I'm at it, I'll also attempt to make https://data.discogs.com/ and http://data.discogslabs.com/ work in the WBM.
14:42 arkiver JAA: yes :D
14:42 arkiver I agree
14:43 JAA :-)
14:44 JAA The Discogs dumps are already on IA in items, but there's virtually nothing in the WBM as far as I can see.
14:45 arkiver yeah
14:45 JAA Couldn't find the other *ogs dumps on IA.
14:45 arkiver and while dumps are better for handling the data
14:45 arkiver the Wayback Machine makes it easier to find for people without technical skills
14:45 JAA Yep
14:45 arkiver which is like 99.9%
14:45 JAA Well, the stuff I'm grabbing is still the dumps though.
14:46 arkiver those are probably best to get to a safe location first
14:46 JAA But there are links to those two pages I mentioned across the web, so if someone tries to find a file from there, they might only try the WBM, not the IA search (where you can't even search for filenames etc.).
14:46 arkiver are you saving the dumps in the Wayback Machine as well?
14:47 JAA Yeah
14:47 arkiver also yes to what you said
14:47 JAA Basically, the idea is to get https://data.discogs.com/ fully working in the WBM.
14:48 arkiver doesn't seem extremely big
14:48 arkiver can we put that in AB?
14:48 JAA Yep
14:48 JAA That's my plan.
14:50 JAA Assuming the WBM rewrites URLs that are JS strings like '//example.org', it should all work fine.
14:51 JAA And that does appear to be the case.
14:51 JAA Although the browsing doesn't work because it doesn't serve the S3 XHR data verbatim.
14:51 JAA See e.g. https://web.archive.org/web/20200420234832/https://data.discogs.com/ https://web.archive.org/web/20190418084744/http://discogs-data.s3-us-west-2.amazonaws.com/?delimiter=/&prefix=data/
14:51 JAA It inserts the WBM scripts etc. into the XML, which breaks parsing.
15:07 JAA Regarding archiving Gearogs the website itself: proper archival with previous revisions etc. requires logging in, as on all Discogs sites.
15:10 JAA The current revisions should probably work fine through AB, so I'll try that.
15:14 JAA arkiver: Could you forward the above to Mark or someone else from the WBM team? The WBM scripts shouldn't be inserted into XML data, only HTML/XHTML.
15:18 JAA Looks like AB doesn't pick up the full-size images. :-/
15:21 JAA Oof
15:21 JAA They also shut down Comicogs. It's already gone.
15:21 JAA Shut down on 31 July.
15:22 JAA "We will also be closing Gearogs, Filmogs, Bookogs, and Posterogs, but those will be closed about one month later while we make sure we haven't overlooked anything. VinylHub will remain open."
15:30 step has joined #archiveteam-bs
16:16 britmob has quit IRC (Ping timeout: 265 seconds)
16:26 Arcorann has quit IRC (Read error: Connection reset by peer)
17:05 Wingy has quit IRC (The Lounge - https://thelounge.chat)
17:11 Wingy has joined #archiveteam-bs
17:35 asdf0101 has quit IRC (Remote host closed the connection)
17:45 asdf0101 has joined #archiveteam-bs
18:13 britmob has joined #archiveteam-bs
18:25 britmob has quit IRC (Read error: Connection reset by peer)
18:32 britmob has joined #archiveteam-bs
18:50 ave_ has joined #archiveteam-bs
18:58 VerifiedJ has joined #archiveteam-bs
19:09 lunik14 has joined #archiveteam-bs
19:10 lunik1 has quit IRC (Ping timeout: 265 seconds)
19:10 lunik14 is now known as lunik1
19:22 DLoader_ has joined #archiveteam-bs
19:31 DLoader has quit IRC (Ping timeout: 745 seconds)
19:32 DLoader_ is now known as DLoader
19:39 VerifiedJ has quit IRC (Quit: Leaving)
19:54 Clefable has joined #archiveteam-bs
22:50 Arcorann has joined #archiveteam-bs
22:51 Arcorann has quit IRC (Remote host closed the connection)
22:52 Arcorann has joined #archiveteam-bs
23:39 BlueMax has joined #archiveteam-bs
