[00:04] *** benjins has joined #archiveteam-bs
[00:58] OrIdow6: You asked for the Microsoft Download Center pages. Sorry, took a bit longer, but here they are: https://archive.org/details/microsoft_download_center_pages_20200804
[01:03] Oh wait, mixed up some files, more coming.
[01:36] *** wyatt8750 has joined #archiveteam-bs
[01:36] *** wyatt8740 has quit IRC (Read error: Operation timed out)
[01:43] OrIdow6: https://archive.org/download/microsoft_download_center_pages_202008
[02:10] *** wyatt8750 has quit IRC (Read error: Operation timed out)
[02:12] *** wyatt8740 has joined #archiveteam-bs
[02:17] JAA: Thanks
[03:07] *** lennier2 has joined #archiveteam-bs
[03:08] *** qw3rty_ has joined #archiveteam-bs
[03:09] *** lennier1 has quit IRC (Read error: Operation timed out)
[03:09] *** lennier2 is now known as lennier1
[03:12] *** Ctrl has quit IRC (Read error: Operation timed out)
[03:13] *** SJon_____ has quit IRC (Read error: Operation timed out)
[03:14] *** SJon_____ has joined #archiveteam-bs
[03:15] *** qw3rty__ has quit IRC (Read error: Operation timed out)
[03:16] *** bsmith093 has quit IRC (Read error: Operation timed out)
[03:18] *** bsmith093 has joined #archiveteam-bs
[03:19] *** klg has joined #archiveteam-bs
[03:20] *** jodizzle_ has joined #archiveteam-bs
[03:27] *** klg_ has quit IRC (Read error: Operation timed out)
[03:28] *** jodizzle has quit IRC (Read error: Operation timed out)
[03:28] *** jodizzle_ is now known as jodizzle
[04:33] *** Jonboy345 has quit IRC (Read error: Operation timed out)
[04:35] *** mtntmnky_ has quit IRC (Remote host closed the connection)
[04:35] *** mtntmnky_ has joined #archiveteam-bs
[04:37] *** lennier2 has joined #archiveteam-bs
[04:41] *** lennier1 has quit IRC (Ping timeout: 272 seconds)
[04:41] *** lennier2 is now known as lennier1
[04:48] *** mtntmnky_ has quit IRC (Remote host closed the connection)
[04:49] *** mtntmnky_ has joined #archiveteam-bs
[06:27] *** Ctrl has joined #archiveteam-bs
[06:51] *** HP_Archiv has joined #archiveteam-bs
[06:51] *** britmob has quit IRC (Read error: Operation timed out)
[06:52] *** britmob has joined #archiveteam-bs
[06:52] *** britmob4 has joined #archiveteam-bs
[06:55] *** betamax has quit IRC (Read error: Operation timed out)
[06:55] *** britm0b has quit IRC (Read error: Operation timed out)
[06:57] *** betamax has joined #archiveteam-bs
[07:01] *** betamax has quit IRC (Read error: Operation timed out)
[07:01] *** betamax has joined #archiveteam-bs
[07:20] *** HP_Archiv has quit IRC (Read error: Connection reset by peer)
[07:57] *** MaximeleG has joined #archiveteam-bs
[08:50] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[09:02] *** wyatt8740 has quit IRC (Ping timeout: 260 seconds)
[09:02] *** wyatt8740 has joined #archiveteam-bs
[11:03] *** MaximeleG has quit IRC (Quit: MaximeleG)
[11:07] *** MaximeleG has joined #archiveteam-bs
[11:29] *** jshoard has joined #archiveteam-bs
[12:16] *** dashcloud has quit IRC (Read error: Operation timed out)
[14:03] *** britmob_ has joined #archiveteam-bs
[14:09] *** britmob has quit IRC (Read error: Operation timed out)
[14:09] *** britmob4 is now known as britmob
[16:07] *** Arcorann has quit IRC (Read error: Connection reset by peer)
[16:22] I am PRETTY sure this is the case already but we should probably make all new project channels go to hackint.
[16:22] But explicitly mark so in announcements and wiki
[16:50] *** dashcloud has joined #archiveteam-bs
[17:00] *** britmob_ has quit IRC (Read error: Connection reset by peer)
[17:05] *** britmob_ has joined #archiveteam-bs
[17:06] *** Mateon1 has quit IRC (Quit: Mateon1)
[17:06] *** Mateon1 has joined #archiveteam-bs
[17:24] *** dashcloud has quit IRC (Read error: Operation timed out)
[17:25] *** lennier2 has joined #archiveteam-bs
[17:28] *** lennier1 has quit IRC (Ping timeout: 260 seconds)
[17:28] *** lennier2 is now known as lennier1
[17:45] *** lunik1 has joined #archiveteam-bs
[17:50] cm: Did you mean -k rather than -x on wget?
[17:51] If -k doesn't handle the percent-encoding, that sounds like a bug to me.
[17:53] It looks like it's supposed to convert ? to %3F per http://git.savannah.gnu.org/cgit/wget.git/tree/src/convert.c?id=314a4f42be3c969aadc1cef9f5859f8a61b7ca82#n722
[17:54] But only when --adjust-extension (-E) is also used.
[18:01] the problem is how to store the files while maintain a mapping with the original url
[18:02] Do you need that mapping, or do you just need internal links to work?
[18:02] wget -x creates directories so that the output file path mirrors the url (idk how reliably)
[18:03] i would like to keep the mapping so i know which url was requested
[18:03] Well, then you need to do some web server magic.
[18:03] so i can avoid requesting urls i already have
[18:04] web server magic isn't an option since this is controlled by my web host
[18:04] *** SJon_____ has quit IRC (Read error: Connection reset by peer)
[18:04] Web server magic on serving the archive, I mean.
[18:05] yeah
[18:05] Ah right
[18:05] Well, you can't have both.
[18:05] *** SJon_____ has joined #archiveteam-bs
[18:05] well i dont need the original url to be perfectly preserved within the archive url
[18:06] i just need it to be stored somehow, so i can tell if i've already requested a given url
[18:06] Right. wget's filename conversion with -k should be deterministic.
[18:10] the man page doesn't fully explain what is done
[18:10] i could read the code i guess
[18:10] but i would need some way to convert the urls to what wget -k would produce, in order to check if i have gotten the URL already
[18:11] *** MaximeleG has quit IRC (Quit: MaximeleG)
[18:11] Yeah, that's in convert.c, but I don't know all the details.
[18:12] Should be possible to write a little tool around that which just does the conversion.
[18:12] *** brayden has quit IRC (Ping timeout: 272 seconds)
[18:12] *** Aoede has quit IRC (Ping timeout: 272 seconds)
[18:12] *** Laverne has quit IRC (Ping timeout: 272 seconds)
[18:12] *** sHATNER has quit IRC (Ping timeout: 272 seconds)
[18:13] *** Aoede has joined #archiveteam-bs
[18:13] well i probably wont be using wget -k anyway
[18:13] im archiving podcasts and i don't think wget parses rss feeds
[18:14] so i could use my own implementation of the wget -k conversion algo, or just use something like hex encoding for the filenames
[18:43] *** TC01 has quit IRC (Read error: Operation timed out)
[18:46] *** TC01 has joined #archiveteam-bs
[18:51] *** scorche` has joined #archiveteam-bs
[18:52] *** scorche has quit IRC (Read error: Operation timed out)
[18:52] *** scorche` is now known as scorche
[19:16] *** sHATNER has joined #archiveteam-bs
[19:17] *** brayden has joined #archiveteam-bs
[19:17] *** Laverne has joined #archiveteam-bs
[20:34] *** mtntmnky_ has quit IRC (Remote host closed the connection)
[20:34] *** mtntmnky_ has joined #archiveteam-bs
[20:42] *** dashcloud has joined #archiveteam-bs
[21:11] *** lennier1 has quit IRC (Ping timeout: 265 seconds)
[21:13] *** lennier1 has joined #archiveteam-bs
[21:16] *** lennier2 has joined #archiveteam-bs
[21:18] *** BlueMax has joined #archiveteam-bs
[21:21] *** lennier1 has quit IRC (Read error: Operation timed out)
[21:21] *** lennier2 is now known as lennier1
[21:29] *** jshoard has quit IRC (Quit: Leaving)
[21:41] *** BlueMax has quit IRC (Quit: Leaving)
[23:06] *** larryv has joined #archiveteam-bs
[23:07] *** Arcorann has joined #archiveteam-bs
[23:25] *** Arcorann has quit IRC (Remote host closed the connection)
[23:26] *** Arcorann has joined #archiveteam-bs
[23:50] *** HP_Archiv has joined #archiveteam-bs
[23:51] *** atbk has quit IRC (Quit: ZNC - https://znc.in)
[23:54] *** atbk has joined #archiveteam-bs
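
Note on the 18:14 message: the "hex encoding for the filenames" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not what anyone in the channel actually ran and not wget's -k algorithm. The names url_to_filename, filename_to_url, already_fetched and the archive_dir argument are all hypothetical.

```python
from pathlib import Path


def url_to_filename(url: str) -> str:
    """Hex-encode the UTF-8 bytes of the URL.

    The result contains only 0-9 and a-f, so it is safe as a file name,
    and it is reversible, so the original URL is never lost.
    Caveat (assumption about the target filesystem): many filesystems cap
    file names at 255 bytes, and hex doubles the length, so very long URLs
    would need a different scheme (for example a hash plus an index file).
    """
    return url.encode("utf-8").hex()


def filename_to_url(name: str) -> str:
    """Invert url_to_filename: decode the hex string back to the URL."""
    return bytes.fromhex(name).decode("utf-8")


def already_fetched(url: str, archive_dir: Path) -> bool:
    """Return True if a file for this URL already exists in archive_dir."""
    return (archive_dir / url_to_filename(url)).exists()
```

Because the encoding is deterministic and reversible, the check "have I already requested this URL" becomes a plain file-existence test, and listing the directory recovers every URL that was fetched.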