[00:23] *** JesseW has joined #archiveteam-bs
[00:25] *** Mayonaise has quit IRC (Read error: Connection reset by peer)
[00:27] *** Mayonaise has joined #archiveteam-bs
[00:34] *** turnkit has joined #archiveteam-bs
[00:47] *** turnkit has quit IRC (Ping timeout: 633 seconds)
[00:55] *** BlueMaxim has joined #archiveteam-bs
[00:55] *** BlueMaxim has quit IRC (Client Quit)
[00:57] *** BlueMaxim has joined #archiveteam-bs
[01:04] *** bsmith093 has joined #archiveteam-bs
[01:05] *** bsmith093 has quit IRC (Client Quit)
[01:10] *** bsmith093 has joined #archiveteam-bs
[01:10] *** dashcloud has quit IRC (Ping timeout: 260 seconds)
[01:10] *** dashcloud has joined #archiveteam-bs
[01:13] *** RichardG_ has joined #archiveteam-bs
[01:14] *** kristian_ has quit IRC (Quit: Leaving)
[01:15] *** RichardG has quit IRC (Ping timeout: 370 seconds)
[01:17] *** bsmith093 has quit IRC (Quit: Leaving.)
[01:18] *** turnkit has joined #archiveteam-bs
[01:27] *** fie__ has quit IRC (Read error: Operation timed out)
[01:35] *** fie has joined #archiveteam-bs
[01:38] hey @sketchcow - any chance you can get me into an alpha of the Next Gen Arnold Foundation funded search functionality for wayback machine? I know about https://wayback-beta.archive.org/ but it doesn't seem to search home page content (that search functionality was described in the archive.org blog) -- e.g. phrase searching
[01:38] ref: https://blog.archive.org/2011/06/15/http-archive-joins-with-internet-archive/
[01:40] HEY SO HEARD FAMAS STOPPED BY
[01:40] THIS ONE THINKS THAT'S HILARIOUS
[01:40] Beta is the search you're looking for
[01:40] okay. thanks. I'll leave feedback there.
[01:42] last I heard it only indexes the front pages of domains
[01:42] that's what the blog said -- only the front / home page
[01:43] but I want to search for a phrase like "mad dog" -- it doesn't seem to work
[01:43] guess that's why it's beta. I went ahead and left a simple note.
[01:43] Well, all I can tell you is that it's a beta.
[01:43] It's gonna be a great feature
[01:43] Hey, this beta lacks functionality
[01:43] to be able to search the wayback machine
[01:43] too bad only front page though
[01:44] but even the front page search function -- it will open up a lot of otherwise "lost" pages
[01:44] so it's really a great thing
[01:49] i'm uploading 1997 nasa docs
[01:49] only 2872 pdfs
[01:49] about 9.3gb
[01:49] also i'm at 922k items
[02:18] *** RichardG_ has quit IRC (Read error: Operation timed out)
[02:18] *** RichardG has joined #archiveteam-bs
[02:45] *** RichardG has quit IRC (Read error: Operation timed out)
[02:45] *** RichardG has joined #archiveteam-bs
[02:59] *** aschmitz has quit IRC (Remote host closed the connection)
[03:35] *** pikhq has joined #archiveteam-bs
[03:46] *** pikhq_ has quit IRC (Read error: Operation timed out)
[03:46] *** Jordan_ has joined #archiveteam-bs
[03:49] *** Jordan- has quit IRC (Ping timeout: 250 seconds)
[04:11] *** RichardG has quit IRC (Read error: Operation timed out)
[04:11] *** RichardG has joined #archiveteam-bs
[04:29] *** aschmitz has joined #archiveteam-bs
[04:35] *** pikhq has quit IRC (Ping timeout: 255 seconds)
[04:35] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[04:37] *** pikhq has joined #archiveteam-bs
[04:41] *** Sk1d has joined #archiveteam-bs
[04:47] Just saw this on HN: the c2 wiki is offline
[04:47] Luckily someone stuffed it into ArchiveBot in July: http://archive.fart.website/archivebot/viewer/job/xdufx
[04:48] Huh, apparently the server had a disk crash: https://twitter.com/WardCunningham/status/785960076041302016
[04:48] MrRadar: It was also in the Archive Team mass wiki grabs at some point.
[04:56] * JesseW is somewhat concerned about better searching of Wayback data leading to increased demands for removal
[04:56] I've had this thought too
[04:59] I wish there were more user-friendly tools for making private copies of arbitrary web pages.
There are lots of tools (wget being central among them), but none (that I know of yet) have the ease I'm hoping for.
[05:00] *** bsmith093 has joined #archiveteam-bs
[05:03] *** robink has quit IRC (Read error: Operation timed out)
[05:16] *** robink has joined #archiveteam-bs
[05:28] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[05:29] *** dashcloud has quit IRC (Read error: Operation timed out)
[05:32] *** dashcloud has joined #archiveteam-bs
[05:44] *** brayden has quit IRC (Quit: Leaving)
[05:51] *** brayden has joined #archiveteam-bs
[05:51] *** swebb sets mode: +o brayden
[05:52] *** turnkit has quit IRC (turnkit)
[05:55] *** pikhq has quit IRC (Ping timeout: 255 seconds)
[05:57] *** pikhq has joined #archiveteam-bs
[06:19] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[06:37] *** brayden has quit IRC (Quit: Leaving)
[07:05] *** RichardG has quit IRC (Read error: Operation timed out)
[07:05] *** RichardG has joined #archiveteam-bs
[07:27] *** brayden has joined #archiveteam-bs
[07:27] *** swebb sets mode: +o brayden
[07:29] so i found this to be interesting
[07:30] looks like Sports Illustrated has a vault section of their site
[07:30] they go back to 1954
[07:31] and also put issues up to 2016-09-26
[07:31] very odd
[07:46] *** pikhq_ has joined #archiveteam-bs
[07:49] *** pikhq has quit IRC (Read error: Operation timed out)
[08:09] *** schbirid has joined #archiveteam-bs
[08:28] *** GE has joined #archiveteam-bs
[08:38] *** Boppen has quit IRC (Ping timeout: 194 seconds)
[08:44] *** Boppen has joined #archiveteam-bs
[09:42] *** RichardG has quit IRC (Read error: Operation timed out)
[09:42] *** RichardG has joined #archiveteam-bs
[09:56] *** GE has quit IRC (Remote host closed the connection)
[09:57] * arkiver is still not convinced wayback machine search will be a good thing for the web archive
[10:08] *** Shakespea has joined #archiveteam-bs
[10:08] Hi
[10:09] Anyone in your group archive old SDKs?
[10:09] I was trying to find a version of the MS Platform SDK that still supported XP (legacy app issue)
[10:10] Also - https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/e1147034-9b0b-4494-a5bc-6dfebb6b7eb1/download-and-install-microsoft-platform-sdk-febuary-2003-last-version-with-vc6-support?forum=windowssdk
[10:10] I wasn't sure if the links here were archived.
[10:10] (And given that this is the last version that supports anything really old (i.e. VC6), it was felt to be important.)
[10:11] Archive.org wayback doesn't necessarily have a full MSDN
[10:11] for earlier contents
[10:13] In any event I would strongly suggest a project (I can't do it on the Wiki because I am currently blocked) to archive SDKs, even if for various reasons it has to be a dark archive.
[10:13] arkiver: Good morning
[10:14] hi
[10:19] Sorry for the monologue
[10:20] I was also trying to track down an ISO for the most recent VS Express that still supported XP
[10:20] Naturally people should be using the most recent compilers but the concern from an archival perspective was ancient legacy systems built on legacy tools years ago
[10:21] This isn't as much of an issue on GNU/Linux projects as these tend to keep old package releases around for longer
[10:23] Shakespea: wasn't the latest Express version for XP the 2010 one?
[10:23] if so, https://go.microsoft.com/?linkid=9709969
[10:23] Okay, and is that download archived for the future?
[10:23] no idea
[10:24] yeah, seems 2010 was latest for XP
[10:24] 2013 requires 7
[10:24] (but is available at https://www.microsoft.com/en-us/download/details.aspx?id=44914 )
[10:25] And go.microsoft.com is NOT in Wayback
[10:25] (due to robots.txt)
[10:25] < fume> makes strange cursing sounds...
[10:27] joepie: It's available for now, but it's now getting on 6 years
[10:28] Shakespea: right, so go collect them and upload them to IA?
:P
[10:28] joepie91: I will bear that in mind
[10:28] Your second link is archived already - https://web.archive.org/web/20161014102607/https://www.microsoft.com/en-us/download/details.aspx?id=44914
[10:29] because I just saved it
[10:29] the download may not necessarily be
[10:29] and these things are better stored as separate items on IA anyway
[10:29] That's a problem as I'm not sure how to get it saved to IA automatically
[10:29] Shakespea: create an account, hit "upload
[10:29] "*
[10:29] My local bandwidth is such that I can't download many ISOs
[10:30] right. have a server or something somewhere?
[10:30] I am just a normal net user, I don't have VPN or remote servers...
[10:30] Something else I'll have to look into
[10:31] I agree with you in principle :)
[10:31] Shakespea: fwiw, you can always just upload it to IA slowly
[10:31] which is better than not at all
[10:31] :)
[10:31] joepie91: True... It would be nice if there was a way to get SDKs and downloads archived without needing to do the local download step.
[10:32] (Firefox doesn't exactly have a 'Save download to remote FTP...' option)
[10:32] Hmm... Goes off to file a bug report...
[10:33] Shakespea: closest thing is archivebot but that'll only work for the wayback
[10:33] Shakespea: and what you're describing isn't really possible :P
[10:33] joepie91: Does archive team have an FTP server for donations?
[10:33] 'donations' as in 'data to upload to IA'?
[10:33] As in upload to 'dark-archive'
[10:34] I don't really understand the question, then
[10:34] like, what are you trying to accomplish?
[10:34] What I am trying to ask for is a means to tell firefox to tell a remote server to send the file being downloaded to a third party server rather than to save it locally
[10:35] So that when Firefox starts a download, the third party server takes over and captures the download, rather than it being saved locally
[10:35] Shakespea: that isn't a technical possibility.
[10:36] as in, there is no way to do it with a standard FTP server, because it's just not a part of the FTP protocol
[10:36] What's the technical limitation?
[10:36] this requires explicit cooperation from the systems in question
[10:36] and that cooperation isn't there
[10:36] (and FXP only works with FTP and only due to a weird design decision in the FTP protocol)
[10:37] Shakespea: it's not a 'limitation', it just isn't a thing that FTP software is written to do
[10:37] The alternative would be to figure out if there's a way to inform the third party site to initiate an FTP transfer remotely.
[10:37] (albeit based on metadata captured from Firefox)
[10:37] Shakespea: there's nothing to 'capture', really.
[10:37] it's just a URL that results in a file
[10:37] that's it
[10:37] bunch of HTTP headers
[10:37] which is what WARC is for, and what archivebot saves
[10:38] but that's of no relevance when you upload something as a separate item on IA
[10:38] So technically, the WARC should include the ISO download?
[10:38] tl;dr there's archivebot that can archive things on your behalf, and that's about it
[10:38] but it doesn't create IA items
[10:38] it just adds it to the collective archivebot archives that go into the wayback machine
[10:38] I think we may be getting confused
[10:39] The issue is how to ensure something that could be downloaded gets archived, without having to save it locally...
[10:39] (even when the download is an HTTP/FTP transaction initiated by some sort of script)
[10:40] Shakespea: you can't just magically make that happen.
[10:40] Shakespea: there needs to be a system somewhere that is set up to do precisely this
[10:40] that system is archivebot and it only does it in a very specific way
[10:40] that's it
[10:40] Ah
[10:40] :(
[10:40] Bother
[10:40] So something gets lost because the tools don't yet exist...
[10:40] Pity
[10:40] * Shakespea out
[10:40] *** Shakespea has left
[10:42] *** Shakespea has joined #archiveteam-bs
[10:42] Okay the ISO is in wayback
[10:42] :)
[10:43] But it's slightly convoluted how to access it :)
[10:43] Thanks for your help...
[10:43] Also going to file some bugzilla for firefox...
[10:43] *** Shakespea has left
[10:46] *** Shakespea has joined #archiveteam-bs
[10:48] Shakespea: right, so I was going to mention: things aren't getting lost "
[10:48] gah
[10:48] Shakespea: right, so I was going to mention: things aren't getting lost "because tools don't exist" *
[10:48] Okay, that was just me possibly over-reacting
[10:48] it's just that while archivebot is useful for quickly saving a page or site, it doesn't really make sense to invest time into building such a system for miscellaneous IA uploads
[10:48] because it would only fit a very narrow set of use cases
[10:48] for a very narrow set of people
[10:49] while being very complex to build and maintain
[10:49] it's not worth the time, it's easier to just tell somebody to pay $3/month for a VPS and do their archiving from there
[10:49] Can wget be configured to do a third party mount?
[10:49] Assuming you have a VPN set up...?
[10:49] what do you mean? I don't think "third party mount" is an existing term
[10:50] Server X has a file... Server Y is a third party FTP server. Server Z is me...
[10:50] Shakespea: I think you need to spend a bit of time learning about the various protocols and how networks work
[10:50] Can wget be invoked such that it transfers from Server X to Server Y even though the wget command is invoked on Server Z?
[10:50] because that suggestion doesn't really make sense if you understand the protocols involved
[10:51] and just wildly suggesting impossible things is going to drain a lot of time and attention from people who can otherwise be busy doing productive things
[10:51] I will shut up then
[10:51] Shakespea: I'm not suggesting that you shut up, I'm suggesting that you learn ;)
[10:51] documentation of HTTP and network stacks is fairly widely available
[10:52] even just a high-level understanding of what it's for and how it works, will help answer your question
[10:52] joepie91: Okay then
[10:52] you don't need to know the nitty gritty details
[10:52] Seems I have some reading to do
[10:52] And as I've got an adequate answer...
[10:52] * Shakespea away
[10:52] *** BlueMaxim has quit IRC (Quit: Leaving)
[11:05] *** Shakespea has left
[11:34] *** GE has joined #archiveteam-bs
[12:03] *** RichardG has quit IRC (Read error: Operation timed out)
[12:03] *** RichardG has joined #archiveteam-bs
[12:31] *** RichardG has quit IRC (Read error: Operation timed out)
[12:31] *** RichardG has joined #archiveteam-bs
[12:33] *** Mathias` has joined #archiveteam-bs
[12:35] *** Madthias has quit IRC (Ping timeout: 362 seconds)
[13:03] *** RichardG has quit IRC (Read error: Operation timed out)
[13:03] *** RichardG has joined #archiveteam-bs
[13:20] *** brayden has quit IRC (Read error: Connection reset by peer)
[13:23] *** pikhq has joined #archiveteam-bs
[13:27] *** pikhq_ has quit IRC (Ping timeout: 632 seconds)
[13:39] *** pikhq has quit IRC (Read error: Operation timed out)
[13:42] *** pikhq has joined #archiveteam-bs
[13:52] *** brayden has joined #archiveteam-bs
[13:52] *** swebb sets mode: +o brayden
[13:52] *** JesseW has joined #archiveteam-bs
[13:53] *** GLaDOS has quit IRC (Oh crap, I died.)
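Editorial sketch, not part of the original conversation: the 10:49 suggestion (do the archiving from a cheap VPS) can be read as an answer to the Server X / Y / Z question. wget saves on whichever machine runs it, so Server Z can ask Server Y to run wget itself over ssh, and the file then travels from X to Y without passing through Z. The function below only builds that command. The ssh target and WARC name are placeholders, and it assumes ssh access to a host with a wget build that supports `--warc-file`.

```python
import shlex


def remote_wget_command(ssh_target: str, url: str, warc_name: str) -> list[str]:
    """Build an ssh command that makes the remote host fetch `url` itself.

    The download lands on the remote host (Server Y), not on the machine
    that types the command (Server Z). `--warc-file` asks wget to also
    record the HTTP request and response in a WARC file.
    """
    # Quote each piece, because ssh hands the string to the remote shell.
    remote = " ".join([
        "wget",
        "--warc-file=" + shlex.quote(warc_name),
        shlex.quote(url),
    ])
    return ["ssh", ssh_target, remote]
```

The resulting list could be passed to `subprocess.run(cmd, check=True)`. Getting the file from the VPS into an IA item is still a separate upload step.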
[13:53] *** GLaDOS has joined #archiveteam-bs
[14:00] *** robink has quit IRC (Read error: Operation timed out)
[14:05] *** pikhq has quit IRC (Ping timeout: 250 seconds)
[14:06] *** pikhq has joined #archiveteam-bs
[14:11] *** pikhq has quit IRC (Ping timeout: 244 seconds)
[14:13] *** pikhq has joined #archiveteam-bs
[14:17] *** Start has quit IRC (Quit: Disconnected.)
[14:19] *** pikhq has quit IRC (Ping timeout: 244 seconds)
[14:25] *** brayden_ has joined #archiveteam-bs
[14:25] *** swebb sets mode: +o brayden_
[14:26] *** Aoede has joined #archiveteam-bs
[14:28] *** pikhq has joined #archiveteam-bs
[14:31] *** brayden has quit IRC (Read error: Operation timed out)
[15:04] *** RichardG has quit IRC (Read error: Operation timed out)
[15:04] *** RichardG has joined #archiveteam-bs
[15:10] *** brayden_ has quit IRC (Read error: Connection reset by peer)
[15:38] *** RichardG has quit IRC (Read error: Operation timed out)
[15:38] *** RichardG has joined #archiveteam-bs
[15:46] *** brayden has joined #archiveteam-bs
[15:46] *** swebb sets mode: +o brayden
[15:51] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[16:24] https://search.slashdot.org/story/16/10/14/1318244/
[16:52] *** RichardG has quit IRC (Read error: Operation timed out)
[16:52] *** RichardG has joined #archiveteam-bs
[17:38] *** robink has joined #archiveteam-bs
[17:38] *** Stiletto has quit IRC ()
[17:58] *** Aoede has quit IRC (Quit: WeeChat 1.5)
[18:01] *** dashcloud has quit IRC (Read error: Operation timed out)
[18:04] *** dashcloud has joined #archiveteam-bs
[18:14] *** Stiletto has joined #archiveteam-bs
[18:43] *** RichardG has quit IRC (Read error: Operation timed out)
[18:43] *** RichardG has joined #archiveteam-bs
[19:11] *** RichardG has quit IRC (Read error: Operation timed out)
[19:11] *** RichardG has joined #archiveteam-bs
[19:38] *** pikhq has quit IRC (Ping timeout: 633 seconds)
[19:40] *** Shakespea has joined #archiveteam-bs
[19:40] Good evening
[19:40] Anyone know of UK places
that will do mass imaging - https://archive.org/post/1064324/old-pc-plus-cds-and-bbc-micro ?
[19:41] *** pikhq has joined #archiveteam-bs
[19:51] *** robink has quit IRC (Ping timeout: 246 seconds)
[19:51] *** robink_ has joined #archiveteam-bs
[20:09] *** robink_ has quit IRC (Ping timeout: 506 seconds)
[20:12] *** Start has joined #archiveteam-bs
[20:33] *** RichardG has quit IRC (Read error: Operation timed out)
[20:33] *** RichardG has joined #archiveteam-bs
[21:00] *** RichardG has quit IRC (Read error: Operation timed out)
[21:00] *** RichardG has joined #archiveteam-bs
[21:16] *** Shakespea has left
[21:27] *** RichardG has quit IRC (Read error: Operation timed out)
[21:27] *** RichardG has joined #archiveteam-bs
[21:29] *** Start has quit IRC (Quit: Disconnected.)
[21:35] *** Stiletto has quit IRC (Ping timeout: 370 seconds)
[21:35] *** robink has joined #archiveteam-bs
[21:42] *** robink has quit IRC (Ping timeout: 260 seconds)
[21:45] *** robink has joined #archiveteam-bs
[21:50] *** robink has quit IRC (Ping timeout: 246 seconds)
[22:07] *** robink has joined #archiveteam-bs
[22:11] *** schbirid has quit IRC (Quit: Leaving)
[22:13] *** BlueMaxim has joined #archiveteam-bs
[22:26] *** Start has joined #archiveteam-bs
[22:32] *** fie has quit IRC (Read error: Connection reset by peer)
[22:33] *** fie has joined #archiveteam-bs
[22:40] *** GE has quit IRC (Quit: zzz)
[23:49] *** powerKitt has joined #archiveteam-bs
[23:49] Hey! I was wondering if there was a script that could go through a Tumblr tag ( https://www.tumblr.com/tagged/sbarg for example) and output the url of every user that posted in it to a TXT file. I'm an ARG archiver, and it'd really be useful if I could automate the process of finding all the old blogs of players
[23:50] yes we heard you in other channel
[23:50] powerKitt: i am not aware of such a script
[23:51] Sorry, I didn't see any responses...
[23:58] there weren't any but that's because nobody who is awake and looking at irc had a positive answer
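Editorial sketch, not something anyone in the channel offered: a rough idea of what the script powerKitt asked for might look like. It assumes the Tumblr API v2 `/tagged` endpoint, a registered API key, and that each returned post carries `blog_name` and `timestamp` fields. The endpoint's behavior is recalled from memory and was not verified here, and how far back the API pages through a tag is unknown. The `https://<blog_name>.tumblr.com/` URL form is also an assumption and will miss custom domains.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint (Tumblr API v2); check the current API docs before relying on it.
API = "https://api.tumblr.com/v2/tagged"


def blog_urls(posts):
    """Return the sorted, de-duplicated blog URLs for a list of post dicts."""
    names = {p["blog_name"] for p in posts if p.get("blog_name")}
    return sorted(f"https://{name}.tumblr.com/" for name in names)


def fetch_tagged(tag, api_key, before=None):
    """Fetch one page of posts for `tag`, optionally older than `before`."""
    params = {"tag": tag, "api_key": api_key}
    if before is not None:
        params["before"] = before
    url = API + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["response"]


def collect(tag, api_key, out_path, max_pages=50):
    """Page backwards through a tag and write one blog URL per line."""
    seen, before = set(), None
    for _ in range(max_pages):
        posts = fetch_tagged(tag, api_key, before)
        if not posts:
            break
        seen.update(blog_urls(posts))
        oldest = min(p["timestamp"] for p in posts)
        if oldest == before:  # no progress, stop rather than loop
            break
        before = oldest
    with open(out_path, "w") as f:
        f.write("\n".join(sorted(seen)) + "\n")
```

A call such as `collect("sbarg", "YOUR_API_KEY", "blogs.txt")` would write the list. The key and filename are placeholders.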