[00:04] *** kristian_ has quit IRC (Leaving)
[00:31] let's list all sites from turkish media
[00:31] The sites that have sitemaps can be done with a warrior job
[00:31] The others need to be done using archivebot (unless they have nice IDs)
[00:36] Start: I think we should have a good look at every yahoo-owned website and maybe archive them all
[00:39] we have a slightly outdated list at http://archiveteam.org/index.php?title=Woohoo
[00:45] *** robink has quit IRC (Ping timeout: 246 seconds)
[00:50] *** RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue)
[00:52] *** RichardG has joined #archiveteam
[00:55] yes
[01:06] i've updated it and listed sites recently discussed here as "Archival being planned"
[01:08] most of their news sites have sitemaps: https://www.yahoo.com/robots.txt
[01:11] *** robink has joined #archiveteam
[01:12] Start: nice
[01:13] this is interesting: https://en.wikipedia.org/wiki/Yahoo!_Time_Capsule
[01:15] by the time the time capsule gets opened in 2020, yahoo will have long been a relic of the past
[01:15] *** pguth_ has quit IRC (Remote host closed the connection)
[01:16] *** pguth_ has joined #archiveteam
[01:20] *** DoomTay has quit IRC (Ping timeout: 268 seconds)
[01:30] *** philpem has quit IRC (Ping timeout: 260 seconds)
[01:32] *** philpem has joined #archiveteam
[01:50] http://www.smh.com.au/business/seven-west-mulls-postdeal-yahoo7-options-20160726-gqe6ks.html
[01:50] archiving yahoo7 news might be a good idea
[01:53] *** JesseW has joined #archiveteam
[01:53] arkiver: i've made a list of all archivable yahoo sites: https://gist.githubusercontent.com/PressStartandSelect/ecc41709cca89995bc2009c1747dc33f/raw/6e3ba2e8aa1dd3cfe970d287da24c2c9ab88a628/yahoo%2520sites
[01:58] *** DoomTay has joined #archiveteam
[02:14] *** robink has quit IRC (Ping timeout: 1208 seconds)
[02:15] *** MMovie has joined #archiveteam
[02:15] *** MMovie1 has quit IRC (Read error: Operation timed out)
[02:23] *** BartoCH has quit IRC (Ping timeout: 260 seconds)
[02:28] *** philpem has quit IRC (Ping timeout: 260 seconds)
[02:29] *** robink has joined #archiveteam
[03:14] *** metalcamp has joined #archiveteam
[04:05] *** metalcamp has quit IRC (Read error: Operation timed out)
[04:41] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[04:47] *** Sk1d has joined #archiveteam
[04:59] *** gen2 has quit IRC (Remote host closed the connection)
[05:00] *** wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
[05:06] *** Emcy has joined #archiveteam
[05:06] *** Actium has quit IRC (Quit: Bye!)
[05:08] *** zgrant has quit IRC (Leaving.)
[05:11] *** Emcy_ has joined #archiveteam
[05:12] *** Emcy_ has quit IRC (Read error: Operation timed out)
[05:12] *** Emcy_ has joined #archiveteam
[05:13] *** Emcy has quit IRC (Read error: Operation timed out)
[05:19] *** ndiddy has quit IRC (Quit: Leaving)
[05:32] *** pguth_ has quit IRC (Remote host closed the connection)
[05:32] *** pguth_ has joined #archiveteam
[05:36] *** pguth_ has quit IRC (Remote host closed the connection)
[05:36] *** pguth_ has joined #archiveteam
[05:37] *** pguth_ has quit IRC (Remote host closed the connection)
[05:37] *** pguth_ has joined #archiveteam
[05:56] *** VADemon has joined #archiveteam
[05:57] arkiver: updated list: https://gist.githubusercontent.com/PressStartandSelect/ecc41709cca89995bc2009c1747dc33f/raw/8e790a0be60f3c636f358225f8448000fdc85a32/yahoo%2520sites
[05:58] Polyvore?
[05:58] Isn't that some kind of clothes site?
[05:59] Also, I haven't looked, but maybe there's a way for one to save his Yahoo mailbox?
[06:01] If the pursuit for Tumblr ends up being a warrior project, I imagine participants would want their machines to prioritize certain tumblrs, even though that might not be possible with the warrior's "engine"
[06:02] Also, I sicced ArchiveBot on Yahoo Weather, though I have no idea how thorough it is
[06:07] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[06:09] *** DoomTay has quit IRC (Quit: Page closed)
[06:19] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[06:25] *** metalcamp has joined #archiveteam
[07:20] *** Coderjoe has quit IRC (Read error: Operation timed out)
[07:27] *** Coderjoe has joined #archiveteam
[07:34] *** BlueMaxim has quit IRC (Quit: Leaving)
[07:38] *** schbirid has joined #archiveteam
[07:55] *** gen2 has joined #archiveteam
[08:11] *** DiscantX has joined #archiveteam
[08:38] *** BlueMaxim has joined #archiveteam
[08:40] *** BartoCH has joined #archiveteam
[08:42] *** kristian_ has joined #archiveteam
[09:24] *** kristian_ has quit IRC (Leaving)
[09:37] *** winterfox has joined #archiveteam
[09:38] *** BlueMaxim has quit IRC (Quit: Leaving)
[09:39] *** Coderjoe has quit IRC (Read error: Operation timed out)
[09:42] *** atomotic has joined #archiveteam
[09:44] *** JW_work has joined #archiveteam
[09:49] *** JW_work1 has quit IRC (Ping timeout: 633 seconds)
[09:57] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[10:04] *** Coderjoe has joined #archiveteam
[10:21] *** DiscantX has quit IRC (Read error: Operation timed out)
[11:40] *** Coderjoe has quit IRC (Read error: Operation timed out)
[11:44] *** Coderjoe has joined #archiveteam
[11:56] *** jimjim has joined #archiveteam
[11:57] *** jimjim has quit IRC (Client Quit)
[11:58] *** joepie91_ is now known as joepie91
[12:29] *** midas1 is now known as midas
[12:36] *** Coderjoe has quit IRC (Ping timeout: 260 seconds)
[12:45] *** zgrant has joined #archiveteam
[12:52] *** Coderjoe has joined #archiveteam
[13:04] *** winterfox has quit IRC (Ping timeout: 492 seconds)
[13:11] *** ravetcofx has quit IRC (Remote host closed the connection)
[13:23] *** Scuttle has left Leaving
[13:38] *** ravetcofx has joined #archiveteam
[14:04] *** ravetcofx has quit IRC (Remote host closed the connection)
[14:49] *** pguth_ has quit IRC (Remote host closed the connection)
[14:49] *** pguth_ has joined #archiveteam
[15:07] *** GLaDOS has joined #archiveteam
[15:28] *** DoomTay has joined #archiveteam
[15:34] *** Coderjoe has quit IRC (Ping timeout: 260 seconds)
[15:41] *** JesseW has joined #archiveteam
[16:02] *** Coderjoe has joined #archiveteam
[16:10] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[16:26] So....about the Woohoo list
[16:27] I actually sicced ArchiveBot on https://www.yahoo.com/news/weather/ though I don't know how thorough the crawl was
[16:33] Gmane might be going away: https://lars.ingebrigtsen.no/2016/07/28/the-end-of-gmane/
[16:34] SketchCow: He's offering the archive to anyone who wants to take over from him; you might want to contact him to get a copy for the IA
[16:37] that's bad news...
[16:38] Yeah, decades worth of open-source development mailing list archives
[16:38] *** SmileyG has joined #archiveteam
[16:40] oh shiv
[16:40] * JW_work was just coming here to mention that
[16:41] *** Smiley has quit IRC (Ping timeout: 250 seconds)
[16:41] Isn't it mostly duplicated elsewhere?
[16:41] marc, spinics, etc
[16:41] Yeah, a lot of it at least
[16:42] *** pfallenop has quit IRC (Read error: Operation timed out)
[16:42] *** pfallenop has joined #archiveteam
[16:42] He even says that if Mail Archive was as good in the past as it is today he might never have started Gmane
[16:44] it would still be *really good* to take him up on the offer, as I'm CERTAIN not *all* of it is duplicated
[16:44] yes, I agree
[16:45] also, having it as a relocatable archive would allow for research, and easier duplication, which would be good
[16:45] he doesn't say how many TB it is
[16:45] I'd be surprised if it cracked even 1
[16:45] Since it's all text
[17:00] *** aschmitz has quit IRC (Read error: Operation timed out)
[17:07] *** VADemon has quit IRC (Read error: Connection reset by peer)
[17:15] *** philpem has joined #archiveteam
[17:16] *** aschmitz has joined #archiveteam
[17:31] Tumblr rolling out ads on all blogs: https://techcrunch.com/2016/07/27/tumblr-teases-plan-to-introduce-ads-on-all-its-blogs/
[17:32] gmane might shut down https://lars.ingebrigtsen.no/2016/07/28/the-end-of-gmane/
[17:32] that's a good thing, right? Profitability and all that
[17:32] schbirid: we saw
[17:33] kk
[17:33] Yeah, but we all know about how the Internet feels about ads
[17:33] This could potentially be their "Digg" moment if they can find another comparable blogging platform
[17:33] *if their users
[17:34] I'd bet if they offered a way to have authors of high profile tumblr blogs pay to remove ads they'd get some takers.
[17:44] You know, every time a company suffers a loss, reddit will be like "They'll just raise the bills to the customers to make up for it"
[19:12] *** tomwsmf has joined #archiveteam
[19:26] *** ndiddy has joined #archiveteam
[19:27] *** Actium has joined #archiveteam
[20:01] *** kristian_ has joined #archiveteam
[20:01] *** Coderjoe has quit IRC (Read error: Operation timed out)
[20:03] Anyone who uses storage.harrycross.me for stuff - rsync is back
[20:04] when the new DNS propagates
[20:12] *** pguth_ has quit IRC (Remote host closed the connection)
[20:12] *** pguth_ has joined #archiveteam
[20:18] *** Coderjoe has joined #archiveteam
[20:30] *** pguth_ has quit IRC (Remote host closed the connection)
[20:30] *** pguth_ has joined #archiveteam
[20:31] *** ndiddy has quit IRC (Read error: Connection reset by peer)
[20:37] *** anjacks0n has joined #archiveteam
[20:52] https://lars.ingebrigtsen.no/2016/07/28/the-end-of-gmane/
[20:52] ah already posted here
[20:53] That's probably going to keep happening until something is initiated
[21:00] *** schbirid has quit IRC (Quit: Leaving)
[21:03] *** metalcamp has quit IRC (Read error: Operation timed out)
[21:20] *** william34 has joined #archiveteam
[21:20] i log in and the first thing i see is we're not archive.org :P
[21:20] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[21:21] yahoosucks
[21:21] Of course it is.
[21:21] lol
[21:23] (Srsly)
[21:23] I know
[21:23] Speaking of yahoo, tumblr looks like it's about to tumbl
[21:24] yeap
[21:24] welcome to archiveteam, william34
[21:25] lol
[21:25] I mean, its yahoo.
[21:25] Secondly, it was having problems 5 days ago.
[21:25] Now freaking verizon buys it!
[21:25] tumblr's always been kind of janky
[21:25] by problems i mean financial problems
[21:25] *** kristian_ has quit IRC (Leaving)
[21:26] they are almost at flickr levels
[21:26] yea
[21:26] whats up with yahoo hating es
[21:26] do we have to freakin backup the letter e before yahoo starts using
[21:26] yahoo closing geocities was the founding event of archiveteam
[21:26] fuck yahoo
[21:26] lol
[21:27] yahoo = ded
[21:27] btw the entirety of yahoo might die
[21:27] *** anjacks0n has quit IRC (anjacks0n)
[21:27] verizon bought it specifically for video
[21:27] so yea
[21:27] yahoo video? they closed that years ago
[21:27] now archive.org has it
[21:28] right
[21:28] but
[21:28] thats what they acquired them for
[21:28] ????
[21:32] They bought yahoo for mobile video
[21:32] Nothing else
[21:32] Anything has even more of a chance to go dead.
[21:33] * xmc sighs
[21:34] so yea this is great news for people who dont want to back everything up at once /s
[21:57] *** ndiddy has joined #archiveteam
[22:00] *** pguth_ has quit IRC (Remote host closed the connection)
[22:00] *** pguth_ has joined #archiveteam
[22:05] Boop
[22:05] Gmane handled.
[22:12] IA taking it on?
[22:32] https://dnshistory.org/ is still online
[22:32] Going to update scripts and give it a second try soon
[22:33] kool
[22:59] I dunno. Pages are blank.
[22:59] Some pages, anyway
[22:59] hmm, yeah
[22:59] front page is still working
[22:59] Like https://dnshistory.org/points-to/mx/1/gmail-smtp-in.l.google.com.
[23:00] is that working for you or not
[23:01] Nope. It be blank
[23:01] Even in source view
[23:02] yes
[23:02] gone anyway I guess
[23:03] not project in that case
[23:03] xmc ^
[23:03] hi
[23:03] hi
[23:03] yeah curl gives a ssl error for me
[23:03] yeah
[23:03] http: has Location: https://...
[23:03] and then https://... has no connection
[23:04] yeah, so it's gone
[23:04] more unreachable
[23:07] *** Stiletto has quit IRC (Read error: Operation timed out)
[23:33] *** Stiletto has joined #archiveteam
[23:58] *** Coderjoe_ has joined #archiveteam
[23:58] *** Coderjoe has quit IRC (Read error: Operation timed out)