[00:00] *** nightpool has joined #archiveteam
[00:04] *** vxbinaca has joined #archiveteam
[00:05] hi I have 32 gigs of Docstoc sitting in a secondary vmdk file and i'd like to manually force an rsync of it, how do I do this?
[00:15] Always
[00:17] *** ndiddy has quit IRC (Read error: Operation timed out)
[00:20] *** ndiddy has joined #archiveteam
[00:20] *** WinterFox has quit IRC (Read error: Operation timed out)
[00:23] *** nightpool has quit IRC (Ping timeout: 255 seconds)
[00:31] *** WinterFox has joined #archiveteam
[00:41] *** Start_ has joined #archiveteam
[00:41] *** Start has quit IRC (Read error: Connection reset by peer)
[00:48] *** nightpool has joined #archiveteam
[00:50] http://www.theverge.com/2015/12/1/9832780/yahoo-considering-sale-of-internet-business-report
[00:50] *** Start_ is now known as Start
[01:11] *** DMackey- has joined #archiveteam
[01:13] *** DMackey has quit IRC (Ping timeout: 310 seconds)
[01:17] *** vxbinaca has quit IRC (Remote host closed the connection)
[01:27] *** DMackey- is now known as DMackey
[01:31] *** nightpool has quit IRC (Ping timeout: 183 seconds)
[01:34] *** kyan has joined #archiveteam
[01:42] *** Ghost_of_ has quit IRC (Quit: Leaving)
[01:55] *** primus105 has quit IRC (Leaving.)
[02:14] *** JesseW has joined #archiveteam
[02:45] *** W1nterFox has joined #archiveteam
[02:48] *** WinterFox has quit IRC (Read error: Operation timed out)
[03:12] *** ndiddy has quit IRC (Quit: Leaving)
[03:12] *** ndiddy has joined #archiveteam
[03:31] *** nightpool has joined #archiveteam
[03:37] damn
[03:37] that does not look good...
[03:37] cc SketchCow arkiver
[03:37] what else does yahoo have
[03:37] too much
[03:38] xmc: https://everything.yahoo.com/
[03:38] they have provided a helpful and aptly-named index
[03:38] What
[03:38] ahh right
[03:38] nice
[03:38] (per country)
[03:38] yahoosucks
[03:38] SketchCow: "Yahoo will consider putting itself or its core businesses up for sale, with the possibilities being discussed at a series of board meetings being held later this week, according to The Wall Street Journal. The options on the table reportedly include selling off Yahoo's internet business, spinning off its shares in Alibaba, or doing both."
[03:38] YES
[03:38] possible impending extra-doom
[03:38] Do you possibly think I don't get told immediately on every yahoo fart
[03:38] then again, it would probably be the doom to end all dooms
[03:38] I need a better nemesis
[03:38] heh
[03:38] * SketchCow slaps midas on the back of the head
[03:39] LET'S GO
[03:39] * SketchCow and midas fighting in the lobby
[03:45] *** nightpool has quit IRC (Read error: Connection reset by peer)
[03:48] *** nightpool has joined #archiveteam
[03:49] *** vitzli has joined #archiveteam
[03:53] *** JesseW has quit IRC (Read error: Connection reset by peer)
[03:53] *** JesseW has joined #archiveteam
[04:02] *** JesseW has quit IRC (Leaving.)
[04:08] *** JesseW has joined #archiveteam
[04:29] *** bwn has quit IRC (Read error: Operation timed out)
[04:31] *** JesseW has quit IRC (Leaving.)
[04:40] !ao https://www.facebook.com/notes/mark-zuckerberg/a-letter-to-our-daughter/10153375081581634
[04:40] fuck
[04:43] *** wyatt8740 has quit IRC (Read error: Operation timed out)
[04:48] *** JesseW has joined #archiveteam
[04:50] *** wyatt8740 has joined #archiveteam
[04:57] *** JesseW has quit IRC (Leaving.)
[05:11] *** ndiddy has quit IRC (Remote host closed the connection)
[05:11] *** aaaaaaaaa has quit IRC (Leaving)
[05:19] *** DMackey- has joined #archiveteam
[05:25] *** DMackey has quit IRC (Read error: Operation timed out)
[05:31] *** Ungstein has quit IRC (Quit: Leaving.)
[05:31] *** Ungstein has joined #archiveteam
[05:51] *** RichardG has quit IRC (Ping timeout: 252 seconds)
[05:52] *** JesseW has joined #archiveteam
[05:58] *** Sk1d has quit IRC (Ping timeout: 252 seconds)
[06:05] *** bwn has joined #archiveteam
[06:15] *** RichardG has joined #archiveteam
[06:25] *** Sk1d has joined #archiveteam
[06:44] *** W1nterFox has quit IRC (Read error: Operation timed out)
[06:49] *** WinterFox has joined #archiveteam
[06:49] *** WinterFox has quit IRC (Connection closed)
[06:52] *** WinterFox has joined #archiveteam
[07:38] *** xk_id has joined #archiveteam
[07:58] http://imgur.com/gallery/bI9kiD1
[08:04] Wrt Yahoo: I should note that I’ve been grabbing and uploading Yahoo Groups for a few months now.
[08:05] PurpleSym: good
[08:05] Would be good to have a collection for the stuff though, see https://archive.org/details/@purplesymphony
[08:10] *** wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
[08:12] *** nightpool has quit IRC (Read error: Operation timed out)
[08:18] *** nightpool has joined #archiveteam
[08:19] *** Fusl has quit IRC (Quit: Contact: http://hallowe.lt/)
[08:22] *** Fusl has joined #archiveteam
[08:24] *** wp494 has joined #archiveteam
[08:27] *** vitzli has quit IRC (Leaving)
[08:33] *** atomotic has joined #archiveteam
[08:36] *** nightpool has quit IRC (Quit: Lost terminal)
[08:42] *** JesseW has quit IRC (Leaving.)
[08:43] *** schbirid has joined #archiveteam
[08:45] *** primus104 has joined #archiveteam
[08:57] *** xk_id has quit IRC (Remote host closed the connection)
[09:08] *** primus105 has joined #archiveteam
[09:14] *** Elegance has quit IRC (Ping timeout: 606 seconds)
[09:14] *** primus104 has quit IRC (Read error: Operation timed out)
[09:35] So, yahoo might be shutting everything down now
[09:36] *** Ghost_of_ has joined #archiveteam
[09:36] *** Ghost_of_ has quit IRC (Connection closed)
[09:39] *** primus105 has quit IRC (Leaving.)
[09:42] *** bwn has quit IRC (Read error: Operation timed out)
[09:52] flickr ?
[09:54] i think going after older accounts on flickr may be a good idea
[09:55] *** Ghost_of_ has joined #archiveteam
[10:08] *** bwn has joined #archiveteam
[10:28] *** luckcolor has joined #archiveteam
[10:28] hello
[10:44] greetings
[10:48] *** Ghost_of_ has quit IRC (Remote host closed the connection)
[10:49] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[10:50] *** xk_id has joined #archiveteam
[10:53] *** Smiley has quit IRC (Read error: Operation timed out)
[10:54] *** Smiley has joined #archiveteam
[10:55] *** Smiley has quit IRC (Client Quit)
[10:56] *** Smiley has joined #archiveteam
[11:07] *** primus104 has joined #archiveteam
[11:20] *** xk_id has quit IRC (Remote host closed the connection)
[11:31] *** maseck_ has quit IRC (Remote host closed the connection)
[11:31] *** SN4T14 has quit IRC (Read error: Operation timed out)
[11:40] *** SN4T14 has joined #archiveteam
[11:40] *** vitzli has joined #archiveteam
[11:40] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[11:56] *** godane has quit IRC (Quit: Leaving.)
[12:29] *** primus104 has quit IRC (Leaving.)
[12:31] *** WinterFox has quit IRC (Remote host closed the connection)
[12:39] *** xk_id has joined #archiveteam
[12:54] *** maseck has joined #archiveteam
[13:04] *** luckcolor has quit IRC (Quit: Leaving)
[13:13] *** atomotic has joined #archiveteam
[13:21] *** SN4T14 has quit IRC (Read error: Operation timed out)
[13:31] *** MMovie has joined #archiveteam
[13:33] *** MMovie1 has quit IRC (Ping timeout: 310 seconds)
[13:36] *** SN4T14 has joined #archiveteam
[14:24] *** johtso has joined #archiveteam
[14:29] *** primus104 has joined #archiveteam
[15:11] So we should get the projects page updated again
[15:11] Add the recently announced shutdown to Proposed projects
[15:11] So we can keep track of all these websites
[15:11] Yahoo is unlikely to shut EVERYTHING down
[15:11] But they are likely to invent even more services that will go away
[15:12] Which survived previous hatchets.
[15:12] Flickr is probably going to stay for some time
[15:12] I think Butterfield would jump on that one now.
[15:12] Slack is making him very rich.
[15:16] So we have the FTP project running, the wiki grab can grab all external links from mediawikis and will soon be able to also grab full mediawikis
[15:16] SketchCow: was there anything else besides FTP and wikis that we wanted to get started?
[15:16] We already have 1.5 TB of FTP WARCs saved on FOS
[15:29] You mean, speculative scrapes?
[15:30] I have one.
[15:30] I have one that needs help, programming, etc.
[15:30] It's important, it was dropped, and I think Archive Team needs to step in.
[15:31] *** godane has joined #archiveteam
[15:36] FTP is now being uploaded to archive.org.
[15:37] SketchCow: i have 804 items waiting to be derived
[15:41] *** Start has quit IRC (Quit: Disconnected.)
[15:42] SketchCow: what is it?
[15:45] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[15:53] The google play store downloader.
[15:53] I can talk about it later, going out to cube
[15:54] SketchCow: we'll talk later about it then!
[15:55] FTP sites from grab are now uploading to archive.org.
[15:56] Thanks! https://archive.org/details/archiveteam_ftp
[16:01] *** pixman has joined #archiveteam
[16:01] *** primus104 has quit IRC (Leaving.)
[16:31] *** terburg has joined #archiveteam
[16:34] *** Morbus has joined #archiveteam
[16:39] *** vitzli has quit IRC (Quit: Leaving)
[17:09] *** xk_id_ has joined #archiveteam
[17:09] *** xk_id has quit IRC (Read error: Connection reset by peer)
[17:32] *** primus104 has joined #archiveteam
[18:00] *** NMNN has joined #archiveteam
[18:06] *** JW_work has quit IRC (Read error: Operation timed out)
[18:07] *** philpem has joined #archiveteam
[18:08] *** terburg has quit IRC (terburg)
[18:12] *** JW_work has joined #archiveteam
[18:26] *** bwn has quit IRC (Read error: Connection reset by peer)
[18:29] *** Start has joined #archiveteam
[18:31] i have another idea for a long term warrior project
[18:33] i'd like to preemptively start grabbing all the isp and university web hosting sites discovered on #webroasting
[18:34] i think there's more isps to discover
[18:34] whats on there is just scratching the surface
[18:35] i agree, especially when it comes to university web hosting
[18:35] there are way more than what we have here: http://archiveteam.org/index.php?title=University_Web_Hosting
[18:36] but still, it would be good to start grabbing from what we've already discovered, especially from the isps
[18:37] i'd like to avoid another home.roadrunner.com type deletion from happening again
[18:38] we'd probably have to check every site every now and then after downloading it in case it gets updated
[18:38] maybe worth seeing if we can contact the isps or unis and seeing if we can get a listing from them, probably doubtful
[18:39] i know a few of them have public lists available
[18:40] we could also scrape as we download sites to discover more of them
[18:41] a lot of them sites will be small
[18:41] but some will redirect to proper domains
[18:41] *** Boltsie__ has quit IRC (Remote host closed the connection)
[18:41] *** zyphlar has quit IRC (Remote host closed the connection)
[18:41] *** JSharp___ has quit IRC (Remote host closed the connection)
[18:42] *** tjg has joined #archiveteam
[18:42] http://forum.battleclinic.com/index.php?topic=189269.0
[18:47] I support Start's idea.
[18:52] *** JSharp___ has joined #archiveteam
[18:52] Start++
[18:54] *** bwn has joined #archiveteam
[18:55] *** JW_work has quit IRC (Quit: Leaving.)
[18:55] *** xk_id_ has quit IRC (Remote host closed the connection)
[18:55] *** JW_work has joined #archiveteam
[19:04] *** zyphlar has joined #archiveteam
[19:19] *** sivoais has quit IRC (Quit: leaving)
[19:19] *** Start has quit IRC (Quit: Disconnected.)
[19:33] *** kevin has quit IRC (Remote host closed the connection)
[19:33] *** karissa__ has quit IRC (Read error: Connection reset by peer)
[19:36] *** Start has joined #archiveteam
[19:40] *** Boltsie__ has joined #archiveteam
[19:47] *** aaaaaaaaa has joined #archiveteam
[19:51] *** ndiddy has joined #archiveteam
[19:54] *** deathy___ has quit IRC (Remote host closed the connection)
[19:59] *** ironman_ has quit IRC (Remote host closed the connection)
[19:59] *** filippo__ has quit IRC (Remote host closed the connection)
[19:59] *** johtso has quit IRC (Remote host closed the connection)
[19:59] *** antonizoo has quit IRC (Remote host closed the connection)
[19:59] *** Ctrl-S___ has quit IRC (Write error: Broken pipe)
[19:59] *** _desu___ has quit IRC (Write error: Broken pipe)
[20:02] *** deathy___ has joined #archiveteam
[20:02] *** kevin has joined #archiveteam
[20:03] *** johtso has joined #archiveteam
[20:10] *** Start_ has joined #archiveteam
[20:10] *** Start has quit IRC (Read error: Connection reset by peer)
[20:11] *** chfoo has quit IRC (Read error: Operation timed out)
[20:13] *** ironman_ has joined #archiveteam
[20:22] *** karissa__ has joined #archiveteam
[20:25] *** filippo__ has joined #archiveteam
[20:27] *** chfoo has joined #archiveteam
[20:33] *** scyther has joined #archiveteam
[20:38] *** Ghost_of_ has joined #archiveteam
[20:40] *** _desu___ has joined #archiveteam
[20:45] *** Start_ has quit IRC (Quit: Disconnected.)
[20:46] *** Ctrl-S___ has joined #archiveteam
[20:48] *** asdf has joined #archiveteam
[20:49] *** Start has joined #archiveteam
[20:50] *** antonizoo has joined #archiveteam
[21:12] *** schbirid has quit IRC (Quit: Leaving)
[21:30] *** xk_id has joined #archiveteam
[22:11] given the recent news with yahoo, we should look into the feasibility of archiving all of yahoo's products & services (estimate size of each service, determine how easy it would be to make warrior projects for each site, etc.)
[22:11] #woohoo
[22:12] it would be great to have a plan in case the fire spreads
[22:12] *** balrog has quit IRC (Read error: Operation timed out)
[22:12] *** johtso has quit IRC (Quit: Connection closed for inactivity)
[22:13] *** DMackey has joined #archiveteam
[22:16] first priority would be to update/expand the wiki article: http://archiveteam.org/index.php?title=Woohoo
[22:16] *** DMackey- has quit IRC (Read error: Operation timed out)
[22:17] i'd like to preemptively start grabbing all the isp and university web hosting sites discovered on #webroasting
[22:17] let's do that!
[22:18] *** balrog has joined #archiveteam
[22:20] should we also try grabbing angelfire and lycos sites?
[22:22] yes
[22:22] http://advertising.yahoo.com/ could probably just be dumped into #archivebot
[22:23] https://www.yahoo.com/autos — could be … interesting … to figure out how to scrape…
[22:24] *** Lord_Nigh has quit IRC (Read error: Operation timed out)
[22:24] https://www.yahoo.com/beauty also could probably go into #archivebot
[22:25] http://info.yahoo.com/ can certainly go in #archivebot
[22:26] https://downloads.yahoo.com/ — WTF?
[22:27] https://local.yahoo.com/ — could be painful, but likely has a LOT of unique content
[22:27] seems like a pain in the ass to do JW_work
[22:27] Furries are possibly coming your way, SketchCow https://lulz.net/furi/res/3303531.html#3304103
[22:27] ;)
[22:27] also what about yahooligans
[22:28] (I'm dumping this here, rather than editing the wiki, because I'm supposed to be at work)
[22:29] wtf yahooligans isn't around apparently
[22:29] rip
[22:29] *** Lord_Nigh has joined #archiveteam
[22:30] *** scyther has quit IRC (Read error: Connection reset by peer)
[22:31] that was closed awhile ago, I believe
[22:32] what about yahoo chess
[22:34] "Up until March 2014, Yahoo! Games included a popular internet chess server."
[22:34] wtf yahoo
[22:35] *** scyther has joined #archiveteam
[22:35] apparently they do/did a web hosting service: https://help.yahoo.com/kb/SLN20583.html
[22:35] http://webhosting.yahoo.com redirects to https://www.aabacosmallbusiness.com/webhosting?wcp=1
[22:36] many things from everything.yahoo.com and the wikipedia page have been added to http://archiveteam.org/index.php?title=Woohoo
[22:36] Start, http://blog.geocities.institute/archives/3022
[22:37] people who paid for geocities plus had their site still up until 2014 iirc
[22:39] *** xk_id has quit IRC (Remote host closed the connection)
[22:43] *** bwn has quit IRC (Read error: Operation timed out)
[22:45] *** K4k has joined #archiveteam
[22:46] *** chfoo has quit IRC (Read error: Connection reset by peer)
[22:47] Start: not news
[22:55] *** BlueMaxim has joined #archiveteam
[22:55] *** Start has quit IRC (Quit: Disconnected.)
[22:56] *** Start-mob has joined #archiveteam
[23:00] *** scyther has quit IRC (Quit: Leaving)
[23:03] *** chfoo has joined #archiveteam
[23:06] *** bwn has joined #archiveteam
[23:10] *** NMNN has quit IRC (Quit: Leaving)
[23:24] *** tjg has quit IRC (Quit: Quitting.)
[23:24] *** tjg has joined #archiveteam
[23:25] *** guest9525 has joined #archiveteam
[23:27] *** Darkstar has quit IRC (http://quassel-irc.org - Chat comfortably. Anywhere.)
[23:28] *** Darkstar has joined #archiveteam
[23:28] *** Start-mob has quit IRC (Quit: Leaving)
[23:29] *** K4k has quit IRC (Quit: WeeChat 1.3)
[23:30] *** guest9525 has quit IRC (Ping timeout: 240 seconds)
[23:50] *** Start has joined #archiveteam