[00:00] *** closure_ has quit IRC (Read error: Operation timed out)
[00:00] *** closure has joined #archiveteam-bs
[00:01] *** Kaz has quit IRC (Read error: Operation timed out)
[00:01] *** kyledrake has quit IRC (Read error: Operation timed out)
[00:01] *** phuzion has quit IRC (Read error: Operation timed out)
[00:01] *** second_ has quit IRC (Read error: Operation timed out)
[00:02] *** kyledrake has joined #archiveteam-bs
[00:02] *** Kaz has joined #archiveteam-bs
[00:03] *** atomicthu has quit IRC (Read error: Operation timed out)
[00:03] *** second has joined #archiveteam-bs
[00:04] *** phuzion has joined #archiveteam-bs
[00:04] *** Ryz has quit IRC (Ping timeout: 496 seconds)
[00:04] *** pikami_ has quit IRC (Ping timeout: 496 seconds)
[00:05] *** atomicthu has joined #archiveteam-bs
[00:05] *** Larsenv has quit IRC (Read error: Operation timed out)
[00:05] *** Larsenv has joined #archiveteam-bs
[00:05] *** svchfoo3 has quit IRC (Ping timeout: 496 seconds)
[00:05] *** pikami has joined #archiveteam-bs
[00:06] *** Raccoon has joined #archiveteam-bs
[00:08] *** antomati_ has joined #archiveteam-bs
[00:08] *** Datechnom has quit IRC (Ping timeout: 496 seconds)
[00:10] *** Hooloovoo has quit IRC (Read error: Operation timed out)
[00:10] *** Hooloovoo has joined #archiveteam-bs
[00:14] *** Jonimoose has quit IRC (Ping timeout: 496 seconds)
[00:15] *** antomatic has quit IRC (Ping timeout: 496 seconds)
[00:19] *** Jonimoose has joined #archiveteam-bs
[00:47] *** SynMonger has quit IRC (Quit: Wait, what?)
[00:51] *** SynMonger has joined #archiveteam-bs
[01:08] *** svchfoo3 has joined #archiveteam-bs
[01:08] *** svchfoo1 sets mode: +o svchfoo3
[01:08] *** Datechnom has joined #archiveteam-bs
[01:21] *** Ryz has joined #archiveteam-bs
[01:27] *** PovAddict has joined #archiveteam-bs
[03:49] *** qw3rty_ has joined #archiveteam-bs
[03:57] *** qw3rty has quit IRC (Read error: Operation timed out)
[04:23] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[04:44] *** prq has quit IRC (Read error: Connection reset by peer)
[04:54] *** Stiletto has joined #archiveteam-bs
[04:58] *** Stilett0 has quit IRC (Ping timeout: 272 seconds)
[05:02] We saved and returned upcoming.org.
[05:03] And it was bought back from Yahoo! and put back up
[05:03] The business has not been in great shape, but he did do it.
[05:03] *** Ctrl has quit IRC (Read error: Operation timed out)
[05:06] *** duh has joined #archiveteam-bs
[05:06] *** Stilett0 has joined #archiveteam-bs
[05:08] *** Stiletto has quit IRC (Ping timeout: 260 seconds)
[05:10] *** BnAboyZ has quit IRC (Ping timeout: 857 seconds)
[05:13] *** legoktm has quit IRC (Read error: Connection reset by peer)
[05:14] *** antomatic has joined #archiveteam-bs
[05:15] *** BnAboyZ has joined #archiveteam-bs
[05:16] *** SJon___ has quit IRC (Ping timeout: 857 seconds)
[05:21] *** SJon___ has joined #archiveteam-bs
[05:25] *** antomati_ has quit IRC (Read error: Operation timed out)
[05:55] *** nuc has joined #archiveteam-bs
[05:55] *** nuc is now known as Somebody2
[05:57] *** Ctrl has joined #archiveteam-bs
[06:51] *** PovAddict has quit IRC (Quit: Konversation terminated!)
[07:56] *** WalkFly has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[10:07] *** VADemon has joined #archiveteam-bs
[10:24] *** VADemon has quit IRC (left4dead)
[10:35] *** BeefyBoot has quit IRC (Quit: Connection closed for inactivity)
[12:01] *** BlueMax has quit IRC (Quit: Leaving)
[12:09] *** jshoard has joined #archiveteam-bs
[12:37] *** Stiletto has joined #archiveteam-bs
[12:37] *** Stilett0 has quit IRC (Ping timeout: 260 seconds)
[12:39] *** yano has quit IRC (Quit: WeeChat, The Better IRC Client, https://weechat.org/)
[12:39] *** yano has joined #archiveteam-bs
[12:46] *** Raccoon has quit IRC (Ping timeout: 265 seconds)
[14:11] *** Mateon1 has quit IRC (Remote host closed the connection)
[14:11] *** Mateon1 has joined #archiveteam-bs
[14:42] *** lennier1 has quit IRC (Read error: No route to host)
[14:43] *** lennier1 has joined #archiveteam-bs
[15:56] *** DogsRNice has joined #archiveteam-bs
[16:14] *** Debiloid has joined #archiveteam-bs
[16:15] Esketit
[16:15] *** drcd_ has joined #archiveteam-bs
[16:15] Uhhh
[16:16] Vanya Varya A Anya
[16:17] Debiloid: How can we help you?
[16:17] *** vanek_shn has joined #archiveteam-bs
[16:17] where am I
[16:18] You are here
[16:18] oh, hey there, moron
[16:18] whaat
[16:19] vanek_shn People here use English
[16:19] by the way
[16:19] and I don't give a crap about English
[16:19] Jaa do you save archive.org.ua?
[16:19] anyway, Sanya
[16:19] Ukrainian site
[16:19] download the game Moorhuhn: Legends of Karting
[16:21] Jaa, it is like Web-Archive but is not open
[16:21] Debiloid: Interesting website, thank you. I have never seen it before. We do not currently save it. Is it at risk?
[16:21] *** drcd has quit IRC (Ping timeout: 745 seconds)
[16:21] *** drcd_ is now known as drcd
[16:21] shut up, Yank
[16:21] Moorhuhn
[16:22] Sanya, how do I change my nick
[16:22] Jaa some pages are saved very badly. vanek_shn later
[16:22] jaa is a loser with a tiny dick
[16:24] damn it, Moorhuhn won't download
[16:25] Jaa also there were pictures but now there are not
[16:27] i love fuck cats
[16:27] and dogs too
[16:27] and JAA
[16:27] *** JAA sets mode: +b *!*4dde6317@*.mibbit.com
[16:27] *** vanek_shn was kicked by JAA (vanek_shn)
[16:27] Jaa looks like they try to clear their site
[16:28] Debiloid: Do you know who operates it?
[16:29] Jaa there was email
[16:29] Maybe just send him hello and ask him about pages and everything
[16:30] Ah, yes, found it.
[16:31] If he or she loves web-archive he or she can work with you
[16:31] It looks like the Internet Archive covered a part of the website in 2015.
[16:31] With what tool?
[16:31] By users?
[16:32] Their internal crawling tool, Heritrix.
[16:32] But that is almost certainly very incomplete.
[16:33] But 2015 was 5 years ago
[16:33] Yeah
[16:33] There can be new pages
[16:33] Now Ukraine has new president
[16:35] Btw, it's just text. No links. Jaa it's not like Web-archive, links don't work
[16:38] Jaa when will archive team ask creator? Some pages look like shit and not like originals. But older pages are not
[16:38] Yes, I saw that. Makes it easier to archive.
[16:38] I will look into it in more detail soon.
[16:39] It's like another collection on web archive for this site? Like for google plus?
[16:39] Will there be collection
[16:40] Maybe. It depends on how large it is in total.
[16:40] I see numbers of 30 and 18 million pages mentioned on the website.
[16:41] But if that is all text, it would not be very large. So maybe just one item on Internet Archive.
[16:41] So will there be collections for webcitation and other?
[16:41] It will be like archive.org/details/archive.org.ua?
[16:42] Something similar to that, yes.
[16:42] It will also be in the Wayback Machine at web.archive.org.
[16:43] Similar? But like what?
[16:43] I do not know yet. It depends on the size, when we download it, etc.
[16:44] You will see it on https://web.archive.org/web/*/https://archive.org.ua/ when it is done.
[16:44] *** Arcorann has quit IRC (Read error: Connection reset by peer)
[16:45] But is it too big? Here are archive-bots for not very big sites
[16:45] I do not know how big it is. Have to download it first.
[16:46] Okay
[16:46] hum https://archive.org.ua/robots.txt
[16:47] the current robots.txt bans the index page
[16:48] No, it is ineffective.
[16:48] It's /search/example.org/, no query.
[16:49] Jaa maybe just archive-bots? But this archive will not be in its collection
[16:49] Maybe /search/?... was a previous website structure. The site worked differently in 2017.
[16:49] https://archive.org.ua/s/?s=013.kiev.ua&d=2006-12
[16:49] nico_32_: /s/, not /search/ :-)
[16:49] It will be in archive-bot collection?
[16:49] yes
[16:49] Debiloid: It is too big for ArchiveBot.
[16:50] Okay
[16:50] it was probably a /search/? => /s/?
[16:50] refactor
[16:50] Yeah, could be.
[16:50] as the collection grew
[16:50] Last page of the domain index is https://archive.org.ua/list/?&offset=464600 by the way.
[16:51] It is like archive of archive
[16:55] Jaa why https://community.arm.com/ isn't too big when it has about 25 mln pages but archive.org.ua is too big?
[16:55] When it has about 18 mln pages you said
[16:59] A
[16:59] Jaa?
[17:01] Ff
[17:03] Debiloid: community.arm.com is also too big for ArchiveBot, but I did not know that when I started it.
[17:03] It still retrieves data, but it gets slow when a job is more than a few million URLs.
[17:05] Lol
[17:06] Maybe save with many parts
[17:06] Jaa
[17:06] Don't worry, I will find a solution.
[17:09] *** Debiloid has quit IRC (Quit: https://mibbit.com Online IRC Client)
[17:14] *** Mateon1 has quit IRC (Ping timeout: 272 seconds)
[17:15] *** Mateon1 has joined #archiveteam-bs
[17:32] I'll throw the TechNet forums, wikis, and (already migrated and "archived") blogs into AB. It appears we never really properly covered those, and while Microsoft said they'll keep it read-only, we all know how that will go.
[17:35] By the way, here's an overview of the steps: https://docs.microsoft.com/en-us/teamblog/msdn-technet-migration
[17:36] Already outdated as it's from late last year. Gallery was supposed to disappear in June but is still online.
[17:43] *** Debiloid has joined #archiveteam-bs
[17:43] A
[17:44] Jaa it also saved something related to google https://archive.org.ua/search/?s=Google
[17:45] There are also lists of archives https://archive.org.ua/search/?s=archive
[17:45] Ah, so this is what the robots.txt blocks, nico_32_. ^
[17:46] https://archive.org.ua/search/?s=arhiv also this for archives
[17:46] It may be useful for collecting Ukrainian archive-sites
[17:46] Jaa?
[17:47] Yes, and the domain list in general will be quite interesting to get a sample of the Ukrainian WWW.
[17:48] Some of them are Russian i think
[17:48] Sure, I was referring to the .ua domains.
[17:49] Not all .ua domains are Ukrainian and not all Ukrainian domains are .ua
[17:49] Yes, that's why I said "sample".
[17:50] Hmmmm
[17:52] https://archive.org.ua/search/?s=Bibl jaa related to Bible and libraries
[17:54] Debiloid: We will archive the entire website anyway.
[17:57] *** Debiloid has quit IRC (https://mibbit.com Online IRC Client)
[18:02] Unsurprisingly, the MSDN and TechNet forums are huge. Both have 2-3 million threads.
[18:04] And that's only the en-US ones.
[18:11] *** DLoader has quit IRC (Quit: DLoader)
[18:28] *** DLoader has joined #archiveteam-bs
[18:56] JAA: and they already purged a lot of old content
[18:58] nico_32_: You mean the relaunch in 2008?
[18:59] when they removed win9x & 2k content
[18:59] and anything older
[19:02] Ah
[19:09] Looks like there's overlap between the two. Some threads appear in both.
[19:09] I also found https://social.microsoft.com/Forums/, which uses the same platform.
[19:11] And there was also something at http://social.expression.microsoft.com/forums/ apparently, but that doesn't seem to exist anymore.
[20:08] *** PovAddict has joined #archiveteam-bs
[20:45] *** Ctrl has quit IRC (Ping timeout: 857 seconds)
[21:30] *** HP_Archiv has joined #archiveteam-bs
[21:31] *** Ctrl has joined #archiveteam-bs
[21:32] *** HP_Archiv has quit IRC (Client Quit)
[21:33] *** HP_Archiv has joined #archiveteam-bs
[21:48] *** Mayeau has joined #archiveteam-bs
[21:52] *** Mayonaise has quit IRC (Read error: Operation timed out)
[21:58] *** HP_Archiv has quit IRC (Quit: Leaving)
[22:02] *** DogsRNice has quit IRC (Ping timeout: 265 seconds)
[22:22] *** superkuh has quit IRC (Read error: Connection reset by peer)
[22:35] *** superkuh has joined #archiveteam-bs
[22:46] *** Arcorann has joined #archiveteam-bs
[22:47] *** Arcorann has quit IRC (Remote host closed the connection)
[22:48] *** Arcorann has joined #archiveteam-bs
[22:56] *** jshoard has quit IRC (Leaving)
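
Note on the robots.txt exchange (16:46 to 16:50, and again at 17:45): the point made there is that a rule written for /search/?... does not cover pages served under /s/ or /search/example.org/. The sketch below illustrates that with plain prefix matching, which is the basic robots.txt matching model without wildcards. The log never quotes the actual robots.txt, so the rule "/search/?" is an assumption, and is_blocked is a hypothetical helper, not part of any ArchiveTeam tool.

```python
from urllib.parse import urlsplit

# ASSUMPTION: the real robots.txt is not quoted in the log; this single
# rule is a guess based on the URLs discussed in the channel.
DISALLOWED_PREFIXES = ["/search/?"]


def is_blocked(url: str, prefixes=DISALLOWED_PREFIXES) -> bool:
    """Return True if the URL's path (plus query string, if any)
    starts with one of the disallowed prefixes."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return any(target.startswith(prefix) for prefix in prefixes)


if __name__ == "__main__":
    # URLs taken from the log, plus the /search/example.org/ form mentioned at 16:48.
    urls = [
        "https://archive.org.ua/search/?s=Google",
        "https://archive.org.ua/search/example.org/",
        "https://archive.org.ua/s/?s=013.kiev.ua&d=2006-12",
        "https://archive.org.ua/list/?&offset=464600",
    ]
    for url in urls:
        print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

Under that assumed rule, only the keyword search URL is matched; the per-domain pages under /s/ and the domain index under /list/ are not, which is consistent with the "it is ineffective" remark.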