[00:57] *** hook54321 has joined #webroasting
[01:39] *** logchfoo1 starts logging #webroasting at Sun Sep 16 01:39:22 2018
[01:39] *** logchfoo1 has joined #webroasting
[02:23] *** JAA has joined #webroasting
[02:24] *** bakJAA sets mode: +o JAA
[02:27] *** hook54321 has quit IRC (se.hub irc.underworld.no)
[02:44] http://studenten.freepage.de/meph/ascii/
[03:36] Raccoon, I looked at your paste sites list; it is very impressive. We should try enumerating the URLs like terroroftinytown does.
[03:36] If you can script something to better tidy and test them, that'd be awesome.
[03:36] I'd pulled a great deal of them from my IRC logs over two decades
[03:37] then creative Google/DuckDuckGo searches for those with a common codebase
[03:39] Raccoon, even if I script something, I will not have enough resources to run this myself unless I have access to a list of all pastes on the website. The reason is that some websites rate-limit requests.
[03:39] oh. right.
[03:39] The solution is to distribute requests across many IP addresses.
[03:39] but I mean, at the very least, to enumerate their existence, birthday, deathday, etc.
[03:40] wikipage
[03:40] I am not sure what you are talking about.
[03:40] then you know where to look on archive.org
[03:40] wikipage like @ http://archiveteam.org/index.php?title=ISP_Hosting
[03:40] call it title=Paste_Hosting
[03:41] I think what you're talking about is migrating the list you are maintaining to the archiveteam wiki.
[03:41] correct.
[03:41] re: "I'm tired of trying to maintain it by hand."
[03:42] OK, now I understand the confusion.
[03:42] By enumerating the URLs, I mean we figure out how the ID system increments on each website.
[03:42] Then we attempt to request each paste and see if it exists.
[03:42] This is how terroroftinytown works.
[03:43] aye. a great many or most of them expire posts after some period, and indexes are now mostly all hidden due to SEO spam companies. in theory, archive.org would have most of the index era archived
[03:44] SEO companies have made it plain that their business model involves spamming paste sites and random web forums.
[03:44] archive.org will only have the pastes if they were posted to a URL that the archive.org web crawler can access.
[03:45] There are some pastes that are public by virtue of not being password-protected, but they are not archived because archive.org's crawler never saw the URL.
[03:45] indeed
[03:45] and today that's mostly true, since so many sites that had "recent pastes" indexes no longer support that
[03:47] To answer your original question about migrating to the wiki, I recommend you check out pandoc.
[03:47] It can convert your Markdown to MediaWiki syntax.
[03:48] Then you should be able to copy and paste the MediaWiki output into a new page.
[03:50] I ran it on paste.md and it worked.
[03:50] Would you like me to post it to the archiveteam wiki?
[04:03] *** zhongfu has joined #webroasting
[04:04] Yeah, have at
[04:05] It could probably be reformatted better still
[04:10] *** zhongfu has quit IRC (Remote host closed the connection)
[04:14] *** zhongfu has joined #webroasting
[04:22] Done: https://www.archiveteam.org/index.php?title=Paste_hosting (note: I am not going to change any of the text or formatting)
[04:22] I do not understand the "in progress" section. What is in progress?
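A minimal sketch of the ID-enumeration idea discussed above (figure out how a site's paste IDs increment, then request each one and see whether it exists). Everything here is assumed for illustration: the URL pattern https://paste.example.com/<id>, the sequential integer IDs, the delay, and the ID range. As noted in the log, a real run would need requests distributed across many IP addresses to get around rate limits; this single-machine version simply throttles itself.

```python
import time

import requests

# Hypothetical paste site with sequential numeric IDs; not a real URL pattern.
BASE_URL = "https://paste.example.com/{paste_id}"
DELAY_SECONDS = 2.0  # assumed delay to stay under per-IP rate limits


def paste_exists(paste_id: int) -> bool:
    """Return True if the paste at this ID responds with HTTP 200."""
    response = requests.get(BASE_URL.format(paste_id=paste_id), timeout=30)
    return response.status_code == 200


def enumerate_pastes(start_id: int, end_id: int):
    """Yield IDs of pastes that still exist, checking them one by one."""
    for paste_id in range(start_id, end_id + 1):
        if paste_exists(paste_id):
            yield paste_id
        time.sleep(DELAY_SECONDS)  # single-IP crawl: throttle between requests


if __name__ == "__main__":
    for found_id in enumerate_pastes(1, 1000):
        print(found_id)
```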
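The pandoc step mentioned at 03:47 might look roughly like the following. paste.md is the file named in the log; the output filename and the Python wrapper are assumptions, and the resulting MediaWiki text would then be pasted into the wiki page by hand.

```python
import subprocess

# Convert the Markdown paste list to MediaWiki syntax with pandoc.
# "paste.md" is the file mentioned in the log; the output name is an assumption.
subprocess.run(
    ["pandoc", "--from", "markdown", "--to", "mediawiki",
     "paste.md", "--output", "paste.mediawiki"],
    check=True,  # raise CalledProcessError if pandoc exits non-zero
)
```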
[04:41] *** kiskabak2 has joined #webroasting
[04:42] *** kiska1 has joined #webroasting
[04:42] *** kiska has joined #webroasting
[04:53] I was in the process of sorting and cataloguing those next
[04:53] general notes / extra stuff
[06:16] *** jut_ has joined #webroasting
[10:54] *** zhongfu has quit IRC (Ping timeout: 260 seconds)
[10:58] *** zhongfu has joined #webroasting
[22:05] *** zhongfu has quit IRC (Remote host closed the connection)
[22:06] *** zhongfu has joined #webroasting