#webroasting 2016-03-11,Fri


Time Nickname Message
00:17 🔗 Start has joined #webroasting
01:35 🔗 chfoo has quit IRC (Ping timeout: 499 seconds)
01:35 🔗 chfoo has joined #webroasting
01:36 🔗 svchfoo3 sets mode: +o chfoo
02:08 🔗 Start has quit IRC (Read error: Connection reset by peer)
02:08 🔗 Start has joined #webroasting
02:20 🔗 dashcloud has quit IRC (Read error: Operation timed out)
02:23 🔗 dashcloud has joined #webroasting
02:41 🔗 chfoo0 has joined #webroasting
02:45 🔗 chfoo has quit IRC (Ping timeout: 260 seconds)
02:47 🔗 chfoo0 is now known as chfoo
15:11 🔗 Start has quit IRC (Quit: Disconnected.)
15:47 🔗 Start has joined #webroasting
17:08 🔗 Start has quit IRC (Quit: Disconnected.)
19:20 🔗 Start has joined #webroasting
20:26 🔗 Start so i'd like to start automating the web hosting project
20:26 🔗 Start probably through a warrior project
20:27 🔗 Start it would work something like this:
20:27 🔗 Start 1. get a list of urls from google, bing, etc.
20:27 🔗 Start *manually get
20:28 🔗 Start 2. put the urls into the project
20:28 🔗 Start 3. sites are grabbed and any other isp hosted sites linked to are grabbed as well
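Step 3 above (following links to other ISP-hosted sites) could be sketched roughly as below. This is only an illustration, not the project's actual grab script: `ISP_HOSTS`, `LinkCollector`, and `isp_hosted_links` are hypothetical names, and the host list is assumed to be maintained by hand. It uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Assumption: a hand-maintained set of ISP hosting hostnames.
# Only homepage.ntlworld.com appears in the log; a real list would be longer.
ISP_HOSTS = {"homepage.ntlworld.com"}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def isp_hosted_links(base_url, html, hosts=ISP_HOSTS):
    """Return absolute links from `html` whose hostname is in `hosts`."""
    parser = LinkCollector()
    parser.feed(html)
    found = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        host = urlsplit(absolute).hostname
        if host in hosts and absolute not in found:
            found.append(absolute)
    return found
```

Links returned this way would be queued as new items. A subdirectory is only picked up if some grabbed page links to it, so unlinked subdirectories still need a separate discovery step.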
20:32 🔗 Start one issue is how sites with hidden subdirectories would be handled (e.g. http://homepage.ntlworld.com/ashen1/ and http://homepage.ntlworld.com/ashen1/ashen/)
20:34 🔗 Start discovery will be trivial for hostings with numerical usernames
20:35 🔗 Start a dictionary based method of discovery (common words and names with numbers) could work for the rest
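The two discovery ideas above (numerical usernames, and dictionary words with numbers appended) could be sketched as candidate-URL generators like the ones below. The function names, the URL layout `base + username + "/"`, and the suffix range are all assumptions made for illustration. Each candidate would still need to be requested to see whether it exists.

```python
def numeric_candidates(base, start, stop, width=0):
    """Yield candidate URLs for hosts that use numerical usernames.

    `width` zero-pads the number (e.g. width=3 gives 001, 002, ...).
    """
    for n in range(start, stop):
        yield f"{base}{str(n).zfill(width)}/"


def dictionary_candidates(base, words, max_suffix):
    """Yield candidate URLs built from words, bare and with 1..max_suffix appended."""
    for word in words:
        yield f"{base}{word}/"
        for n in range(1, max_suffix + 1):
            yield f"{base}{word}{n}/"


if __name__ == "__main__":
    # Hypothetical word list; a real run would load common words and names from a file.
    for url in dictionary_candidates("http://homepage.ntlworld.com/", ["ashen"], 2):
        print(url)
```

Generators keep memory use flat even when the word list and suffix range multiply out to millions of candidates.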
20:36 🔗 Start maybe set up an irc bot where people can add lists of isp hosted sites to the grab
20:45 🔗 Start has quit IRC (Quit: Disconnected.)
