Time | Nickname | Message |
00:17 | | Start has joined #webroasting |
01:35 | | chfoo has quit IRC (Ping timeout: 499 seconds) |
01:35 | | chfoo has joined #webroasting |
01:36 | | svchfoo3 sets mode: +o chfoo |
02:08 | | Start has quit IRC (Read error: Connection reset by peer) |
02:08 | | Start has joined #webroasting |
02:20 | | dashcloud has quit IRC (Read error: Operation timed out) |
02:23 | | dashcloud has joined #webroasting |
02:41 | | chfoo0 has joined #webroasting |
02:45 | | chfoo has quit IRC (Ping timeout: 260 seconds) |
02:47 | | chfoo0 is now known as chfoo |
15:11 | | Start has quit IRC (Quit: Disconnected.) |
15:47 | | Start has joined #webroasting |
17:08 | | Start has quit IRC (Quit: Disconnected.) |
19:20 | | Start has joined #webroasting |
20:26 | Start | so i'd like to start automating the web hosting project |
20:26 | Start | probably through a warrior project |
20:27 | Start | it would work something like this: |
20:27 | Start | 1. get a list of urls from google, bing, etc. |
20:27 | Start | *manually get |
20:28 | Start | 2. put the urls into the project |
20:28 | Start | 3. sites are grabbed and any other isp hosted sites linked to are grabbed as well |
20:32 | Start | one issue is how sites with hidden subdirectories would be handled (e.g. http://homepage.ntlworld.com/ashen1/ and http://homepage.ntlworld.com/ashen1/ashen/) |
20:34 | Start | discovery will be trivial for hostings with numerical usernames |
20:35 | Start | a dictionary based method of discovery (common words and names with numbers) could work for the rest |
20:36 | Start | maybe setup an irc bot where people can add lists of isp hosted sites to the grab |
20:45 | | Start has quit IRC (Quit: Disconnected.) |
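
The two discovery ideas from the log (numerical usernames, and common words or names with numbers appended) could be sketched roughly as below. This is only an illustration under assumptions: the function names, the number ranges, and the word list are hypothetical, and the base URL is simply the host quoted in the log. It only generates candidate URLs and does not fetch anything.

```python
from typing import Iterable, Iterator

# Host taken from the example in the log; any ISP hosting base URL could be used.
BASE = "http://homepage.ntlworld.com/"


def numeric_candidates(base: str, start: int, stop: int, width: int = 0) -> Iterator[str]:
    """Yield user-directory URLs for numeric usernames in [start, stop).

    `width` zero-pads the number (hypothetical option, for hosts that use
    fixed-length numeric usernames); 0 means no padding.
    """
    for n in range(start, stop):
        yield f"{base}{str(n).zfill(width)}/"


def dictionary_candidates(base: str, words: Iterable[str], max_suffix: int) -> Iterator[str]:
    """Yield URLs for each word alone, then with numeric suffixes 1..max_suffix."""
    for word in words:
        yield f"{base}{word}/"
        for n in range(1, max_suffix + 1):
            yield f"{base}{word}{n}/"


if __name__ == "__main__":
    # Tiny, made-up inputs purely for demonstration.
    for url in numeric_candidates(BASE, 1, 4):
        print(url)
    for url in dictionary_candidates(BASE, ["ashen"], 2):
        print(url)
```

Generators are used so that a large word list or number range never has to be held in memory at once; the resulting URLs would still need to be checked for existence before being added to a grab.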