Time |
Nickname |
Message |
09:22
|
|
wessel151 has joined #webroasting |
09:24
|
wessel151 |
are there plans to save Ziggo home.nl |
09:25
|
wessel151 |
and chello.nl |
10:36
|
wessel151 |
@jaa |
10:36
|
wessel151 |
@JAA |
13:44
|
JAA |
wessel151: No specific plans yet, but it's on our radar, and we'll certainly try. |
14:16
|
JAA |
wessel151: Do you know if there is a list of webspaces/usernames somewhere? |
14:18
|
JAA |
Also, list of affected webspaces: http://members.casema.nl/USERNAME/ http://home.casema.nl/USERNAME/ http://members.home.nl/USERNAME/ http://members.quicknet.nl/USERNAME/ http://members.upc.nl/USERNAME/ http://members.chello.nl/USERNAME/ http://members.ziggo.nl/USERNAME/ |
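The URL patterns in the message above can be expanded for any given username. A minimal sketch, assuming each pattern takes the username as a single path segment; the helper name `candidate_urls` is hypothetical and not something used in the channel:

```python
from typing import List
from urllib.parse import quote

# Webspace URL patterns quoted in the message above.
WEBSPACE_PATTERNS = [
    "http://members.casema.nl/{username}/",
    "http://home.casema.nl/{username}/",
    "http://members.home.nl/{username}/",
    "http://members.quicknet.nl/{username}/",
    "http://members.upc.nl/{username}/",
    "http://members.chello.nl/{username}/",
    "http://members.ziggo.nl/{username}/",
]


def candidate_urls(username: str) -> List[str]:
    """Return one candidate URL per webspace pattern (hypothetical helper)."""
    # Percent-encode the username so it stays a single path segment.
    safe = quote(username, safe="")
    return [pattern.format(username=safe) for pattern in WEBSPACE_PATTERNS]
```

Whether a given username exists on more than one of these hosts is not stated in the log, so each generated URL is only a candidate to check.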
14:29
|
wessel151 |
@JAA i am currently using reverse search on DuckDuckGo |
14:30
|
wessel151 |
just site:http://members.home.nl/ |
14:30
|
JAA |
Right |
14:31
|
JAA |
We could use a good Dutch word list of likely candidates for archival. |
14:31
|
JAA |
Here's one I used before for TalkTalk (UK): 'family' 'genealogy' 'club' 'society' 'clan' 'company' 'ltd' 'home' 'index' 'wedding' 'school' 'college' 'archive' 'history' 'document' 'church' 'band' 'manual' 'product' |
14:35
|
wessel151 |
you want a Dutch dictionary of the same sort |
14:36
|
JAA |
Yeah, just words commonly used on websites that would be hosted on such ISP webspaces. |
14:43
|
wessel151 |
i found the first 20,000 names already |
14:46
|
wessel151 |
JAA can i upload them somewhere |
14:47
|
wessel151 |
maybe a new GitHub project |
14:58
|
JAA |
wessel151: Do you mean the word list or usernames? |
14:59
|
wessel151 |
word list |
14:59
|
JAA |
Hmm, yeah, it has to be quite small to be effective. I want to use it for search engine scraping. |
15:00
|
JAA |
Bing's the only search engine that lets you persistently scrape, but only at a very slow pace. |
15:02
|
JAA |
So something like a few dozen to maybe a couple hundred words. |
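The word-list approach discussed above amounts to pairing each webspace host with each keyword to form search queries. A rough sketch under stated assumptions: the `site:` query form and the helper name `build_queries` are illustrative, the three English words are stand-ins taken from the TalkTalk list (no Dutch list exists in the log), and no search engine is actually contacted:

```python
from itertools import product
from typing import Iterable, List

# Hosts taken from the webspace list earlier in the log.
HOSTS = [
    "members.casema.nl",
    "home.casema.nl",
    "members.home.nl",
    "members.quicknet.nl",
    "members.upc.nl",
    "members.chello.nl",
    "members.ziggo.nl",
]

# Placeholder keywords: the first entries of the TalkTalk (UK) list.
# A Dutch list would replace these; none is given in the log.
WORDS = ["family", "genealogy", "club"]


def build_queries(hosts: Iterable[str], words: Iterable[str]) -> List[str]:
    """Return one 'site:HOST WORD' query per host/word pair (assumed syntax)."""
    return [f"site:{host} {word}" for host, word in product(hosts, words)]
```

With 7 hosts, a list of 100 words already yields 700 queries, which illustrates why a short list matters when the scraping pace is slow.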
15:26
|
wessel151 |
JAA: can we use this https://gathering.tweakers.net/forum/find?keyword=http%3A%2F%2Fmembers.home.nl%2F#filter:q1bKTq0szy9KUbJSyigpKbCK0Y_Rz03NTUotKtbLyM9N1cvLidFX0lEqSExPVbIyhDCCM6tAHAODWgA |
15:34
|
JAA |
wessel151: Sure. Someone just needs to write a scraper for it. |
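One piece of such a scraper would be pulling webspace usernames out of already fetched page text. A minimal sketch, assuming usernames consist of letters, digits, dots, underscores, hyphens and tildes; the function name `extract_webspaces` is hypothetical, and fetching the forum pages is deliberately left out:

```python
import re
from typing import List, Tuple

# Hosts taken from the webspace list earlier in the log.
HOSTS = [
    "members.casema.nl",
    "home.casema.nl",
    "members.home.nl",
    "members.quicknet.nl",
    "members.upc.nl",
    "members.chello.nl",
    "members.ziggo.nl",
]

# Match http(s)://HOST/USERNAME; the username character set is an assumption.
WEBSPACE_RE = re.compile(
    r"https?://(" + "|".join(re.escape(h) for h in HOSTS) + r")/([A-Za-z0-9._~-]+)",
    re.IGNORECASE,
)


def extract_webspaces(text: str) -> List[Tuple[str, str]]:
    """Return sorted, de-duplicated (host, username) pairs found in text."""
    found = {(m.group(1).lower(), m.group(2)) for m in WEBSPACE_RE.finditer(text)}
    return sorted(found)
```

A URL that points at a file directly under the host root would be reported as a username here, so the results would still need a manual look.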
15:36
|
wessel151 |
i can do it manually |
15:36
|
wessel151 |
it's very small |
17:04
|
|
Ryz has joined #webroasting |
17:13
|
|
dock has joined #webroasting |
17:19
|
|
dock has quit IRC (Quit: Page closed) |
18:08
|
|
qwebirc12 has joined #webroasting |
18:08
|
qwebirc12 |
test |
18:08
|
|
qwebirc12 has left |
18:09
|
|
wessel152 has joined #webroasting |
20:46
|
wessel152 |
JAA can you do something with this https://cse.google.nl/cse?cx=013325157016033957817:6qqkwllcjas |
20:46
|
wessel152 |
i can make API calls to it |
21:07
|
|
wessel152 has quit IRC (Ping timeout: 260 seconds) |
23:07
|
|
logchfoo1 starts logging #webroasting at Mon Mar 02 23:07:06 2020 |
23:07
|
|
logchfoo1 has joined #webroasting |