#webroasting 2020-03-02,Mon

Time Nickname Message
09:22 🔗 wessel151 has joined #webroasting
09:24 🔗 wessel151 are there plans to save ziggo home.nl
09:25 🔗 wessel151 and chello.nl
10:36 🔗 wessel151 @jaa
10:36 🔗 wessel151 @JAA
13:44 🔗 JAA wessel151: No specific plans yet, but it's on our radar, and we'll certainly try.
14:16 🔗 JAA wessel151: Do you know if there is a list of webspaces/usernames somewhere?
14:18 🔗 JAA Also, list of affected webspaces: http://members.casema.nl/USERNAME/ http://home.casema.nl/USERNAME/ http://members.home.nl/USERNAME/ http://members.quicknet.nl/USERNAME/ http://members.upc.nl/USERNAME/ http://members.chello.nl/USERNAME/ http://members.ziggo.nl/USERNAME/
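The webspace patterns above can be expanded into concrete candidate URLs once usernames are known. The following is a minimal sketch, not the channel's actual tooling: the templates are copied from the message above, while the function name `candidate_urls` and the sample username are hypothetical.

```python
# URL templates copied from the list above; USERNAME is the placeholder.
TEMPLATES = [
    "http://members.casema.nl/USERNAME/",
    "http://home.casema.nl/USERNAME/",
    "http://members.home.nl/USERNAME/",
    "http://members.quicknet.nl/USERNAME/",
    "http://members.upc.nl/USERNAME/",
    "http://members.chello.nl/USERNAME/",
    "http://members.ziggo.nl/USERNAME/",
]


def candidate_urls(usernames):
    """Return each template with USERNAME filled in, for every username.

    Hypothetical helper: blank entries are skipped, order is preserved.
    """
    urls = []
    for name in usernames:
        name = name.strip()
        if not name:
            continue
        for template in TEMPLATES:
            urls.append(template.replace("USERNAME", name))
    return urls


if __name__ == "__main__":
    # "example" is a made-up username, used only for illustration.
    for url in candidate_urls(["example"]):
        print(url)
```

Trying every username against every host is an assumption here; a username found on one host may not exist on the others.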
14:29 🔗 wessel151 @JAA i am currently using reverse search on DuckDuckGo
14:30 🔗 wessel151 just site:http://members.home.nl/
14:30 🔗 JAA Right
14:31 🔗 JAA We could use a good Dutch word list of likely candidates for archival.
14:31 🔗 JAA Here's one I used before for TalkTalk (UK): 'family' 'genealogy' 'club' 'society' 'clan' 'company' 'ltd' 'home' 'index' 'wedding' 'school' 'college' 'archive' 'history' 'document' 'church' 'band' 'manual' 'product'
14:35 🔗 wessel151 you want a Dutch dictionary of the same sort
14:36 🔗 JAA Yeah, just words commonly used on websites that would be hosted on such ISP webspaces.
14:43 🔗 wessel151 i found the first 20,000 names already
14:46 🔗 wessel151 JAA can i upload them somewhere
14:47 🔗 wessel151 maybe a new GitHub project
14:58 🔗 JAA wessel151: Do you mean the word list or usernames?
14:59 🔗 wessel151 word list
14:59 🔗 JAA Hmm, yeah, it has to be quite small to be effective. I want to use it for search engine scraping.
15:00 🔗 JAA Bing's the only search engine that lets you persistently scrape, but only at a very slow pace.
15:02 🔗 JAA So something like a few dozen to maybe a couple hundred words.
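To illustrate why the word list has to stay small: combining each webspace host with each word in a `site:` query multiplies quickly. This is only a sketch under assumptions. The `site:` form follows the search mentioned earlier in the log, the sample words come from the TalkTalk list above as stand-ins for a Dutch list that does not exist yet, and `build_queries` is a hypothetical name. Sending the queries (and pacing them) is deliberately left out.

```python
from itertools import product

# Hosts taken from the affected webspaces listed earlier in the log.
HOSTS = [
    "members.casema.nl",
    "home.casema.nl",
    "members.home.nl",
    "members.quicknet.nl",
    "members.upc.nl",
    "members.chello.nl",
    "members.ziggo.nl",
]

# Stand-in words from the TalkTalk (UK) list; a Dutch list would replace these.
WORDS = ["family", "genealogy", "club", "society", "school", "church"]


def build_queries(hosts, words):
    """Return one 'site:HOST WORD' query string per (host, word) pair."""
    return [f"site:{host} {word}" for host, word in product(hosts, words)]


if __name__ == "__main__":
    queries = build_queries(HOSTS, WORDS)
    # 7 hosts x 6 words = 42 queries; a few hundred words would mean thousands.
    print(len(queries))
    print(queries[0])
```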
15:26 🔗 wessel151 JAA: can we use this https://gathering.tweakers.net/forum/find?keyword=http%3A%2F%2Fmembers.home.nl%2F#filter:q1bKTq0szy9KUbJSyigpKbCK0Y_Rz03NTUotKtbLyM9N1cvLidFX0lEqSExPVbIyhDCCM6tAHAODWgA
15:34 🔗 JAA wessel151: Sure. Someone just needs to write a scraper for it.
15:36 🔗 wessel151 i can do it manually
15:36 🔗 wessel151 it's very small
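Whether the links are collected by hand or by a scraper, the usernames still have to be pulled out of them. A rough sketch of that step, assuming the input is plain text (for example, copied search results) containing unencoded URLs; the function name `extract_usernames` and the sample text are hypothetical, and nothing here is specific to the Tweakers forum page.

```python
import re

# Hosts taken from the affected webspaces listed earlier in the log.
HOSTS = [
    "members.casema.nl",
    "home.casema.nl",
    "members.home.nl",
    "members.quicknet.nl",
    "members.upc.nl",
    "members.chello.nl",
    "members.ziggo.nl",
]

# Match http(s)://HOST/USERNAME and capture the host and first path segment.
_PATTERN = re.compile(
    r"https?://(" + "|".join(re.escape(h) for h in HOSTS) + r")/([^/\s\"'<>?#]+)",
    re.IGNORECASE,
)


def extract_usernames(text):
    """Return a sorted list of unique (host, username) pairs found in text."""
    found = set()
    for host, username in _PATTERN.findall(text):
        found.add((host.lower(), username))
    return sorted(found)


if __name__ == "__main__":
    # Made-up sample text, for illustration only.
    sample = (
        "see http://members.home.nl/alice/index.html and "
        "http://members.chello.nl/bob/ but not http://example.com/carol/"
    )
    for host, username in extract_usernames(sample):
        print(host, username)
```

Percent-encoded links (such as `http%3A%2F%2F...` inside a query string) would need decoding first; that is not handled in this sketch.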
17:04 🔗 Ryz has joined #webroasting
17:13 🔗 dock has joined #webroasting
17:19 🔗 dock has quit IRC (Quit: Page closed)
18:08 🔗 qwebirc12 has joined #webroasting
18:08 🔗 qwebirc12 test
18:08 🔗 qwebirc12 has left
18:09 🔗 wessel152 has joined #webroasting
20:46 🔗 wessel152 JAA can you do something with this https://cse.google.nl/cse?cx=013325157016033957817:6qqkwllcjas
20:46 🔗 wessel152 i can make API calls to it
21:07 🔗 wessel152 has quit IRC (Ping timeout: 260 seconds)
23:07 🔗 logchfoo1 starts logging #webroasting at Mon Mar 02 23:07:06 2020
23:07 🔗 logchfoo1 has joined #webroasting
