Time | Nickname | Message
11:59 | emijrp | hey guys
12:01 | emijrp | i would like to scan the internet for mediawiki wikis
12:01 | emijrp | sort by api or not api, size, and last edit (probably abandoned wikis)
12:02 | emijrp | then download them, from the most inactive to the most active ones
12:02 | emijrp | there is a wikicrawler, but the .csv is from 2009
12:05 | ersi | sounds great
12:06 | ersi | I'm working on a little webcrawler to just find <a href=""> links and extract whatever is in href="" - haven't come far yet, but I bet one could put in some "signatures" of a mediawiki page and make it find that as well
12:16 | emijrp | http://www.lolcatbible.com/index.php?title=Main_Page
12:19 | emijrp | https://www.google.es/#q=%22This+page+was+last+modified+on%22+%22This+page+has+been+accessed%22
12:19 | emijrp | is there a way to avoid dupe domains in google results?
12:53 | Nemo_bis | I guess that's a question for alard