Time |
Nickname |
Message |
18:22
🔗
|
deathy |
so how do new available tasks actually get generated/found? |
18:37
🔗
|
soultcer |
deathy: I generate them when there are only a few left. For example on tinyurl, I am doing a complete sequential scan of all codes up to 6 letters |
18:38
🔗
|
soultcer |
On bitly or isgd I just let it check random codes with 6 (bitly) or 5 (isgd) characters |
18:39
🔗
|
soultcer |
owly and ur1ca are sequential as well. With owly we are still catching up, with ur1ca we are already caught up. For both of them we are fetching shorturls faster than people are creating them |
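The two task-generation strategies soultcer describes (a sequential scan of every code up to a given length, and random sampling of fixed-length codes) could look roughly like the sketch below. It is only an illustration, not tinyback's or the tracker's actual code: the alphabet and the function names `sequential_codes` and `random_code` are assumptions, and each real shortener uses its own character set.

```python
import itertools
import random
import string

# Assumed alphabet for illustration; real shorteners differ.
ALPHABET = string.ascii_lowercase + string.digits


def sequential_codes(max_length, alphabet=ALPHABET):
    """Yield every code of length 1..max_length, shortest codes first."""
    for length in range(1, max_length + 1):
        for chars in itertools.product(alphabet, repeat=length):
            yield "".join(chars)


def random_code(length, alphabet=ALPHABET, rng=random):
    """Return one random code of exactly `length` characters."""
    return "".join(rng.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    # First few codes of a sequential scan (tinyurl-style, up to 6 letters).
    print(list(itertools.islice(sequential_codes(6), 5)))
    # One random 6-character code (bitly-style) and one 5-character code (isgd-style).
    print(random_code(6), random_code(5))
```

A sequential scan only makes sense when the code space is small enough to enumerate or when codes are handed out in order; random sampling is the fallback when neither holds.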
18:42
🔗
|
deathy |
that's interesting. no kind of scraping other things (twitter stream, other archiving efforts) for shortener links? (though no idea if that might be useful) |
18:43
🔗
|
soultcer |
swebb is fetching all shorturls from the twitter "spritzer" (a feed containing about 2% of all tweets) |
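Pulling shorturls out of a stream of tweet texts might be sketched as below. This is not swebb's actual tool: the host list, the regular expression, and the name `find_shorturls` are assumptions made for illustration, and the code does not talk to Twitter at all; it only scans strings that are handed to it.

```python
import re

# Shortener hosts mentioned in this log; the list is an assumption, not a complete set.
SHORTENER_HOSTS = ("tinyurl.com", "bit.ly", "is.gd", "ow.ly", "ur1.ca")

_SHORTURL_RE = re.compile(
    r"https?://(?:www\.)?("
    + "|".join(re.escape(host) for host in SHORTENER_HOSTS)
    + r")/([A-Za-z0-9_-]+)"
)


def find_shorturls(text):
    """Return (host, code) pairs for every known shortener link found in `text`."""
    return [(m.group(1), m.group(2)) for m in _SHORTURL_RE.finditer(text)]


if __name__ == "__main__":
    tweet = "see http://bit.ly/abc123 and https://is.gd/Xy9 but not http://example.com/foo"
    print(find_shorturls(tweet))
```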
18:49
🔗
|
deathy |
cool |
18:49
🔗
|
deathy |
obviously this is a never-ending project :) |
18:52
🔗
|
soultcer |
Yes, unless all shorteners suddenly decide to shut down :D |
19:09
🔗
|
GitHub29 |
[tinyback] soult pushed 3 new commits to master: https://github.com/soult/tinyback/compare/d49d8ef80688...588c45c9fe83 |
19:09
🔗
|
GitHub29 |
tinyback/master 588c45c David Triendl: Bump version to 2.3 |
19:09
🔗
|
GitHub29 |
tinyback/master 5f1da45 David Triendl: services.Tinyurl: Try to get original URL from affiliate URL |
19:09
🔗
|
GitHub29 |
tinyback/master 7a6a820 David Triendl: Merge branch 'add-snipurl-service' |
19:10
🔗
|
soultcer |
ersi: I have set the rate limit for snipurl to 2 requests / second for now. I'd rather stay on the safe side than get all scrapers banned |
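A client-side limit of 2 requests per second could be enforced with a small helper like the one below. This is a minimal sketch under assumptions, not how tinyback implements its limit: the class name `RateLimiter` and its injectable `clock` and `sleep` parameters are invented here so the behaviour can be checked without real waiting.

```python
import time


class RateLimiter:
    """Space calls to wait() at least 1/requests_per_second seconds apart."""

    def __init__(self, requests_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / requests_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed, then reserve the slot after it."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


if __name__ == "__main__":
    limiter = RateLimiter(2)  # 2 requests / second, as set for snipurl
    for code in ("aaa", "aab", "aac"):
        limiter.wait()
        print("would fetch", code)
```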
19:17
🔗
|
deathy |
forgot to ask... any rules on multiple threads for tinyback? since requests/second limit and things... |
19:21
🔗
|
soultcer |
deathy: Nope, the tracker will take care of it. It automatically limits you to one thread per url shortener for each IP address |
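The rule soultcer states (at most one active thread per URL shortener for each IP address) can be pictured as a set of claimed (IP, service) pairs on the tracker side. The sketch below is only an assumption about how such bookkeeping might work; the class `Tracker` and its `claim` and `release` methods are hypothetical names, not the real tracker's interface.

```python
class Tracker:
    """Allow at most one active task per (IP address, shortener service) pair."""

    def __init__(self):
        self._active = set()  # (ip, service) pairs that currently hold a task

    def claim(self, ip, service):
        """Return True and record the claim, or False if this IP already works on the service."""
        key = (ip, service)
        if key in self._active:
            return False
        self._active.add(key)
        return True

    def release(self, ip, service):
        """Mark the task as finished so the same IP can claim the service again."""
        self._active.discard((ip, service))


if __name__ == "__main__":
    tracker = Tracker()
    print(tracker.claim("192.0.2.1", "tinyurl"))  # first thread gets a task
    print(tracker.claim("192.0.2.1", "tinyurl"))  # second thread on the same service is refused
    print(tracker.claim("192.0.2.1", "bitly"))    # a different service is still available
```

With this kind of check on the server, a client can start as many threads as it likes; the extra ones simply receive no task for a service that IP is already working on.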
19:24
🔗
|
deathy |
k. good to know |
23:31
🔗
|
ersi |
soultcer: Yeah, sounds good |