Time |
Nickname |
Message |
22:37
🔗
|
BiggieJon |
when will the urlteam project resume ? |
22:53
🔗
|
ersi |
BiggieJon: It's hard to say. Currently: "whenever someone starts it again". |
22:53
🔗
|
* |
ersi moves it up his TODO list |
22:55
🔗
|
BiggieJon |
is it an issue of how to scrape sites without getting blocked/throttled? |
22:56
🔗
|
ersi |
Nope. |
22:58
🔗
|
ersi |
soultcer started the URLTeam effort and ran everything. He's unfortunately unable to continue doing so, and flagged well in advance that someone needed to take over. |
22:59
🔗
|
ersi |
I haven't gotten around to setting the environment up and generating tasks yet unfortunately :-/ |
22:59
🔗
|
BiggieJon |
I only recently heard of archiveteam and the projects |
23:00
🔗
|
BiggieJon |
trying to do what I can to help, but I'm not a programmer |
23:01
🔗
|
ersi |
That's great. Well, it's not really programming that's needed in this project at the moment.. It's just that it's currently unmaintained and not up and running. |
23:01
🔗
|
BiggieJon |
i c |
23:02
🔗
|
BiggieJon |
trying to learn more about how these projects run |
23:02
🔗
|
BiggieJon |
I am the hardware manager for a large web hosting company, trying to figure out if there is some way I could get some resources from my company |
23:03
🔗
|
ersi |
It's all mostly ad-hoc. But we're drifting further away from the "URLTeam sub-project" continuously. Maybe this is a discussion for #archiveteam or #archiveteam-bs where there's more souls :-) |
23:04
🔗
|
ersi |
(I saw you in there as well now) |
23:06
🔗
|
ersi |
To fill in a little: what this particular project needed when it was running was IPs to spread the requests across. URL shorteners are very restrictive against scrapers. |
23:06
🔗
|
ersi |
And discovering new URL shorteners and mapping how their URL-generating schemes are set up
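
Editorial sketch, not part of the log: to illustrate what "mapping a URL-generating scheme" and "generating tasks" could look like, the Python below assumes a hypothetical shortener that issues sequential codes over a 62-character alphabet (digits, lowercase, uppercase). The alphabet, its ordering, and the names `encode`, `decode`, and `task_ranges` are assumptions made for illustration; the log does not say which scheme any real shortener or the URLTeam tooling used.

```python
import string

# Assumed alphabet and ordering; a real shortener may use a different one.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode(n, alphabet=ALPHABET):
    """Turn a non-negative integer into a short code over the alphabet."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return alphabet[0]
    base = len(alphabet)
    chars = []
    while n > 0:
        n, rem = divmod(n, base)
        chars.append(alphabet[rem])
    return "".join(reversed(chars))


def decode(code, alphabet=ALPHABET):
    """Inverse of encode: turn a short code back into its integer."""
    base = len(alphabet)
    n = 0
    for ch in code:
        n = n * base + alphabet.index(ch)
    return n


def task_ranges(start, stop, size):
    """Split the ID space [start, stop) into half-open chunks of at most
    `size` IDs, so each chunk can be handed to a different worker or IP."""
    for lo in range(start, stop, size):
        yield lo, min(lo + size, stop)


if __name__ == "__main__":
    # Print the first and last code of each chunk in a small ID range.
    for lo, hi in task_ranges(0, 200, 62):
        print(encode(lo), "..", encode(hi - 1))
```

Splitting the ID space into independent chunks is one way to spread requests over many IPs, as described above; it only works if the shortener's codes really are enumerable in this fashion.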