[09:09] from the main channel [09:09] Have a userscript project I was told ye might like: https://gitorious.org/cguserscripts/unbitly [09:09] Hello all: looking to talk to someone in UrlTeam? :) [09:09] Online API keeps a cache of recent URLs, but I could come up with something cleaner if it could be integrated with UrlTeam's efforts, for a more permanent backup of cached URLs: https://cathalgarvey.pythonanywhere.com/unbitly/dump [09:09] It was designed as a privacy shiv, not an archivist solution, y'see. [09:10] I think this could be useful for discovery of more bitly urls [09:21] Hello all! [09:24] So, delighted to learn of UrlTeam: was pointed to it by someone after writing a bitly circumvention script [09:25] Wondering what, if anything, I can do to make the webcache end of the system more useful to the urlteam effort? [09:25] At present it's bitly only, but if I can convince a bunch of people to use it, it might be a nice constant means of harvesting real URLs without having to guess URLs (tipping off service to scrapers?) [09:26] And it's got a selling point for participants: it removes click-tracking so users not only assist urlteam, they get more privacy. :) [09:35] very tempting thought [09:37] If not a re-hack of my script, it's at least a thought for urlteam efforts: userscripts let you make cross-domain requests.. [09:37] ..so a userscript could be used to harvest real URLs and post them to an UrlTeam API trivially enough. [09:38] Of course it's best for the API to do resolution of that url itself, to prevent people spoofing, but it cuts down the keyspace for trusted urlteam scrapers at least. [09:48] right [10:00] Thinking of writing a script that scrapes all of Twitter's "most popular accounts" list for likely-to-be-big short urls.. [10:00] https://twitter.com/who_to_follow/interests [10:01] Few hundred accounts there, totally doable. Probably higher traffic than feasable using a dedicated account and API though [10:02] hm, sure [10:37] ..so is there any documentation out there on how to submit to UrlTeam? Is there an API somewhere I can use? [10:37] Likewise for fetching, though if the dumps are too large I can't host on pythonanywhere.. [10:42] http://urlte.am/ has torrent info and file lists [10:42] as well as data formats, etc.. [10:56] But there's no API I can direct a webapp to, at present? [10:56] i.e., users request stubs, webapp resolves them, then directs stubs and resolved URLs to an URLteam archive API? [11:14] the tracker is currently offline while we migrate it [11:14] let me grab the github url so you can see the interface [11:14] We tend to build stuff way faster than we document it [11:15] https://github.com/ArchiveTeam/tinyarchive [11:15] https://github.com/ArchiveTeam/tinyback [15:51] cathalgar: There's no API, no. Not currently at least :) [16:13] Hm [16:13] Cool, thanks [16:13] GTG, talk soon perhaps! :) [16:14] If you guys *do* want to scrape what URLs pop up on my webapp, you can get a database dump at cathalgarvey.pythonanywhere.com/unbitly/dump but it rolls over at ~1000 at present, dropping least recent URLs. [16:14] TTYL [16:45] neat