[00:48] *** aaaaaaaa_ has joined #urlteam
[00:48] *** aaaaaaaaa has quit IRC (Read error: Connection reset by peer)
[00:48] *** swebb sets mode: +o aaaaaaaa_
[00:49] *** aaaaaaaa_ is now known as aaaaaaaaa
[01:13] *** dashcloud has quit IRC (Remote host closed the connection)
[01:14] *** aaaaaaaa_ has joined #urlteam
[01:14] *** aaaaaaaaa has quit IRC (Read error: Connection reset by peer)
[01:14] *** swebb sets mode: +o aaaaaaaa_
[01:18] *** aaaaaaaa_ has quit IRC (Read error: Connection reset by peer)
[01:18] *** aaaaaaaa_ has joined #urlteam
[01:18] *** swebb sets mode: +o aaaaaaaa_
[01:30] *** aaaaaaaaa has joined #urlteam
[01:30] *** aaaaaaaa_ has quit IRC (Read error: Connection reset by peer)
[01:30] *** swebb sets mode: +o aaaaaaaaa
[01:57] *** Fletcher has joined #urlteam
[02:35] *** svchfoo3 has joined #urlteam
[02:35] *** svchfoo1 sets mode: +o svchfoo3
[03:42] *** VADemon has quit IRC (left4dead)
[04:15] *** aaaaaaaaa has quit IRC (Leaving)
[05:00] *** Ctrl-S has joined #urlteam
[07:11] *** JesseW has joined #urlteam
[07:24] *** JesseW has quit IRC (Leaving.)
[08:38] *** zhongfu_ is now known as zhongfu
[09:51] *** chazchaz has joined #urlteam
[12:32] *** Smiley has quit IRC (Read error: Operation timed out)
[12:32] *** Coderjoe_ has quit IRC (Read error: Operation timed out)
[12:32] *** Coderjoe has joined #urlteam
[12:34] *** HCross has quit IRC (Read error: Operation timed out)
[12:41] *** Smiley has joined #urlteam
[12:42] *** Silvan has joined #urlteam
[12:43] *** dashcloud has joined #urlteam
[12:43] *** svchfoo3 sets mode: +o dashcloud
[12:45] *** SilSte has quit IRC (Ping timeout: 606 seconds)
[12:49] *** xmc has quit IRC (Ping timeout: 606 seconds)
[12:52] *** xmc has joined #urlteam
[12:52] *** swebb sets mode: +o xmc
[13:01] *** HCross has joined #urlteam
[16:10] *** SimpBrain has quit IRC (Ping timeout: 615 seconds)
[16:22] *** SimpBrain has joined #urlteam
[17:03] *** zerkalo has quit IRC (Write error: Broken pipe)
[17:04] *** phuzion has quit IRC (Read error: Operation timed out)
[17:04] *** phuzion has joined #urlteam
[17:04] *** Domin- has joined #urlteam
[17:05] *** Domin_ has quit IRC (Read error: Operation timed out)
[17:06] *** atlogbot has quit IRC (Read error: Operation timed out)
[17:06] *** chazchaz has quit IRC (Read error: Operation timed out)
[17:07] *** Coderjoe has quit IRC (Read error: Operation timed out)
[17:07] *** svchfoo1 has quit IRC (Read error: Operation timed out)
[17:08] *** zerkalo has joined #urlteam
[17:11] *** SimpBrain has quit IRC (Read error: Operation timed out)
[17:12] *** SimpBrain has joined #urlteam
[17:18] *** Coderjoe has joined #urlteam
[17:33] *** atlogbot has joined #urlteam
[17:33] *** svchfoo1 has joined #urlteam
[17:33] *** svchfoo3 sets mode: +o svchfoo1
[17:45] *** chazchaz has joined #urlteam
[18:43] *** aaaaaaaaa has joined #urlteam
[18:43] *** swebb sets mode: +o aaaaaaaaa
[18:59] *** JesseW has joined #urlteam
[19:33] *** bzc6p has joined #urlteam
[19:33] *** swebb sets mode: +o bzc6p
[19:34] So, in response to JesseW
[19:35] As far as I know, the URLTeam v2 project is handled by chfoo. He keeps track of what has been done, and I think he is the one who regularly exports, or if it is automated, then he is the one who takes care of its regular operation.
[19:36] AFAIK again, the URL shortener services are re-scraped periodically, every so often. That is, if we scraped e.g. 1r.hu in Dec 2014, it will be scraped again in a year or two.
[19:36] ah, I didn't realize they were intermittently turned on
[19:37] I knew chfoo was the maintainer
[19:37] Regarding adding new shorteners: I think chfoo regularly selects some new ones from the wiki list and inputs them into the tracker when there is not enough stuff to do.
[19:38] I bet that's the case, because 1r.hu was added to the wiki by me, and I did nothing to get it scraped by URLTeam.
[19:38] ah, that makes sense
[19:38] hm, I'll look in the wiki to see when you added it.
[19:38] So if one wants to add new shorteners, they should just add them to the wiki, and they will be seen and added one day, I guess.
[19:39] I'll also add a column to the table specifying when various ones were scraped, which should make it easier to see that there is progress
[19:39] thanks for the answers, bzc6p!
[19:40] Regarding urlte.am, it is probably held by SketchCow. It does indeed need an update.
[19:43] It seems like it would be better to just redirect it to the wiki
[19:45] One should be very careful when poking SketchCow with "Please update this".
[19:46] So I think I answered all the questions. If I was wrong, I'm glad to be corrected.
[19:46] It seems I added 1r.hu to the wiki in August 2014.
[19:51] hm
[19:51] yep, I think you answered all my questions. :-)
[19:57] *** Smiley has quit IRC (Remote host closed the connection)
[19:58] here's the distribution of dumps by project: http://0bin.net/paste/Zq93JzmZhY0YRAZ9#wtNYGOcDE+1rj5X29+pivT5XFCAO4quGKap1fW9pPQG
[20:00] i.e. how many dumps contain data from each project. there are 10 that are only in one dump, 385 total dumps (I don't have a few downloaded, because the IA's torrent-generator is borken)
[20:00] isgd_6 is in 367 of the 385 dumps.
[20:04] *** bzc6p_ has joined #urlteam
[20:04] *** swebb sets mode: +o bzc6p_
[20:05] *** bzc6p has quit IRC (Read error: Operation timed out)
[20:08] *** bzc6p_ has left
[20:10] *** Smiley has joined #urlteam
[20:40] *** VADemon has joined #urlteam
[20:42] *** Start has quit IRC (Read error: Connection reset by peer)
[20:43] *** Start has joined #urlteam
[21:41] *** JesseW has quit IRC (Leaving.)
[22:44] *** Atluxity has joined #urlteam
[22:44] is the urlteam throttled to ~380 scans per second?
[23:04] *** JesseW has joined #urlteam
[23:42] JesseW: do you know?
[23:43] what?
[23:43] is the urlteam throttled to ~380 scans per second?
[23:43] I don't know, sorry
[23:43] ok, ty
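
Editor's note on the [19:58]-[20:00] messages: the "distribution of dumps by project" is a count, per project, of how many dumps contain data from it. The log does not show how the numbers were produced, so the following is only a minimal sketch under stated assumptions: the dump listing is assumed to be already available as a mapping from dump name to the set of project names it contains, and the dump names and the projects `proj_a` and `proj_b` are hypothetical placeholders (only `isgd_6` appears in the log).

```python
from collections import Counter


def dumps_per_project(dump_contents):
    """Count, for each project, how many dumps contain data from it.

    dump_contents maps a dump name to an iterable of project names
    found in that dump (an assumed input shape, not from the log).
    """
    counts = Counter()
    for projects in dump_contents.values():
        # set() ensures a project is counted at most once per dump
        counts.update(set(projects))
    return counts


def projects_in_single_dump(counts):
    """Return the sorted names of projects that appear in exactly one dump."""
    return sorted(name for name, n in counts.items() if n == 1)


# Hypothetical toy data; real dump and project names would differ.
example = {
    "dump-1": {"isgd_6", "proj_a"},
    "dump-2": {"isgd_6"},
    "dump-3": {"isgd_6", "proj_b"},
}

counts = dumps_per_project(example)
single = projects_in_single_dump(counts)
print(counts["isgd_6"], "of", len(example), "dumps contain isgd_6")
print("projects found in only one dump:", single)
```

With real data, the same two functions would yield figures of the kind quoted in the channel (a project's dump count out of the total, and the number of projects seen in only one dump).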