[00:08] I think the myspace urls might be done for the moment we got some 500s
[00:09] maybe not done but a lot of errors
[00:11] JAA you have any idea what that error means?
[00:11] Flashfire: Have an example URL?
[00:12] There are the errors in the error reports on the tracker
[00:13] I dont know what they mean
[00:13] Yeah, but I don't have my tracker password handy.
[00:14] I can change the password for you just DM me what you want it to be
[00:14] Nah
[00:15] Then I have to remember to update it in my password manager afterwards etc.
[00:15] I can take a look in a bit.
[00:21] ok
[00:33] http://ur.ly/
[00:35] https://shorteners.net/
[00:37] Flashfire: I don't see any 500 errors on mysp-ac, only timeouts.
[00:37] ok
[00:37] And we are still finding results.
[00:39] Oh wait, I was looking at the wrong page.
[00:39] Yeah, not many results recently for mysp-ac.
[00:40] Not sure what that means though.
[00:40] https://gram.im/h2PLR1gg
[00:40] New one I found
[00:41] Captchas yuck
[00:41] And 8-char codes are too long to bruteforce anyway.
[00:41] http://tmearn.com/zIMfR
[00:42] Connection reset for me for the url above
[00:43] JAA what do you get?
[00:43] http://oke.io/B5DqMWy
[00:44] Flashfire: For tmearn.com? Works fine here.
[00:44] Ok
[00:44] https://lyonkim.net/T7vHyb
[00:46] http://zlshorte.net/NkSKYuQI
[00:46] https://adpop.me/yEaDc0pl
[01:07] JAA I am not doing too bad with puri.na I had a hunch and a bit of proof and rolled with it so far found 40 urls that are different to the standard bitly ones
[01:37] *** Zerote has quit IRC (Ping timeout: 260 seconds)
[01:44] *** Jusque has quit IRC (Ping timeout: 600 seconds)
[02:12] *** Jusque has joined #urlteam
[02:46] *** ivan has quit IRC (Write error: Broken pipe)
[02:47] *** kiska1 has quit IRC (Read error: Operation timed out)
[02:47] *** Mayonaise has quit IRC (Read error: Operation timed out)
[02:47] *** ivan has joined #urlteam
[02:48] *** TigerbotH has quit IRC (Read error: Operation timed out)
[02:48] *** Mayonaise has joined #urlteam
[03:36] *** odemg has quit IRC (Ping timeout: 615 seconds)
[03:41] chfoo: pushed a commit to the PR and commented
[03:43] *** odemg has joined #urlteam
[03:44] yeet
[03:45] Flashfire: the party can begin i think
[03:45] Excellent
[03:45] i'll have to update the warrior script and tracker
[03:45] huh
[03:45] Alright then that shouldnt take too long though right?
[03:45] chfoo: can you explain what needs to be updated?
[03:45] *** kiska1 has joined #urlteam
[03:46] update the submodule for warrior, then update the tracker so it shows the custom code warning
[03:46] should take only a few minutes
[03:46] oh are those things that cant be done with normal web ui admin access?
[03:46] i mean the submodule update yeah
[03:49] i'm trying to figure out how to streamline and document this process enough so that we can cut down on time from shortener discovery to full rollout of the custom code with all configurations set
[03:50] *** TigerbotH has joined #urlteam
[03:51] the tracker part is optional i think, but updating the submodule can be done by someone with access though
[03:52] to the github repo, i.e. pull request
[03:53] ok, done
[03:53] oh, update the min version in the tracker so it doesn't try to use it without the custom code
[03:54] What is the min version chfoo?
[03:55] 52
[03:59] Flashfire: oh and before you enable it
[03:59] "get", not "head"
[04:00] "head" will return you 405 Method Not Allowed
[04:02] I am currently on an unstable internet and while I would love to get the party started my siblings are pestering me to go over to the pool and the internet keeps dropping out. So I will keep that in mind and start it soon. Unless you want to laugh like a maniac and flip the switch
[04:03] laughing like a maniac and flipping the switch is my daily driver
[04:04] chfoo: everything looks good i think?
[04:05] i think so
[04:05] set min version to 52, http method to get and redir status codes to 302 because thats what google uses
[04:07] omgididit
[04:12] hmm i think something is b0rk
[04:13] terroroftinytown.client.errors.UnhandledStatusCode: Unknown status code 301 for 'http://goo.gl/k'
[04:13] looks like they are /actually/ using 301s
[04:13] huh
[04:14] Set up both of them as redirects?
[04:14] yeah
[04:14] lets see, this should work now
[04:14] I got on a bit more stable connection only problem I’m on mobile now lol
[04:14] and indeed it does
[04:15] 170/170 found and scanned
[04:15] Nice lol
[04:17] oh god no
[04:17] Fusl if you want to look at fascinating you can look at puri.na vs bit.ly both bitly URLs but over 40 short codes don’t match up so far
[04:17] > http://goo.gl/{shortcode}
[04:17] this shouldve been https
[04:17] Oh fuck
[04:17] hold on
[04:18] http redirects to https
[04:18] thats why we were seeing 100% found
[04:18] Oh shit
[04:18] I can’t find a box to turn it off🤔
[04:18] i did
[04:19] thats also the reason we were seeing http 301
[04:19] because i knew 301s arent used
[04:19] but i never tested the http->https redirect
[04:20] there, all fixed now
[04:30] Fusl I thought we agreed on a reduced queue to avoid a ban
[04:32] Flashfire: you mean getting banned as a whole or per scraper?
[04:32] both
[04:33] i dont think they give many fugs about us as a whole, same as they didnt for G+
[04:33] and per scraper, 50 is so much below the threshold that triggers their rate limit
[04:35] but tbh idc in the end, we can limit it back down to 10/5
[04:41] Flashfire: lmk if you want me to set it down to 10/5
[04:41] Nah it should be fine if you are confident
[04:46] 04:17 <@Flashfire> Fusl if you want to look at fascinating you can look at puri.na vs bit.ly both bitly URLs but over 40 short codes don’t match up so far
[04:46] can you elaborate on that?
[04:47] bit.ly/1 vs puri.na/1 is the first example
[04:47] yeah? they give different results
[04:47] Exactly most low level bitly urls dont do that
[04:47] they are both bitly urls
[04:48] oh, as far as i know you can configure that
[04:48] like, you can have a custom bitly domain and set it to resolve standard bitly stuff or dont
[04:48] s/dont/not/
[04:49] or at least you used to be able to do that
[05:00] soon arriving at 4 char urls so we'll see more than 0-1 found per batch
[05:02] actually nevermind, i think were doing uppercase chars still
[05:02] yeah
[05:11] Fusl I am still keeping it at that low rate to avoid a ban
[05:26] rtsoftware.systems added to tracker however I am not sure I trust this connection I am on enough to make a big edit to the wiki
[05:32] finally arrived at the end of 3 char urls, for real this time
[05:33] Hey Fusl if you want to add rtsoftware.systems to the URLteam warrior section that would be brilliant
[05:34] "rtsoftware.systems". at this point url shorteners dont really make sense anymore LOL
[05:35] I mean I agree but someone made it
[05:48] added, hopefully correct syntax but seems ok
[05:48] (i really hate mediawiki lol)
[05:51] Yeah thats about right missing the pink bit but thats no huge biggie
[05:53] FUSL
[05:53] We getting rate limited
[05:53] 500
[05:53] Or maybe not
[05:53] a server error though
[05:53] 500 are just server errors
[05:53] everythings fine
[05:53] Ok
[05:54] Ill just add those back to the queue then
[05:54] everything. is fine. http://xor.meo.ws/d5181c52/c797/4038/be34/3ced62acc345.png
[05:54] I am returning the claims to the queue that encountered server errors
[05:54] unless you want to edit the response codes for the 500s
[05:55] nah, we shouldnt ignore 500s
[05:55] and those return automatically in 30 minutes anyways i think
[05:56] Yeah they do but to speed it up sometimes I just do the releasing claims earlier if they error
[05:56] well technically we can set the auto release to 10 minutes
[05:57] right now items are flying in after around 45 seconds
[05:57] Fusl are you using IPV6?
[05:57] nope
[05:57] Check the error reports for bitly_6
[05:58] oof what the hell is that
[05:58] I dont know
[06:16] Fusl it doesnt look like the google thing is sorting itself out
[06:18] Flashfire: it does do a great rate though
[06:19] actually, it does not
[06:19] huh
[06:19] Did it all get blocked up by the 500s?
[06:19] seems so
[06:20] i think including 500 in the banned status field will cause the client to retry 10 times
[06:20] so we might be able to use that
[07:08] Flashfire: looks like 500 errors are more than just temporary errors
[07:09] https://goo.gl/1jJ3
[07:09] this one for example throws a constant 500 error
[07:10] so i'm thinking about adding 500 to the list of no redirect or unavailable status codes
[07:10] chfoo?
[07:19] Sorry I was collecting fire wood Fusl I am not sure what difference it would make which one you added it to as long as it was no redirect or unavailable
[07:19] what were you doing to get rid of the errors before?
[07:20] please dont tell me it was delete job
[07:20] delete job
[07:20] Fuck
[07:20] nah jk, i clicked on delete error on each of em
[07:21] that means anything in that bracket of 50 wasnt grabbed
[07:21] well at least before that
[07:21] i havent deleted any errors since we are in the 4 char group
[07:21] just lowered the auto requeue from 30m to 10m
[07:21] ok
[07:22] alright
[07:22] imma add 500 to unavailable
[07:22] Ok
[07:22] wait so only some of the 500s permanent?
[07:22] Some were temporary?
[07:23] all of them seem to be permanent
[07:23] well
[07:23] a very very very small percentage of them are temporary
[07:23] whats the numbers?
[07:23] ec
[07:23] sec
[07:23] 2 in 164 500s were temporary
[07:24] hmm I think we can deal with those losses
[07:25] chfoo: what do you think? i added 500 to unavailable status codes now so they are ignored. should we keep it like that or should we rather fail the items?
[07:25] im gonna keep an eye on what the current 500s will do in a few hours, maybe they are temporary but just take longer to recover
[07:32] so this, if im reading it correctly, should delete old errors every 5 mins https://github.com/ArchiveTeam/terroroftinytown/blob/1f5e121f85e734b71b06ebd69cf5131b8b8f51d3/terroroftinytown/tracker/app.py#L95-L111
[07:32] by calling this function https://github.com/ArchiveTeam/terroroftinytown/blob/1f5e121f85e734b71b06ebd69cf5131b8b8f51d3/terroroftinytown/tracker/model.py#L521
[07:32] which limits the delete to 1 error every 5 minutes
[07:33] which seems to be correct since one error just disappeared from the dashboard
[07:33] yay i can read code
[07:42] i'm sorry i had to https://atdash.meo.ws/d/urlteam/urlteam-stats?from=now-30m
[07:48] *** Zerote has joined #urlteam
[08:18] terroroftinytown.client.errors.UnhandledStatusCode: Unknown status code 530 for 'https://is.gd/1CazbD'
[08:18] what the fuck is a 530
[08:20] Oh fuck
[08:20] Is.gd may have enabled cloudflare
[08:20] FUSL CHFOO SOMEBODY2 JAA KAZ HOOK54321
[08:20] can any of you confirm this?
[08:20] its behind CF but that url is giving a 404 here
[08:21] no need to ping everyone though :)
[08:21] > Error 1016 / Error 530 indicates Cloudflare is unable to send requests to your server because its origin IP cannot resolve the A or CNAME DNS record requested.
[08:23] I am turning off autoqueue for the moment
[08:23] I dont think it was behind cloud flare before
[08:24] its been behind CF since 2017-05-26
[08:25] ok then
[08:25] I am bad at research thats a given
[08:25] but the error of 530 I have no idea what threw it
[08:25] Or these IPV6 errors on bitly_6
[08:27] been trying to figure that out since you mentioned that earlier
[08:27] but cant seem to reproduce this
[08:29] No idea either cause I cant read code
[08:46] yeah absolutely no idea whats happening there
[09:24] *** seatsea has quit IRC (Read error: Connection reset by peer)
[09:39] still no luck in reproducing this
[09:40] im just giving up at this point
[11:10] *** seatsea has joined #urlteam
[11:17] *** seatsea has quit IRC (Quit: The Lounge - https://thelounge.github.io)
[11:18] Re IPv6: I haven't looked at it, but could it be misconfigured machines which have IPv6 enabled and resolve AAAA records but can't route IPv6 traffic?
[11:19] *** seatsea has joined #urlteam
[11:26] JAA: bit.ly doesnt have ipv6 records
[11:27] Right
[11:28] What's the error? Can't see any on the tracker right now.
[11:30] "ValueError: Invalid IPv6 URL"
[11:32] in main.py fetch_url line 86, the call to `requests.head`
[12:22] *** seatsea has quit IRC (Quit: The Lounge - https://thelounge.github.io)
[14:13] *** seatsea has joined #urlteam
[14:13] *** seatsea has quit IRC (Client Quit)
[15:08] *** seatsea has joined #urlteam
[15:14] < location: http://www.google.com/sorry/media/players/not/allowed/
[15:14] thats a new one lol
[18:25] *** ivan has quit IRC (Read error: Operation timed out)
[18:25] *** JAA has quit IRC (Read error: Operation timed out)
[18:25] *** Mayonaise has quit IRC (Read error: Operation timed out)
[18:25] *** ivan has joined #urlteam
[18:25] *** Mayonaise has joined #urlteam
[18:26] *** lunik1 has quit IRC (Read error: Operation timed out)
[19:26] *** lunik1 has joined #urlteam
[19:29] *** JAA has joined #urlteam
[19:29] *** bakJAA sets mode: +o JAA
[19:29] *** Fusl sets mode: +o JAA
[20:25] is everything ok?
[20:45] *** Zerote has quit IRC (Ping timeout: 260 seconds)
[21:07] *** Zerote has joined #urlteam
[22:19] *** Jens has quit IRC (Remote host closed the connection)
[22:20] *** Jens has joined #urlteam