#urlteam 2017-09-13,Wed

Time Nickname Message
01:23 🔗 hook54321 http://budurl.com/uukc
01:39 🔗 treyo has joined #urlteam
01:45 🔗 HCross has quit IRC (Read error: Connection reset by peer)
01:45 🔗 HCross has joined #urlteam
01:50 🔗 treyo has quit IRC (Quit: Page closed)
04:00 🔗 Somebody2 Sorry I've been out-of-contact for some time. Nice to see the go.usa.gov folks (treyo) speaking up, and being interested in archiving.
04:00 🔗 Somebody2 I wondered how long it would be before they noticed how hard we were banging on their server.
04:01 🔗 Somebody2 Re: x.vu -- I'll set it running for now; we might as well get what we can.
04:16 🔗 Somebody2 x-vu started
04:18 🔗 Somebody2 gg-gg hasn't gotten results lately; turning it off
04:19 🔗 Somebody2 go-usa-gov is still working fine; I'll keep it going till treyo (or someone else) actually asks us to turn it off, or provides a better alternative.
04:20 🔗 Somebody2 vgd_6 hasn't found anything recently, but there are only a million total results right now, so it's probably still fine.
04:21 🔗 Somebody2 x-vu has gotten some results
04:21 🔗 Somebody2 specifically, about 800 results so far
04:22 🔗 Somebody2 x-vu returns HTTP 410 sometimes; added that as a no-redirect expected result
04:23 🔗 Somebody2 boosting the queue to 30
04:24 🔗 Somebody2 Hm, I'm not sure if it's case sensitive or not
04:24 🔗 Somebody2 the two character ones do not seem to be
04:25 🔗 Somebody2 finished all of them, in any case
04:26 🔗 Somebody2 boosting queue to 60
04:29 🔗 Somebody2 Interestingly, all of these seem to redirect to a warning page on xdotvu.com
04:29 🔗 Somebody2 but the URL includes the real target, so it's good enough for our purposes
04:32 🔗 Somebody2 getting a few timeouts, but ... it's going away soon. So, queue up to 90
04:34 🔗 Somebody2 we seem to have about 90 total warriors right now; so I'll boost the queue to 100, that way everybody can join in
04:42 🔗 Somebody2 Finished the initial-digit-three-character ones.
04:42 🔗 Somebody2 Hm, bunch of errors, dropping queue down to 90
05:09 🔗 Somebody2 checked over 100,000; nearly 4,000 found.
05:46 🔗 Somebody2 cleared out the errors, now reloading the queue with 60
05:54 🔗 Somebody2 errors came back; draining queue, then will try 40
06:02 🔗 Somebody2 40 seems to work, trying 50
08:20 🔗 dashcloud has quit IRC (Read error: Operation timed out)
08:20 🔗 dashcloud has joined #urlteam
09:48 🔗 JAA Somebody2: As far as I can tell, the paid short (less than 3 characters) codes are case-insensitive, but the automatic six-character codes are case-sensitive. No idea about the ones you can set yourself in the advanced options.
10:10 🔗 JAA "terroroftinytown.client.errors.ScraperError: Number of attempts exceeded for 5708844300 (0-DHxu)." Hmm.
10:18 🔗 T31M has joined #urlteam
10:20 🔗 T31M has quit IRC (Leaving)
14:05 🔗 zhongfu_ has joined #urlteam
14:05 🔗 zhongfu has quit IRC (Ping timeout: 260 seconds)
15:19 🔗 Jonison has joined #urlteam
15:27 🔗 Jonison has quit IRC (Ping timeout: 260 seconds)
15:40 🔗 Somebody2 I turned *off* gg-gg...
15:42 🔗 Somebody2 x-vu has finished the 3-character ones
15:43 🔗 Somebody2 and it looks like go-usa-gov has blocked us
15:46 🔗 Somebody2 I'm trying to drain the queue on it, then we'll leave it off for a few days
15:47 🔗 astrid assholes
15:47 🔗 astrid :P
15:47 🔗 Somebody2 Eh, go-usa-gov is fine; they politely came in a day or so ago, and asked about setting up a bulk export instead.
15:48 🔗 Somebody2 I just thought I'd keep the scraper running till they actually told us to stop.
15:48 🔗 astrid yeah
15:51 🔗 Somebody2 eh, it doesn't seem to be draining; I'll just reset the autoqueue back to the last result, and clear it
15:51 🔗 Somebody2 that way it won't interfere with the other jobs
15:52 🔗 Somebody2 ok, going afk for the day
15:52 🔗 astrid toodleoo
20:26 🔗 svchfoo1 has quit IRC (Remote host closed the connection)
20:26 🔗 svchfoo3 has quit IRC (Remote host closed the connection)
20:27 🔗 svchfoo3 has joined #urlteam
20:27 🔗 svchfoo1 has joined #urlteam
20:29 🔗 JAA sets mode: +o svchfoo1
20:30 🔗 svchfoo1 sets mode: +o svchfoo3
20:39 🔗 astrid sets mode: +ooo joepie91_ HCross2 HCross
20:52 🔗 svchfoo3 has quit IRC (Remote host closed the connection)
20:53 🔗 Aoede has left WeeChat 1.9
20:53 🔗 svchfoo3 has joined #urlteam
20:54 🔗 svchfoo1 sets mode: +o svchfoo3
