#newsgrabber 2017-11-14,Tue



Who What When
*** gazpaxo has quit IRC (Quit: Leaving) [00:10]
.... (idle for 18mn)
*** blitzed has quit IRC (Quit: Leaving) [00:28]
..... (idle for 23mn)
*** Aerochrom has quit IRC (Quit: Leaving) [00:51]
...................... (idle for 1h46mn)
<arkiver> how is this going? [02:37]
<JensRex> Moderately well I'd say.
The dedupe CDN was a brilliant idea.
Could use some more grabbers though.
[02:46]
.............................................. (idle for 3h45mn)
<HCross2> arkiver: could we become default project again? [06:34]
....................................................................................... (idle for 7h10mn)
<jrwr> HCross2: how's the dedupe CDN doing? [13:44]
<HCross2> Holding well, but not everyone is using it [13:45]
<jrwr> understandable
there is an API for Bunny; if you want, I can add it in for some stats on the dedupe box
bot*
let me know if the CDN can use any headers, I can send some with the dedupe backend
oh and this https://bunnycdn.docs.apiary.io/#reference/0//api/purge
so I can purge keys when they get updated on the backend
that would be better
[13:45]
<HCross2> Ooh, yep [13:48]
<jrwr> PM me an API key and I'll make it so [13:48]
* jrwr opens screen while it's in the middle of a dedupe
NOPE
not doing that many requests to bunnycdn
that would kill it
[13:57]
<HCross2> I'd have poor Dejan onto me as fast as anything [14:06]
<jrwr> I sent a support email asking what do
I might be able to set headers for 24hr
[14:07]
<HCross2> Let Dejan know you're working with me, I'll prewarn him [14:10]
<jrwr> I sent a support email from email@jrwr.io
Forrest Fuqua
I did say I was from the newsgrabber project
I would /love/ to get an inside man with cloudflare for stuff like this
[14:11]
<Kaz> Filippo?
he drops in and out of #archiveteam (and -bs)
[14:16]
<jrwr> Ah [14:19]
....... (idle for 34mn)
<jrwr> arkiver: you around? Can I get a sanity check on my math: https://hastebin.com/yiseponivi.pl $reqsec is coming up strange
1510671129 -- 38262654 -- 119611 -- 99
from that middle echo
[14:53]
........ (idle for 37mn)
<JensRex> Do we have some shady deal with bcdn? Isn't shoveling traffic over there what you're supposed to do otherwise? [15:30]
<jrwr> we are doing a good bit of req/s [15:30]
220 req/s [15:36]
<HCross2> JensRex: kinda yes. It's called "I'm friends with the owner" [15:39]
<jrwr> hehe
tell him to hop on to IRC :)
I like how their API is run on ASP/IIS
[15:43]
<HCross2> We mock him mercilessly on discord for that [15:53]
<jrwr> You have a discord, me want :)
We can find new and exciting ways to break shit
[15:56]
..... (idle for 20mn)
<JensRex> HCross2 / arkiver: If we're going to be default project again, the youtube-dl situation should be addressed first.
(or *-video jobs put on hold.)
[16:17]
*** blitzed has joined #newsgrabber [16:27]
<HCross2> jrwr: Dejan just messaged me.. "uhm.. do I respond to this guy" [16:40]
<JensRex> It's a trap.
Get an axe.
[16:41]
<jrwr> lol [16:42]
..... (idle for 24mn)
<HCross2> jrwr: "i think it wouldn't be very practical to purge 400k urls"
jrwr: your plan would cause BunnyCDN to open 9.2 million threads
[17:06]
<jrwr> HCross2: ask about expire times and headers [17:16]
<HCross2> jrwr: I can set expiry time. What did you want to know about headers? [17:20]
<jrwr> oh if I could set expire time headers
Cache-Control: max-age=
[17:23]
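The header jrwr is asking about could be emitted by the dedupe backend roughly as follows. This is a minimal sketch, assuming the CDN honors the origin's Cache-Control header; the function name `cache_control_header` is hypothetical, and the 24-hour value is taken from the earlier "set headers for 24hr" remark rather than from any confirmed setting.

```python
from datetime import timedelta
from typing import Tuple


def cache_control_header(ttl: timedelta) -> Tuple[str, str]:
    """Return a (name, value) header pair asking caches to keep a response for ttl."""
    seconds = int(ttl.total_seconds())
    if seconds < 0:
        raise ValueError("ttl must not be negative")
    # max-age is expressed in whole seconds.
    return ("Cache-Control", f"max-age={seconds}")


# Hypothetical usage: a 24-hour lifetime for dedupe lookups.
name, value = cache_control_header(timedelta(hours=24))
print(f"{name}: {value}")  # Cache-Control: max-age=86400
```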
.... (idle for 16mn)
<HCross2> https://usercontent.irccloud-cdn.com/file/4Hrxcnl6/Screenshot_20171114-173911.png
Like this?
[17:40]
...... (idle for 27mn)
<jrwr> Ya
Set it to 8 hours
[18:07]
<HCross2> I can't
https://usercontent.irccloud-cdn.com/file/Jqpe4fm7/Screenshot_20171114-181014.png
[18:10]
<jrwr> 3 hours it is
well... set it to a day
that should do
[18:12]
..... (idle for 24mn)
<arkiver> jrwr: I'll have a look [18:36]
<jrwr> it's fine, I think it's the API server returning strange data [18:36]
........... (idle for 52mn)
<arkiver> 2000 req/s?
damn
[19:28]
<jrwr> 'Magyar, Keith' <keith.magyar@alliedmotion.com>; Makai, Tim <tim.makai@alliedmotion.com>
damn copy paste
2299.1 r/s
I wonder if thats correct
[19:41]
<arkiver> me too [19:47]
<jrwr> it's right from the API
just taking samples
[19:47]
<arkiver> items/hour on the dashboard doesn't seem much higher [19:47]
<jrwr> interesting [19:48]
<arkiver> maybe different servers communicate
causing a higher number of requests?
[19:49]
<jrwr> hrm
1510681614 -- 39302440 -- 1159397 -- 905
[19:50]
<arkiver> what are those numbers [19:50]
<jrwr> 1510681614 -- 39302440 -- 1159397 -- 905
so the first number is the timestamp, then the second is the number of requests the API returned
1510682520 -- 39436566 -- 1293523 -- 906
so thats a snapshot
39436566 - 39302440 = 134126
906 seconds between requests
[19:50]
<arkiver> that sounds about right [19:52]
<jrwr> 148.0419426048565 [19:52]
<arkiver> yeah [19:52]
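The arithmetic above can be reproduced from the two pasted snapshots. A minimal sketch, assuming the second column is a cumulative request counter as jrwr describes; the names `Snapshot` and `requests_per_second` are illustrative and do not come from the bot's code.

```python
from typing import NamedTuple


class Snapshot(NamedTuple):
    timestamp: int       # Unix time of the sample, in seconds
    total_requests: int  # cumulative request count reported by the API


def requests_per_second(prev: Snapshot, curr: Snapshot) -> float:
    """Average request rate between two snapshots of a cumulative counter."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("snapshots must be in increasing time order")
    return (curr.total_requests - prev.total_requests) / elapsed


# The two samples pasted in the channel.
first = Snapshot(1510681614, 39302440)
second = Snapshot(1510682520, 39436566)

rate = requests_per_second(first, second)
print(f"{rate:.1f} req/s")  # 148.0 req/s
```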
<jrwr> my bot said [DEDUPE STATUS] 4 threads || Records Added: 565,253 || DB Size: 20.12GB || Load: 71% || [CDN] 1281.1 r/s 90% hit 15.2G BW || EU: London, UK: 90% EU: Frankfurt, DE: 7%
I'll do some digging tomorrow
[19:53]
<arkiver> are you using https://hastebin.com/yiseponivi.pl [19:53]
<jrwr> https://hastebin.com/emimawozus.xml
full IRC bot
has that same code
[19:54]
<arkiver> ok [19:54]
<jrwr> FOUND IT
$amountreq = $response["TotalRequestsServed"] - $lasthits;
$lastreq = $response["TotalRequestsServed"];
named it wrong
[19:54]
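In the two pasted lines the delta is taken against `$lasthits` while the new total is stored in `$lastreq`, so, judging from those lines alone, the baseline never advances and every reading is measured from the first sample. A sketch of the corrected bookkeeping follows, written in Python rather than the bot's own language; the class name `RateSampler` is hypothetical and the bot's real structure is not shown in the log.

```python
from typing import Optional


class RateSampler:
    """Turns readings of a cumulative request counter into per-interval rates."""

    def __init__(self) -> None:
        self.last_total: Optional[int] = None
        self.last_time: Optional[int] = None

    def sample(self, timestamp: int, total_requests: int) -> Optional[float]:
        """Record a reading; return req/s since the previous reading, if any."""
        rate = None
        if self.last_total is not None and timestamp > self.last_time:
            rate = (total_requests - self.last_total) / (timestamp - self.last_time)
        # Store the new baseline under the same names the delta reads from.
        self.last_total = total_requests
        self.last_time = timestamp
        return rate


sampler = RateSampler()
sampler.sample(1510681614, 39302440)         # first reading: nothing to compare against
rate = sampler.sample(1510682520, 39436566)  # second reading from the channel
print(f"{rate:.1f} req/s")  # 148.0 req/s
```

For comparison, the third and fourth columns of the pasted sample (1159397 and 905) divide out to about 1281.1, the same figure the bot reported, which is consistent with a baseline that was never updated.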
<arkiver> good
so at 148 req/s we don't seem to be much faster
it probably takes load off your server though
[19:55]
<jrwr> the req itself was not a pain point, SSDB is damn fast
it's all the writes slowing everything down
[19:56]
<arkiver> i see [19:56]
<jrwr> millions and millions of keys
I think I have about 10-20 billion keys in there
[19:56]
<arkiver> that's quite some keys [19:57]
<jrwr> I have no way of telling really
but I can tell you its been adding a shitton
[19:57]
<arkiver> :)
i see that yeah
90% hit is pretty nice too
[19:58]
<jrwr> ya
thats a stat we didn't have before
[20:00]
<arkiver> higher than i expected [20:00]
<jrwr> tor uses more CPU than ssdb does
lol
tor is pegged at about 20%, ssdb is 15%
[20:02]
<arkiver> pretty go
good
* arkiver is afk
[20:15]
