What does an "inverted spike" in "Requests per Second" mean?

Hi, I got my first test up and running and so far it's good, but I am seeing some strange behavior: sometimes the requests stop for a little while and then resume, which corresponds to the checks… Also, the http_req_duration max time is 16.85s…

Here is a peek of what I see:
[screenshot: Requests per Second chart showing the dips]

So, either the server is unresponsive and all the VUs are blocked waiting… or somehow the requests stop being emitted…

What would be a good way to figure this out? Probably this is not your first time seeing this pattern :wink:

Thanks in advance :slight_smile:

Hello!

To determine the cause, you will probably want to have a look at http_req_duration on the time series as well: chances are response times are higher during those dips, which would explain why the request rate drops (the VUs execute synchronously, so when they’re waiting on a response, they’re not sending any further requests). A drop like the one in your chart seems to suggest it is affecting all VUs. It’s probably also worth checking the more granular metrics (http_req_sending, http_req_waiting, http_req_receiving).
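
For example, a minimal script along these lines (URL and threshold values are just placeholders, not a recommendation) would put thresholds on those metrics, so a spike in any of them shows up in the summary as well as on the charts:

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 50,
  duration: '5m',
  thresholds: {
    // Make response-time spikes visible in the end-of-test summary too.
    http_req_duration: ['p(95)<2000', 'max<10000'],
    http_req_waiting: ['p(95)<1500'],   // time to first byte (server side)
    http_req_sending: ['p(95)<100'],    // time spent sending the request
    http_req_receiving: ['p(95)<500'],  // time spent downloading the body
  },
};

export default function () {
  const res = http.get('https://test.example.com/'); // placeholder URL
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);
}
```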

To rule out any problems on the load generators, it is probably worth checking CPU usage during those dips. If the CPU is maxing out, that could also explain why your request rate drops. This only applies if you're not seeing increased response times, though; increased response times would point straight at the network or something on the server side.

Do you see any unexpected status codes?


Thanks Tom!
It took me a while to reach a conclusion with all the tests… but in the end it was simply an issue with the system under test: the cluster becomes unresponsive for a few seconds and requests are not even received, they are put "on hold"…
So k6 is simply waiting for the VUs to get a response, hence the visualization…

Regarding the issue, it seems to be nginx-ingress / Kubernetes related and directly correlated with pod upscaling, but no root cause is clear yet…

I managed to execute the test with a fixed number of pods (no upscaling happening) and everything worked fine, which ruled out any possible issue with the local k6 execution. I also added a regular probe to other systems to confirm they were working fine. They were.
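
A rough sketch of that kind of setup (placeholder endpoints, not the actual script) would be a second low-rate scenario running next to the main load:

```javascript
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  scenarios: {
    // Main load against the system under test.
    main_load: {
      executor: 'constant-vus',
      exec: 'mainLoad',
      vus: 50,
      duration: '10m',
    },
    // Low-rate probe against an independent system: if this one also dips,
    // the problem is likely local (k6 host or network); if only the main
    // load dips, the cluster is the suspect.
    health_probe: {
      executor: 'constant-arrival-rate',
      exec: 'probe',
      rate: 1,            // one probe per second
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 2,
    },
  },
};

export function mainLoad() {
  http.get('https://system-under-test.example.com/'); // placeholder
}

export function probe() {
  const res = http.get('https://other-system.example.com/health'); // placeholder
  check(res, { 'probe ok': (r) => r.status === 200 });
}
```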

Also posted the issue at the Kubernetes community, here.

And k6 stayed performant and operational, like a champion :wink:

No unexpected status codes though…

Hi @joslat,

If the old pods are being dropped, it could be that k6 is still trying to use them due to DNS caching. It doesn't look like it, as k6 by default caches DNS for 5 minutes and you seem to have 2 dips in less time than that. But you can set the TTL to 0 in the k6 config and see if that helps.
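
Something along these lines in the options should do it (assuming a k6 version recent enough to have the dns option):

```javascript
export const options = {
  dns: {
    ttl: '0',              // disable caching: re-resolve on every lookup
    select: 'first',       // which of the resolved IPs to use
    policy: 'preferIPv4',  // IP family preference
  },
};
```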

Hi @mstoykov, thanks for the insights!!
The old pods are still alive, no recycling/restarting is happening… it seems all of them suddenly become unresponsive for a reason we still have to figure out… at the moment we are trying to isolate the cause in the NGINX ingress controller…
The issue is definitely outside k6: if we warm up the pods, no autoscaling happens because they are all already running, and then the test runs flawlessly.
