Metrics from k6 into influx are wrong (?)

Hey!

I’m running k6 through the k6 operator (sometimes with multiple pods).
I’ve noticed that the http_reqs metric reported into influxDB seems a bit off.

From this query I see an average of roughly 1.21k rps:

SELECT sum("value") /($__interval_ms / 1000) 
FROM "http_reqs" 
WHERE $timeFilter 
GROUP BY time($__interval), "job_name" 
fill(null)

another strange thing with this metric: when I apply the mean() function to it, the result is always 1
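A minimal sketch of one possible explanation for the mean() result. It assumes k6 writes one http_reqs point per HTTP request, each with value 1. That assumption is consistent with the mean being exactly 1, but it is not confirmed anywhere in this thread. The sample points are made up for illustration.

```javascript
// Hypothetical points, as if one http_reqs point with value 1 were written per request.
const points = [
  { timeMs: 0, value: 1 },
  { timeMs: 250, value: 1 },
  { timeMs: 500, value: 1 },
  { timeMs: 900, value: 1 },
  { timeMs: 1200, value: 1 },
];

// Equivalent of InfluxQL sum("value") over one bucket.
function sum(values) {
  return values.reduce((acc, v) => acc + v, 0);
}

// Equivalent of InfluxQL mean("value") over one bucket; null for an empty bucket.
function mean(values) {
  return values.length === 0 ? null : sum(values) / values.length;
}

const values = points.map((p) => p.value);
const total = sum(values); // counts the requests in the bucket
const avg = mean(values);  // averages the per-point values, not the request rate
console.log(total, avg);
```

Under that assumption, sum() counts requests while mean() only averages the constant per-point value, so it carries no rate information.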

the metrics I gather from the gateway match what the scenario should produce

this is the scenario I’m using (it is stringified so I could change scenarios without changing the js code):

{
  "soak": {
    "executor": "ramping-arrival-rate",
    "startRate": 0,
    "preAllocatedVUs":100,
    "timeUnit":"1s",
    "maxVUs":10000,
    "stages": [
      { "target": 2000, "duration": "30s" },
      { "target": 2000, "duration": "2m" }
    ]
  }
}
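For reference, a rough sketch of the iteration rate this scenario should produce overall. It assumes the rate ramps linearly within each stage and that each iteration sends exactly one HTTP request (the test script is not shown here, so that part is a guess). The function name expectedIterations is only illustrative.

```javascript
// Values copied from the scenario above (timeUnit is "1s", so rates are per second).
const startRate = 0;
const stages = [
  { target: 2000, durationSec: 30 },  // ramp from 0 to 2000 iterations/s
  { target: 2000, durationSec: 120 }, // hold 2000 iterations/s
];

// With a linear ramp, the iterations started in a stage equal the area of a
// trapezoid: (rate at start + rate at end) / 2 * duration.
function expectedIterations(initialRate, stageList) {
  let rate = initialRate;
  let total = 0;
  for (const stage of stageList) {
    total += ((rate + stage.target) / 2) * stage.durationSec;
    rate = stage.target;
  }
  return total;
}

const totalIterations = expectedIterations(startRate, stages);
const totalSeconds = stages.reduce((acc, s) => acc + s.durationSec, 0);
const averageRate = totalIterations / totalSeconds;
console.log(totalIterations, totalSeconds, averageRate);
```

Under these assumptions the run starts about 270,000 iterations in 150 seconds, an overall average of 1,800 per second, with 2,000 per second during the hold stage.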

any ideas why there is such a discrepancy between the two?

thanks!

Are you using InfluxDB 1.x or 2.x?

IIRC it’s influx 1.8 (the last version before the migration to 2.x and Flux)

any ideas? I’m really not sure what’s up with that

Hi @nocgod,
I don’t think I fully understand what you expect here. For instance, it seems the first plot shows what you calculated from http_reqs; but for the second plot, what does the query look like exactly? If it’s the built-in mean function in Influx, IMHO, there might be some specifics in how it’s calculated.

the first plot shows http_reqs as reported by a k6 node into influxDB.
the query is:

SELECT sum("value") / ($__interval_ms / 1000) 
FROM "http_reqs" 
WHERE $timeFilter 
GROUP BY time($__interval), "job_name" 
fill(null)

I’m normalizing to a per-second average to smooth the graph
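A small sketch of what this normalization does, under the assumption that every http_reqs point has value 1, so sum("value") is just the number of points in a bucket. The timestamps are synthetic (a steady 4 requests per second) and ratePerSecond is an illustrative helper, not anything from k6 or Grafana.

```javascript
// Synthetic request timestamps in milliseconds: 4 requests per second for 10 seconds.
const timestampsMs = [];
for (let t = 0; t < 10000; t += 250) {
  timestampsMs.push(t);
}

// Mirrors sum("value") / ($__interval_ms / 1000) with GROUP BY time($__interval),
// assuming each point carries value 1.
function ratePerSecond(timestamps, intervalMs) {
  const counts = new Map();
  for (const t of timestamps) {
    const bucketStart = Math.floor(t / intervalMs) * intervalMs;
    counts.set(bucketStart, (counts.get(bucketStart) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([bucketStart, count]) => ({
      bucketStart,
      rps: count / (intervalMs / 1000),
    }));
}

const perSecond = ratePerSecond(timestampsMs, 1000);      // like GROUP BY time(1s)
const perFiveSeconds = ratePerSecond(timestampsMs, 5000); // like a 5s $__interval
console.log(perSecond, perFiveSeconds);
```

For a steady load both bucket sizes give the same per-second rate. A wider interval only averages out short spikes and should not change the overall level.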

the second plot shows the number of requests received by my gateway solution per second

SELECT mean("rate.mean") 
FROM "XXXX__requests" 
WHERE ("app" = 'XXXX' AND "dc" =~ /^$dc$/ AND "env" =~ /.*$env$/ AND "server" =~ /^$host$/) AND $timeFilter 
GROUP BY time($__interval), "server" 
fill(linear)

In general, I have 2 instruments on my gateway and they seem to agree about the request rate as seen by the gateway. However, I’ve never been able to see “agreement” between the metrics reported by the k6 agents and the gateway metrics.

just to assure you that my “smoothing” doesn’t affect the general accuracy, I’m also attaching 2 images:

SELECT sum("value") 
FROM "http_reqs" 
WHERE $timeFilter 
GROUP BY time(1s), "job_name" 
fill(null)

and

SELECT sum("value") /($__interval_ms / 1000) 
FROM "http_reqs" 
WHERE $timeFilter 
GROUP BY time($__interval), "job_name" 
fill(null)

while the gateway reported roughly 5.5-6K rps in total (2 nodes stacked)

so, what I’m really asking is why there is a difference between what k6 reports to influx and what the gateway really sees. The gateway is reporting the target rates as defined by the scenario.

Hi @nocgod,
sorry for the late reply. I’ll try to help, but I need more data for a better understanding.

Can you post the rate reported in the k6 end-of-test summary, please? The test has a constant rate, so it should have an overall rate closer to 2k.

Have you tried to run k6 without the operator? Are you getting the same metrics?
Do you see the same results when you use multiple pods or just one?