The number of requests sent by k6 seems to drop intermittently during the test duration


Setup:

I am currently running my k6 test in a container (4 CPU, 16 GiB memory), using an xk6 image I built from the Dockerfile at xk6-output-influxdb/Dockerfile at main · grafana/xk6-output-influxdb · GitHub

I am collecting the metrics and sending them to InfluxDB using xk6, then plotting a requests/sec graph in Grafana.

Test file

I am running the following test against a server:

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  scenarios: {
    cs: {
      executor: 'constant-vus',
      vus: 2,
      duration: '60s',

      // stages: [
      //   { duration: '10s', target: 10 },
      //   { duration: '50s', target: 10 },
      //   { duration: '10s', target: 0 },
      // ]
    }
  }
};

export default function () {
  const url = 'http://ai-gateway-influx:5052/v2/code/completions'; // Replace with your API endpoint
  const payload = JSON.stringify({
    "project_path": "string",
    "project_id": 0,
    "current_file": {
      "file_name": "test",
      "language_identifier": "string",
      "content_above_cursor": "func hello_world(){\n\t",
      "content_below_cursor": "\n}"
    },
    "stream": true,
    "choices_count": 0,
    "context": [],
    "prompt_id": "code_suggestions/generations",
    "prompt_version": 2
  });

  const params = {
    headers: {
      'Content-Type': 'application/json',
    },
  };

  const res = http.post(url, payload, params);

  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  const utcTimestamp = new Date().toISOString();
  const message = `{"pe_iteration":"${__ITER}", "pe_http_method": "${res.request.method}", "pe_http_url": "${res.request.url}", "pe_http_status": "${res.status}", "pe_response_duration": "${res.timings.duration}ms", "pe_utc_timestamp": "${utcTimestamp}"}`;

  console.log(JSON.parse(message));
  sleep(1);
}

Actual Results: When I run the test, I can see that intermittently the number of requests drops and then comes back up again.

Expected Result: I would expect the load generated by k6 to be constant.

You can see in the logs below that there is no log entry for 2025-01-21T01:15:56, although there are entries for 2025-01-21T01:15:55 and 2025-01-21T01:15:57.

The drop is also clearly visible in the Grafana dashboard. In this case it dropped from 2 rps to 0; when I tried more VUs, e.g. 20, it would drop to some lower value (7, 8, or so) and then climb back to 20.

In case you are wondering, this is the query I used in Grafana:

from(bucket: "test")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "http_reqs")
  |> aggregateWindow(every: 1s, fn: count)

I also tried increasing the container's capacity (more CPU and memory), but to no avail; I can still see requests dropping during the test.

Can you help me figure out why this is the case?

Hi @vishalpatel1587 :wave:

I believe the behavior you observe is expected. The good news is that I think there is an existing alternative that offers the behavior you are aiming for.

I think the reason you see varying throughput is the constant-vus executor your script relies upon. The constant-vus executor ensures that, for the duration of the test, k6 maintains a given constant number of VUs online. VUs perform iterations as fast as possible, but how long each iteration takes depends on the actual code in the iteration and on the system/network conditions under which it runs.

Thus, when using the constant-vus executor with 20 VUs, you are instructing k6 to spawn 20 VUs and have them execute the test function as quickly as possible. The only guarantee is that 20 VUs will be doing so; there is no guarantee regarding throughput.

It looks like your goal is to simulate a given throughput. If that is indeed the case, and you aim to express "my service receives 20 req/s for 2h", then I suggest you take a look at the constant-arrival-rate executor instead.
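For reference, here is a sketch of what the scenario from your script could look like with constant-arrival-rate. The `rate`, `preAllocatedVUs`, and `maxVUs` values are illustrative; you would tune them to your target throughput:

```javascript
export const options = {
  scenarios: {
    cs: {
      executor: 'constant-arrival-rate',
      rate: 2,             // iterations started per timeUnit (2 req/s here)
      timeUnit: '1s',
      duration: '60s',
      preAllocatedVUs: 5,  // VUs pre-allocated to sustain the rate
      maxVUs: 10,          // extra headroom if responses are slow
    },
  },
};
```

Note that with this executor the pacing is handled by k6 itself, so you would remove the `sleep(1)` from the default function; otherwise it only ties up VUs without affecting the arrival rate.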

Let me know if I can help further :bowing_man:

@oleiade Thanks for your explanation. You're correct in assuming that my goal is to simulate a given throughput and keep the load constant for a required amount of time (1 minute in my case).

I'll have a look at the constant-arrival-rate executor for my use case. Thanks!