k6 cannot execute JS scripts if the CSV file is too large

My local machine has 8 cores and 16 GB of RAM. When the script loads the JSON file, it consumes about 10 GB of memory; the CPU load average is not very high.


When the script does not load the JSON file, it consumes only about 0.5 GB of memory, and the progress bar (===>) moves normally.


This all seems to say that the system you are testing cannot handle the 10k QPS you want it to.

You haven’t specified this, but given the output it seems that you just configured it to run for 5 minutes with 100 VUs and no particular number of iterations, so it did just that.

As such, the progress bar follows the elapsed time (5 minutes) rather than the number of iterations k6 was told to run.
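
For reference, a minimal sketch of such a duration-based configuration (my assumption of roughly what was run, with a placeholder URL, not your actual script):

import http from 'k6/http';

export let options = {
  vus: 100,       // 100 concurrent VUs
  duration: '5m', // run for 5 minutes, no iteration target
};

export default function () {
  http.get('https://test.example.com/api'); // placeholder endpoint
}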

In practice, that run seems to have completed slightly fewer iterations overall than the run with the JSON loading.

And again, this is a lot less than a quarter of the 10k QPS you want, so even without the huge JSON you can see that the system under test can’t handle the required load. This is exactly what I told you to test first, before trying to run with 10 million records in a JSON file :wink:
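
If you want to verify whether the system can actually sustain a target request rate, an arrival-rate scenario is the usual way to express that in k6. A minimal sketch, assuming a 10k-per-second target and a placeholder URL (preAllocatedVUs/maxVUs would need tuning to your response times):

import http from 'k6/http';

export let options = {
  scenarios: {
    verify_rate: {
      executor: 'constant-arrival-rate',
      rate: 10000,           // iterations started per timeUnit
      timeUnit: '1s',        // i.e. 10k iterations per second
      duration: '5m',
      preAllocatedVUs: 1000, // VUs reserved up front
      maxVUs: 5000,          // upper bound if responses get slow
    },
  },
};

export default function () {
  http.get('https://test.example.com/api'); // placeholder endpoint
}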

I guess now it is time to figure out how to make your system faster. Good luck, and you can always come back with more k6 questions.

Thanks a lot @mstoykov, under your guidance I figured out the problem. Thank you again for your efforts.

The 10k QPS is not actually the requirement for the API I’m currently testing; it’s another API’s performance requirement. I’m just using this API to run experiments.

Lastly, I have one more question to consult you on: regarding the solution @PlayStay mentioned above, where folks have split their data files, can you provide some examples for reference? I’ll then consider whether to optimize my k6 script.

I would guess @PlayStay was referencing the second of the old workarounds, the one presented with multiple CSV files.
With this script:

import { SharedArray } from 'k6/data';
import { sleep } from 'k6';

const dataFiles = [
  './data_1.json',
  './data_2.json',
  './data_3.json',
  './data_4.json',
  './data_5.json',
  './data_6.json',
  './data_7.json',
  './data_8.json',
  './data_9.json',
  './data_10.json',
];

let data;

if (__VU == 0) {
  // workaround to collect all files for the cloud execution or archives
  for (let i = 0; i < dataFiles.length; i++) {
    const dataFile = dataFiles[i];
    new SharedArray(dataFile, () => JSON.parse(open(dataFile)));
  }
} else {
  // each VU only parses the one file assigned to it
  const dataFile = dataFiles[__VU % dataFiles.length];
  data = new SharedArray(dataFile, () => JSON.parse(open(dataFile)));
}

export default function () {
  const user = data[0];
  sleep(1);
}

I got <3GB of starting memory usage, but it still took a minute to start the test.

I would argue this matters if you don’t have enough memory; otherwise it just means you need to split the files and then pick the correct one in each VU. Not exactly the hardest thing, but arguably not something you need to do if it isn’t needed.
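
If you do go that route, the splitting itself is a one-off job outside of k6. A hypothetical Node.js sketch (the file names, the chunk count, and the assumption that big_data.json holds a single top-level JSON array are all just for illustration) that produces the data_1.json ... data_10.json files used above:

// split.js - run once with `node split.js`
const fs = require('fs');

const records = JSON.parse(fs.readFileSync('./big_data.json', 'utf8'));
const parts = 10;
const chunkSize = Math.ceil(records.length / parts);

for (let i = 0; i < parts; i++) {
  const chunk = records.slice(i * chunkSize, (i + 1) * chunkSize);
  fs.writeFileSync(`./data_${i + 1}.json`, JSON.stringify(chunk));
}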

Hi @sunnini, here’s an example of a scenario I use.

// peak, peak_ramp, peak_sustain, ramp_down and after_peak_delay are assumed
// to be defined elsewhere in the script (e.g. derived from environment variables)
export let options = {
    scenarios: {
        peak: {
            // peak scenario name
            executor: 'ramping-arrival-rate',
            startRate: 0,
            timeUnit: '1s',
            preAllocatedVUs: 50,
            maxVUs: 20000,
            stages: [
                { target: peak, duration: peak_ramp },
                { target: peak, duration: peak_sustain },
                { target: 0, duration: ramp_down },
            ],
            gracefulStop: after_peak_delay, // how long to let in-flight iterations finish at the end
            tags: { test_type: 'peak' }, // extra tags for the metrics generated by this scenario
            exec: 'peak_gate_rush', // the function this scenario will execute
        },
    },
};

This produces the desired load profile and keeps throughput constant marvelously. There are some drawbacks when the system under test becomes slow to respond, which is very expensive in terms of VU usage, but that’s another story.
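
The function referenced by exec: 'peak_gate_rush' isn’t shown in the snippet above; a hypothetical sketch of what such an exported function could look like (the URL and check are placeholders, not the real test logic):

import http from 'k6/http';
import { check } from 'k6';

export function peak_gate_rush() {
  const res = http.get('https://test.example.com/gate'); // placeholder endpoint
  check(res, { 'status is 200': (r) => r.status === 200 });
}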

Here’s an example of how I solved my memory issues; it does not require all the files to be loaded at script startup. As I described in the procedure above, I create 25 data files, each less than 25 MB and containing 100K records. I append a number to the partitioned data files and then select a file based on the mod operator, so that only one file of a manageable size is loaded rather than the whole data set. I forget why I chose 15000, but it works for me :smiley:. In my prod environment we use the million records partitioned into 25 data files (yes, we load test in production at scale). For staging and non-prod we only use one data file, so I hard-code the number selector to 0, hence the switch statement.

let count;
switch (deploymentEnvironment) {
    case "prod":
        // pick one of the 25 partitioned data files pseudo-randomly
        count = Math.floor(Math.random() * 15000) % 25;
        break;
    case "staging":
        count = 0;
        break;
}

// data_file is the base path/prefix of the partitioned files, defined elsewhere
const accounts = new SharedArray('accounts', function () {
    return JSON.parse(open(data_file + count)).key;
});

Seed data example with one record:

{
  "key": [
    {
      "ACCOUNT_ID": "accountID"
    }
  ]
}
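
With that seed format, each element of the SharedArray is one object from the key array. A hypothetical usage sketch inside an iteration (it relies on the accounts SharedArray built above):

export default function () {
  // pick a random account from the SharedArray built in the init context
  const account = accounts[Math.floor(Math.random() * accounts.length)];
  // account.ACCOUNT_ID holds the value ("accountID" in the one-record example)
}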

Hope this helps.

Hello @mstoykov @PlayStay, sorry for the late reply. With your help, I solved the problem successfully. Thanks very much again.
