The code should’ve been (but isn’t, because of a technical limitation):
const maxVUs = 200;
var data;

if (typeof __VU === "undefined") {
    // this run of the init code only happens so k6 knows which files will be needed
    open("data.json");
} else { // we have __VU, so this is an actual VU initializing
    data = (function () {
        var rawData = JSON.parse(open("data.json")); // data.json is just a big array
        // how big each VU's slice needs to be so the data is divided evenly
        // (with floor a few trailing elements may go unused; ceil would cover everything,
        // but then the last slices could come up short)
        let partSize = Math.floor(rawData.length / maxVUs);
        let part = __VU - 1; // __VU starts from 1, so shift it to a 0-based index
        return rawData.slice(partSize * part, partSize * part + partSize); // only this VU's part
    })();
}

// do stuff with data
Unfortunately … `__VU` is not defined in the init context even when we are actually in a VU, which IMO is a bug, but as previously stated there are other priorities currently that will have an effect on this, so we will fix it when #1007 is merged :).
So we need to come up with some random number instead, and this is what I propose:
const maxVUs = 200;

// we don't check for __VU, as it is never defined
var data = (function () {
    var rawData = JSON.parse(open("data.json")); // data.json is just a big array
    // how big each slice needs to be so the data is divided evenly
    // (with floor a few trailing elements may go unused; ceil would cover everything,
    // but then the last slices could come up short)
    let partSize = Math.floor(rawData.length / maxVUs);
    let __VU = Math.floor(Math.random() * maxVUs); // just pick a random "VU number" in [0, maxVUs)
    return rawData.slice(partSize * __VU, partSize * __VU + partSize); // only that part of the data
})();

// do stuff with data
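To give an idea of the "do stuff with data" part, here is a minimal sketch that continues the snippet above, assuming each element of the array is an object with a `url` property (that property name and the check are made up for the example):

```javascript
import http from "k6/http";
import { check } from "k6";

// ... the `data` definition from above goes here ...

export default function () {
    // pick a random element from this VU's part of the data on every iteration
    let item = data[Math.floor(Math.random() * data.length)];
    let res = http.get(item.url); // "url" is a placeholder property, adjust it to your data
    check(res, { "status was 200": (r) => r.status === 200 });
}
```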
In both cases `maxVUs` needs to be defined by you as well. Given that only the second example currently works, I would recommend that if you have 200 VUs on a machine, you set `maxVUs` to something like 20, so every VU gets 1/20 of the raw data. Obviously in this case `maxVUs` is … not correctly named, so maybe rename it to `dataParts`?
If you are going to spread the test across 4 machines, and if this is applicable in your case, you can also divide the data into 4 parts between the machines.
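A minimal sketch of one way to do that, assuming you tell each machine which quarter is its own through an environment variable passed with k6's `-e` flag (the `INSTANCE` name and the hard-coded counts are just placeholders for this example):

```javascript
const machines = 4;                           // how many machines the test is split across
const instance = Number(__ENV.INSTANCE || 0); // run with e.g.: k6 run -e INSTANCE=2 script.js
const dataParts = 20;                         // same role as maxVUs above

var data = (function () {
    var rawData = JSON.parse(open("data.json"));
    // first cut out this machine's chunk of the array ...
    let machineSize = Math.floor(rawData.length / machines);
    let machineData = rawData.slice(machineSize * instance, machineSize * instance + machineSize);
    // ... then slice it further into parts, exactly as in the examples above
    let partSize = Math.floor(machineData.length / dataParts);
    let part = Math.floor(Math.random() * dataParts);
    return machineData.slice(partSize * part, partSize * part + partSize);
})();
```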
Something I didn’t mention, as it’s usually less of a problem when you have big data arrays that need to be loaded: since k6 v0.26.0 there is a compatibility mode option for k6, which disables some syntax and niceties but also lowers the memory usage … significantly, for scripts that don’t use that much data.
Our benchmarks show a considerable drop in memory usage - around 80% for simple scripts, and around 50% in the case of a 2MB script with a lot of static data in it.
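If you want to try it, it can be enabled with the `--compatibility-mode=base` CLI flag or the `K6_COMPATIBILITY_MODE` environment variable, e.g. `k6 run --compatibility-mode=base script.js`; just keep in mind that the script then has to avoid the newer syntax that base mode no longer supports.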
Hope this helps you!