Optimizing memory usage when running scenarios that POST different large JSON payloads

Hi,

I have been struggling to get the memory usage of my tests down to something manageable. The environment where k6 is executed currently has a 4 GB limit. On a local system, when I run all my scenarios, the memory usage spikes to ~9.5 GB.

The test involves sending a POST request to an endpoint – the data (JSON) can vary in size, so I have multiple flavors of the payload being sent at different rates by different scenarios to mimic a production environment. From the reading I have done, a SharedArray seems like the right approach. Currently I have something like the following defined:

// Assumed to live in a separate module (e.g. ./constants.js), since the
// test function below accesses it as constants.payloads.
const k6Data = require('k6/data');

module.exports.payloads = new k6Data.SharedArray('payloads', function () {
    const _payloads = [
        open('./20.0KB.json'),  // 50 RPS, 20 VU for constant-arrival-rate scenario
        open('./100.0KB.json'), // 0.03 RPS, 1 VU for constant-arrival-rate scenario
        open('./250.0KB.json'), // 0.03 RPS, 1 VU for constant-arrival-rate scenario
        open('./500.0KB.json'), // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./750.0KB.json'), // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./1.0MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./1.5MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./2.5MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./3.5MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./4.5MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./5.5MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./6.0MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./7.0MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./8.0MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./9.0MB.json'),   // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./10.0MB.json'),  // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./12.5MB.json'),  // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./17.5MB.json'),  // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./20.0MB.json'),  // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./25.0MB.json'),  // 0.02 RPS, 1 VU for constant-arrival-rate scenario
        open('./30.0MB.json'),  // 0.02 RPS, 1 VU for constant-arrival-rate scenario
    ];

    return _payloads;
});

/*
Scenarios for each payload look something like:
module.exports.options = {
    scenarios: {
        "20KB_constant": {
            executor: 'constant-arrival-rate',
            rate: 50,
            timeUnit: '1s',
            duration: '15m',
            preAllocatedVUs: 20,
            env: { PAYLOAD: "0" },
            exec: 'MyTest',
        },
        ⋮
        "500KB_constant": {
            executor: 'constant-arrival-rate',
            rate: 1,
            timeUnit: '50s',
            duration: '15m',
            preAllocatedVUs: 1,
            env: { PAYLOAD: "3" },
            exec: 'MyTest',
        },
        ⋮
    },
};
*/

const http = require('k6/http');
const constants = require('./constants.js'); // assumed path for the payloads module above

module.exports.MyTest = function () {
    const _payload = Number(__ENV.PAYLOAD);

    // endpoint and auth_params are defined elsewhere in the script
    const res = http.post(
        endpoint,
        constants.payloads[_payload],
        auth_params
    );
}

I am a little confused, to be honest, about why the memory usage spikes to 9+ GB. From reading the docs, the SharedArray itself shouldn't take up much memory – extra space is only used when an element is accessed, at which point the requester gets a copy of that element. Given that the larger payloads are only sent once every 50 seconds for most (once every 30 seconds for a couple of the smaller large ones), and that extra space is only taken on access, I would have thought the memory usage wouldn't be so high. Instead, it almost seems like every VU gets a complete copy of the SharedArray – maybe I have just misunderstood the purpose of the SharedArray in that case.
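My mental model of the copy-on-access behavior, for what it's worth:

// Memory for an element is only paid when it is accessed – each access
// hands the requesting VU its own fresh copy of that element:
const oneCopy = constants.payloads[3]; // a copy of the 500KB payload
const another = constants.payloads[3]; // a second, independent copy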

The docs do mention:

"You can have multiple SharedArrays and even load only some of them for given VUs, though this is unlikely to have any performance benefit."

I am unsure whether that would help in my situation, though, and there is no explanation of how to implement it – my best guess is sketched after the command below. Also, for extra context, I am running under base compatibility mode; while that has certainly cut down my init time, it has not helped with memory usage:

./k6 run --compatibility-mode=base --log-output=none ./my_test.js
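For what it's worth, here is my best guess at the multiple-SharedArrays idea: a hypothetical split into one array per payload size, where each scenario points exec at its own function and therefore only ever touches (and pays the copy cost for) its own array. The names here are mine, and endpoint/auth_params are as in my script above:

const k6Data = require('k6/data');
const http = require('k6/http');

// One SharedArray per payload size instead of one array holding all of them.
const payload20KB = new k6Data.SharedArray('20KB', function () {
    return [open('./20.0KB.json')];
});
const payload500KB = new k6Data.SharedArray('500KB', function () {
    return [open('./500.0KB.json')];
});
// ... and so on for the other sizes

// Each scenario's exec names the matching function, so a VU only ever
// deserializes elements from its own array.
module.exports.My20KBTest = function () {
    http.post(endpoint, payload20KB[0], auth_params);
};
module.exports.My500KBTest = function () {
    http.post(endpoint, payload500KB[0], auth_params);
};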

Are there any improvements I can make, or a different approach I could take to lower memory usage?

I've gotten it down to 5.9 GB (max peak) by wrapping each open(...) call with JSON.parse(). Not sure how this is working (wouldn't an object in memory take up more space?) – if anyone has an explanation I'd love to know. In any case, during the test I just end up calling JSON.stringify() right before sending – roughly the change sketched below. Still looking to trim at least another 2.5 GB somehow.
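Roughly, the parse-on-load change looks like this (same requires and names as above):

module.exports.payloads = new k6Data.SharedArray('payloads', function () {
    return [
        JSON.parse(open('./20.0KB.json')), // store the parsed object...
        // ... same JSON.parse(open(...)) wrapping for every other size
    ];
});

module.exports.MyTest = function () {
    const _payload = Number(__ENV.PAYLOAD);
    // ...and serialize back to a string right before sending.
    http.post(endpoint, JSON.stringify(constants.payloads[_payload]), auth_params);
};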

I got the peak memory usage for all scenarios down to 4.3 GB. I think I may just be pushing the limits of the tool at this point. I squeezed out some more savings by aggressively cleaning up variables within each iteration – something like the snippet below. If I leave out one of the larger scenarios, I can run the rest with a peak memory usage under 4 GB. This works for me right now – I will probably end up looking into whether I can allocate more memory on the host container.
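The cleanup is nothing clever – just dropping references as soon as the request has been made, something like:

module.exports.MyTest = function () {
    let body = JSON.stringify(constants.payloads[Number(__ENV.PAYLOAD)]);
    http.post(endpoint, body, auth_params);
    // Drop the reference before the VU idles until its next iteration
    // (which can be 30–50 seconds away for the large payloads).
    body = null;
};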

I noticed today that even loading a single 30 MB JSON file balloons the application's memory usage to nearly 1 GB. I also managed to get the memory usage down below 2.5 GB: I ended up using the open(...) function from the experimental fs library for payloads larger than 100 KB. The smaller ones, which get sent more often than once a second, benefit from staying in memory (I just use the regular open(...) for those), but the larger ones, which are used much less frequently (once every 30 or 50 seconds), can simply be read right before making the POST request – roughly as sketched below. Anyway, just wanted to follow up and close this out in case anyone else runs into similar issues and comes across this post.
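For anyone curious, a rough sketch of the read-before-POST pattern I landed on, based on the k6/experimental/fs docs. This assumes a k6 version recent enough for the experimental fs module and native async/await; endpoint and auth_params are as before, and the function name is mine:

const { open, SeekMode } = require('k6/experimental/fs');
const http = require('k6/http');

let file;
(async function () {
    // open() from the experimental fs module is async and must run in the init context.
    file = await open('./30.0MB.json');
})();

// Read the whole file into a fresh buffer, rewinding first since the
// read position persists across iterations.
async function readAll(f) {
    await f.seek(0, SeekMode.Start);
    const info = await f.stat();
    const buffer = new Uint8Array(info.size);
    const bytesRead = await f.read(buffer);
    if (bytesRead !== info.size) {
        throw new Error('unexpected number of bytes read');
    }
    return buffer;
}

module.exports.My30MBTest = async function () {
    const body = await readAll(file);
    // http.post accepts an ArrayBuffer body.
    http.post(endpoint, body.buffer, auth_params);
};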