High CPU usage in a script that consumes response bodies

I am testing an API that performs complex computations on customer-specific data sets interactively. I have designed a generic script that emulates realistic, individualized user requests using a pseudo-randomized approach. Since requests are not pre-generated, the script parses the full response body for each request. On top of that, I am using our TypeScript client SDK in the script to interact with the API. This is naturally a somewhat heavier process than just replaying recorded requests, but on the other hand the primary benefit of k6 is that it's programmable. The script sleeps for 1 s after sending a request (because it's emulating interactive user behavior), which I had hoped would give the load generator plenty of time to catch its breath.

The final transpiled script is ~1.2 MB. Response bodies can be on the order of 250–300 KB (for this particular example; other data sets can go as high as a few MB).

When I run locally, my desktop can easily support, for example, 30 VUs (using ~10% CPU on a 16-core 5950X and 1.1 GB RAM for the k6 process).

If I try the same script in the cloud, the load generator only manages to send a third (!) of the requests my desktop PC sends (greatly skewing the results), and I see complaints about CPU usage:

The first spike to 100% occurs already at 5 VUs, and at 6+ VUs it's stuck at 100% for the remainder of the test :frowning: I have tried running with the “base” JS compatibility mode, but it does not help.

I assume the problem is that k6 expects much more lightweight VUs and is allocating too little CPU (and memory)? Is there any way to change this?

If not, is there anything I should try before switching to our own hardware for load generation instead? Is the problem most likely to be the script size, the response size, or the small amount of CPU used to process each response? (JSON deserialization seems to average 5 ms in the cloud (0 ms on my desktop), and the little bit of logic I have takes maybe 2 ms.)
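For context, here is a rough way I measure that deserialization cost. This is a standalone sketch, not the actual test script: the payload shape, the `buildPayload` helper, and the row count are all invented for illustration, chosen only so the serialized body lands in the same ballpark (a couple of hundred KB) as the real responses.

```javascript
// Build a synthetic payload of roughly the size our API returns.
// The structure here is made up purely for this micro-benchmark.
function buildPayload(rows) {
  const data = [];
  for (let i = 0; i < rows; i++) {
    data.push({ id: i, name: `customer-${i}`, values: [i, i * 2, i * 3] });
  }
  return data;
}

// Time a single JSON.parse of a response-body-sized string.
function measureParseMs(jsonText) {
  const start = Date.now();
  const parsed = JSON.parse(jsonText);
  return { parsed, ms: Date.now() - start };
}

const body = JSON.stringify(buildPayload(4000)); // roughly a few hundred KB
const { ms } = measureParseMs(body);
console.log(`body size: ${body.length} bytes, parse time: ${ms} ms`);
```

On my desktop this parse is essentially free, while the same work averages ~5 ms per request in the cloud, which is what makes me suspect the per-VU CPU budget there.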

Hey @ptconfigit, welcome to the forum!

Looking at the data I have available over here, I would say your assumption is correct: you are hitting a hardware limit on the load-generator instance. On the cloud side, we use three different tiers of servers depending on the number of load zones and VUs in your test run (more info here).

As we mention in the article above, generally speaking you will have roughly the same amount of resources per VU available regardless of the server size, and unfortunately they won't be on par with your 16-core high-end desktop processor.

Just so you can get a clearer picture of the hardware we use: they are M5 EC2 instances from AWS, and you can see the different tiers here.

With that being said, some instance customizations are available, but only for Enterprise and Custom annual plans at this time. If you would like more info, you can reach out to our sales team (sales@k6.io).