I have some APIs that I would like to load-test, and I want to find the number of requests per second (RPS) the server can handle before it starts to throttle.
With k6's VUs and scenarios, it is possible to adjust the number of requests sent to the API, and I can configure a certain number of VUs with scenarios or stages. But after the test completes, I only get aggregated values, such as the mean response time per group, the percentage of failed requests, etc. I can't tell at what request rate the response times went up or requests started to fail. How do I find that optimal RPS limit?
If it's any help, and if you already have Docker installed, I have put together a complete Docker-based k6 + Grafana + InfluxDB stack; you'll find it on my GitHub. Once it's started, all the tooling is available on localhost: k6 sends its data into InfluxDB, and you can easily interpret the results visually in Grafana.
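Once the stack is up, you would typically point k6 at the bundled InfluxDB when launching a test, for example with `k6 run --out influxdb=http://localhost:8086/k6 script.js` (the database name and port here are assumptions; they depend on how the stack is configured).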
Regarding the best strategy to find your bottlenecks, my first intuition would be to use the constant-arrival-rate executor and perform a binary-search-style investigation (see the sketch below). Start with a number of requests per second (the rate attribute of the constant-arrival-rate executor) that matches your most conservative expectation, and double that number until you start seeing performance degradation. When your API starts to throttle (error rate going up, timeouts, longer requests), reduce the rate by 25%; if it gets better, add 25%, and so on, until you find the sweet spot.
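To illustrate, here is a minimal sketch of what one step of that search could look like. The URL, rate, duration, and threshold values are placeholders to adjust between runs, and the scenario name is my own:

```javascript
import http from 'k6/http';

export const options = {
  scenarios: {
    constant_rps: {
      executor: 'constant-arrival-rate',
      rate: 50,             // requests per second to attempt; double or step by 25% between runs
      timeUnit: '1s',
      duration: '2m',
      preAllocatedVUs: 100, // VUs reserved to sustain the rate; raise if k6 warns it ran out
      maxVUs: 500,
    },
  },
  thresholds: {
    // If either threshold fails, the current rate is past the sweet spot.
    http_req_failed: ['rate<0.01'],    // less than 1% failed requests
    http_req_duration: ['p(95)<500'],  // 95th percentile under 500 ms
  },
};

export default function () {
  // Hypothetical endpoint; replace with the API under test.
  http.get('https://your-api.example.com/endpoint');
}
```

Between runs, change only the rate as described above; the thresholds give you a clear pass/fail signal at each step, and the per-run series in Grafana show when latency starts climbing.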
Finally, if you have internal telemetry available, you can take the time window during which you ran the test and compare it against your API's own metrics to pinpoint the bottlenecks.