I’m planning to use k6-operator with AWS Karpenter for auto-scaling our load testing infrastructure. Based on some GitHub issues I’ve seen (like the initialization timeout problems), I want to make sure I configure everything properly from the start.
My setup:
EKS cluster with Karpenter for node provisioning
Custom k6 image (~400MB) with xk6-browser extensions
Planning to run distributed load tests with multiple runner pods
Questions:
Node provisioning delays: Are there any specific Karpenter NodePool configurations recommended for k6 workloads? I’m concerned about the 60-second initialization timeout when new nodes need to be provisioned.
Pod annotations: I’ve seen mentions of karpenter.sh/do-not-disrupt annotation, but there seems to be a Helm chart issue (#477). What’s the current workaround for preventing node disruption during tests?
Image optimization: Any best practices for optimizing large k6 images with browser capabilities? Should I pre-pull images or use specific image pull policies?
Resource requests/limits: What CPU/memory requests work well with Karpenter’s provisioning decisions for k6 runners?
Startup probe configuration: Should I adjust probe settings to account for longer pod startup times on new nodes?
Has anyone successfully run k6-operator at scale with Karpenter? I’d appreciate any lessons learned or configuration examples.
We do not have any ready-made guide on Karpenter, sadly. But I’ve done some k6 testing with it, and IMO, in the general case it’d require some tweaking to get a smooth experience. However, this also heavily depends on the type of testing you’re going to run.
Node provisioning delays
I’m concerned about the 60-second initialization timeout when new nodes need to be provisioned.
Which timeout specifically do you mean here? If it’s about the node becoming ready, then in the general case it won’t matter for the k6-operator test itself; it only means you might have to wait longer for the test to even start. The k6-operator doesn’t have any timeouts of its own at the moment: it’ll wait indefinitely.
Pod annotations: I’ve seen mentions of karpenter.sh/do-not-disrupt annotation, but there seems to be a Helm chart issue (#477)
Karpenter disruptions are certainly an issue for the quality of k6 tests, so they should be prevented. But I don’t follow what the problem with the annotation is: could you link the issue you mean here?
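For what it’s worth, the annotation itself can be set on the runner pods through the TestRun spec. A minimal sketch, assuming the runner.metadata field is available in your k6-operator version; the resource name, script ConfigMap and parallelism are placeholders:

```yaml
apiVersion: k6.io/v1alpha1
kind: TestRun
metadata:
  name: load-test                  # placeholder name
spec:
  parallelism: 4
  script:
    configMap:
      name: load-test-script       # hypothetical ConfigMap holding the test script
      file: test.js
  runner:
    metadata:
      annotations:
        # Asks Karpenter not to voluntarily disrupt the node while this pod is running.
        karpenter.sh/do-not-disrupt: "true"
```

Keep in mind this only blocks Karpenter’s voluntary disruptions (consolidation, drift, expiration); it doesn’t protect you from involuntary ones such as Spot interruptions.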
Image optimization
Unless there is a reason to use a custom image, I’d recommend using the official k6 image. It’s a bit smaller too: closer to 300 MB than 400 MB.
As for pre-pulling: I think it depends on the type of testing. For example, if you’re going to run a large browser test that requires lots of new nodes every few hours, you might indeed want to pre-pull images. In comparison, if it’s once a week, it’s probably fine to pull images as usual. As for how exactly to do that: such caching strategies are outside the scope of k6 and k6-operator, so the general rules apply. You can pre-build node images, for example, or use additional tooling to pre-pull images (see the sketch below). Either way, it certainly makes sense to plan for the testing needs ahead.
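One common piece of “additional tooling” is a DaemonSet whose only job is to pull the image onto every node. A rough sketch with a hypothetical image name; note that on nodes Karpenter creates on demand this only starts pulling once the node is up, so baking the image into the node image/AMI is the more effective option for that case:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: k6-image-prepuller
spec:
  selector:
    matchLabels:
      app: k6-image-prepuller
  template:
    metadata:
      labels:
        app: k6-image-prepuller
    spec:
      initContainers:
        - name: pull-k6
          image: your-registry/k6-with-browser:v0.x   # hypothetical custom image
          command: ["k6", "version"]                  # exits immediately; the pull itself is the point
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9            # tiny placeholder that keeps the pod Running
```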
As for imagePullPolicy: by default, Kubernetes sets it to IfNotPresent, unless the image uses the :latest tag (or no tag at all), in which case it defaults to Always. So, as long as your image is tagged, you might not even need to change it.
Resource requests/limits
Since we’re talking about browsers, this is an open question at the moment. See this issue:
Briefly, browser resource usage heavily depends on your website. There are no general recommendations that will fit every website and every kind of test. k6-browser is, of course, significantly heavier than plain k6: for example, see how much CPU and memory an everyday Chromium instance takes, even when idle. Something similar happens with the k6 browser extension. I very much recommend monitoring your runners from the beginning and tuning resource usage accordingly.
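As a purely illustrative starting point (the numbers are placeholders, not a recommendation), the runner resources can be set on the TestRun and then adjusted based on what your monitoring shows:

```yaml
spec:
  runner:
    resources:
      requests:
        cpu: "1"            # placeholder: browser tests usually need noticeably more than plain k6
        memory: 2Gi
      limits:
        memory: 4Gi         # placeholder: watch for OOMKills and tune from real measurements
```

Explicit requests also help Karpenter here, since its provisioning decisions are based on the pods’ resource requests.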
Startup probe configuration
Do you mean setting a startup probe for the k6 runners? k6-operator doesn’t actually support that right now; only liveness and readiness probes can be set. On the plus side, this means the startup probe has never come up as an issue for anyone. So tweaking the liveness and readiness probes, as needed, should be sufficient.
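For reference, the runner pods expose the k6 REST API on port 6565, which is what the default probes check. A sketch of relaxing the probes if the runners take a while to start responding (the timings are placeholders, and the runner.readinessProbe / runner.livenessProbe fields are assumed to be available in your k6-operator version):

```yaml
spec:
  runner:
    readinessProbe:
      httpGet:
        path: /v1/status
        port: 6565
      initialDelaySeconds: 30   # placeholder: give k6 extra time before the first check (e.g. heavy browser init)
      periodSeconds: 10
      failureThreshold: 6
    livenessProbe:
      httpGet:
        path: /v1/status
        port: 6565
      initialDelaySeconds: 30
      periodSeconds: 10
```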