Running each pod on a separate EKS node (with node groups)

Hey everyone,

I’m using the k6 operator and trying to run a distributed load test on EKS with AWS auto-scaling node groups.

I’ve set up the cluster autoscaler, and it does autoscale my nodes if I set resource requests like this in my K6 CRD:

        resources:
          limits:
            cpu: 600m
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 1Gi

However, my goal is to have every pod during my load test be run on a separate node (which means I’d like my node group to autoscale to the size of parallelism in the CRD).

I tried doing this with pod anti-affinity rules and couldn’t get it to work.

I then tried using separate: true as outlined in the operator documentation and I’m getting some strange behaviour.

For example, if I have parallelism set to 10 and an EKS node group with a minimum of 2 and a maximum of 10 nodes, setting separate: true creates just one additional node, giving me 3 nodes, and all the other pods remain in a Pending state.

If I cancel and run the test again, the same thing happens: I get one more node, for a total of 4, and the remaining pods stay Pending.

Any idea why this is happening? I’d appreciate any help.

Here’s my CRD file:

apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
  labels:
    app: load
spec:
  parallelism: 10
  script:
    configMap:
      name: "crocodile-stress-test"
      file: "test.js"
  separate: true
  arguments: --out statsd
  runner:
    metadata:
      labels:
        app: load
    # resources:
    #   limits:
    #     cpu: 600m
    #     memory: 1Gi
    #   requests:
    #     cpu: 100m
    #     memory: 1Gi
    env:
      - name: K6_STATSD_ADDR
        value: "statsd-service:8125"
    # affinity:
    #   podAntiAffinity:
    #     requiredDuringSchedulingIgnoredDuringExecution:
    #       - labelSelector:
    #           matchExpressions:
    #             - key: app
    #               operator: In
    #               values:
    #                 - load
    #         topologyKey:

Hi @elguaposalsero,
Welcome to the forum :wave:

It sounds like there’s an issue with your EKS or cluster-autoscaler setup. separate: true should have been enough to allocate additional nodes in the scenario you described. I’d recommend trying to find out whether cluster-autoscaler is healthy and what reason exactly is given for FailedScheduling:
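
To see the exact reason, you can describe one of the pending runner pods and look at its Events section, and check the cluster-autoscaler logs. The pod and namespace names below are placeholders, and I’m assuming the default cluster-autoscaler deployment name in kube-system:

        kubectl describe pod <pending-runner-pod> -n <your-namespace>
        kubectl -n kube-system logs deployment/cluster-autoscaler

The FailedScheduling event on the pending pod usually says exactly why the scheduler (and, in turn, the autoscaler) couldn’t place it.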

Checking whether there are any known issues related to your specific versions of EKS and cluster-autoscaler might also help.
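
On the anti-affinity attempt: the commented-out topologyKey in your CRD has no value, and a required podAntiAffinity only spreads pods one per node when topologyKey is set to kubernetes.io/hostname. A sketch of what that could look like in the runner section, reusing your app: load label (an illustration, not tested against your cluster):

        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - load
                topologyKey: kubernetes.io/hostname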

Hope that helps!
