Starter Pod is giving the following error: 'exec /bin/sh: argument list too long' when running with more than 650 pods

The Starter Pod is encountering the following error: ‘exec /bin/sh: argument list too long’ when attempting to run with more than 650 pods. We have a requirement to run with 2000 pods. The Initializer Pod is ready, and all 2000 pods are ready, but the Starter Pod is encountering an error. The operator has attempted to spin up other Starter Pods, but they all encounter errors as well. In our testing, we increased the number of pods in increments of 50, and at 700 pods, we encountered the same error. Do you have a temporary fix or a file we can adjust for the time being?

Hi @ppyneni!

I experimented with this issue, and here are my results:
The starter pod runs a command like this:

sh -c 'curl --retry 3 -X PATCH -H ''Content-Type: application/json''
      -d ''{"data":{"attributes":{"paused":false,"stopped":false},"id":"default","type":"status"}}'';curl
      --retry 3 -X PATCH -H ''Content-Type: application/json''
      -d ''{"data":{"attributes":{"paused":false,"stopped":false},"id":"default","type":"status"}}'''

This is just for two runner pods. In your case, the command is much longer. For me, it starts failing with "argument list too long" once the command line reaches about 131070 characters.
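For context, the figure above is close to the Linux limit on a single argument string (MAX_ARG_STRLEN, 32 pages, which is 131072 bytes with 4 KiB pages), and `sh -c` receives the whole chain of curl calls as one argument. The following is a rough sketch in Python that estimates how many runners fit under that limit. The URL shape (`http://<host>:6565/v1/status`), its position in the command, and the host name template are assumptions for illustration, so the result should be treated as approximate.

```python
# Rough estimate of how many curl calls fit into one `sh -c` argument.
# Assumption: Linux with 4 KiB pages, where one argv string is capped at
# 32 pages (MAX_ARG_STRLEN), including the terminating NUL byte.
MAX_ARG_STRLEN = 32 * 4096

PAYLOAD = (
    '{"data":{"attributes":{"paused":false,"stopped":false},'
    '"id":"default","type":"status"}}'
)


def curl_call(host: str) -> str:
    """Build one curl call; the URL shape is a hypothetical placeholder."""
    return (
        "curl --retry 3 -X PATCH -H 'Content-Type: application/json' "
        f"http://{host}:6565/v1/status -d '{PAYLOAD}'"
    )


def starter_command(hosts: list[str]) -> str:
    """Join one curl call per runner with ';', as in the command above."""
    return ";".join(curl_call(h) for h in hosts)


def max_runners(host_template: str) -> int:
    """Return the largest runner count whose command still fits the limit."""
    total = 0  # bytes used so far, without the terminating NUL
    n = 0
    while True:
        call = curl_call(host_template.format(i=n + 1))
        extra = len(call.encode()) + (1 if n else 0)  # ';' between calls
        if total + extra + 1 > MAX_ARG_STRLEN:  # +1 for the NUL byte
            return n
        total += extra
        n += 1


if __name__ == "__main__":
    # Hypothetical host naming; real names depend on the test run.
    print(max_runners("k6-test-service-{i}"))
```

With a short host name template like the hypothetical one above, the estimate lands in the same ballpark as the 650 to 700 pods reported in this thread; longer host names lower it.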

I don’t really see a workaround for this. Do you really need 2000 pods? Why?

Thank you @bandorko. We have a use case that requires a load test with that many pods. Our workaround is to split the test across multiple deployments, such as 4 * 500 = 2000.
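The split described above can be sketched as a small Python helper that divides a total pod count into the fewest evenly sized deployments that stay under a per-deployment cap. The function name and the cap of 650 are illustrative assumptions taken from the numbers in this thread, not anything provided by the operator.

```python
import math


def split_parallelism(total: int, cap: int) -> list[int]:
    """Split `total` pods into the fewest balanced groups of at most `cap`."""
    if total <= 0 or cap <= 0:
        raise ValueError("total and cap must be positive")
    runs = math.ceil(total / cap)  # fewest deployments that respect the cap
    base, extra = divmod(total, runs)
    # The first `extra` deployments take one additional pod each.
    return [base + 1 if i < extra else base for i in range(runs)]


if __name__ == "__main__":
    print(split_parallelism(2000, 650))  # [500, 500, 500, 500]
```

Each number in the result would be the pod count for one deployment.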

Why do you need so many pods? You can generate a large load with a single pod. I think you only benefit from 2000 pods if you have 2000 Kubernetes nodes.

We have a use case to validate the application’s readiness for a future marketing event, including the UI experience, using k6 browser and the operator. Resource-wise, we are fine, and the test completed with four deployments.