I am running into a weird issue with the nodeselector option, the separate flag, and parallelism.
Whenever I set the nodeselector value for my pods, the pods get stuck in a Pending state, with the scheduler claiming that the node affinity is not met…
If I set the parallelism value to 1 with the separate flag set to true + a nodeselector, everything works fine.
If I set the parallelism value to more than 1 with the separate flag set to false + a nodeselector, everything works fine.
If I set the parallelism value to more than 1 with the separate flag set to true and without a nodeselector, everything works fine.
However, if I set the parallelism value to more than 1 with the separate flag set to true + a nodeselector, the starter never executes and all the pods are stuck in Pending.
Here are the nodeselector settings from my spec:
initializer:
  nodeselector:
    node: k6
starter:
  nodeselector:
    node: k6
runner:
  nodeselector:
    node: k6
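For context, the resource looks roughly like this. This is a trimmed-down sketch: the parallelism, separate, and script values are inferred from the pod definition below rather than copied verbatim, so treat them as approximate.
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: timestamp
  namespace: k6
spec:
  parallelism: 4   # inferred from --execution-segment-sequence=0,1/4,2/4,3/4,1 in the pod below
  separate: true   # the flag that generates the required podAntiAffinity shown in the pod definition
  script:
    configMap:
      name: timestamp-tyk-configmap   # taken from the pod's volume; file name inferred from the command
      file: timestamp.js
  initializer:
    nodeselector:
      node: k6
  starter:
    nodeselector:
      node: k6
  runner:
    nodeselector:
      node: k6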
Here is one of the pod definitions:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cloud.google.com/cluster_autoscaler_unhelpable_since: 2023-08-12T15:01:06+0000
    cloud.google.com/cluster_autoscaler_unhelpable_until: Inf
  creationTimestamp: "2023-08-12T15:01:03Z"
  finalizers:
  - batch.kubernetes.io/job-tracking
  generateName: timestamp-2-
  labels:
    app: k6
    batch.kubernetes.io/controller-uid: 99b6f5e8-35a5-49b7-b842-9853d67e2061
    batch.kubernetes.io/job-name: timestamp-2
    controller-uid: 99b6f5e8-35a5-49b7-b842-9853d67e2061
    job-name: timestamp-2
    k6_cr: timestamp
    runner: "true"
  name: timestamp-2-9c5zf
  namespace: k6
  ownerReferences:
  - apiVersion: batch/v1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: timestamp-2
    uid: 99b6f5e8-35a5-49b7-b842-9853d67e2061
  resourceVersion: "141199"
  uid: 6cfe5a60-de09-4d3b-b2bf-6800c45c3eb0
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - k6
          - key: runner
            operator: In
            values:
            - "true"
        topologyKey: kubernetes.io/hostname
  automountServiceAccountToken: true
  containers:
  - command:
    - k6
    - run
    - --execution-segment=1/4:2/4
    - --execution-segment-sequence=0,1/4,2/4,3/4,1
    - --out
    - experimental-prometheus-rw
    - --tag
    - testid=tyk-timestamp-keyless-6aUq31pW8f
    - /test/timestamp.js
    - --address=0.0.0.0:6565
    - --paused
    - --tag
    - instance_id=2
    - --tag
    - job_name=timestamp-2
    env:
    - name: K6_PROMETHEUS_RW_SERVER_URL
      value: http://prometheus-server.dependencies.svc:80/api/v1/write
    - name: K6_PROMETHEUS_RW_PUSH_INTERVAL
      value: 1s
    - name: K6_PROMETHEUS_RW_TREND_AS_NATIVE_HISTOGRAM
      value: "true"
    image: ghcr.io/grafana/operator:latest-runner
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /v1/status
        port: 6565
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    name: k6
    ports:
    - containerPort: 6565
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /v1/status
        port: 6565
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /test
      name: k6-test-volume
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-dqmj2
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostname: timestamp-2
  nodeSelector:
    node: k6
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 0
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - configMap:
      defaultMode: 420
      name: timestamp-tyk-configmap
    name: k6-test-volume
  - name: kube-api-access-dqmj2
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-08-12T15:01:03Z"
    message: '0/5 nodes are available: 1 node(s) didn''t match pod anti-affinity rules,
      4 node(s) didn''t match Pod''s node affinity/selector. preemption: 0/5 nodes
      are available: 1 No preemption victims found for incoming pod, 4 Preemption
      is not helpful for scheduling..'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: BestEffort
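If I am reading the scheduler message right, only one of the five nodes matches the node: k6 selector, and the required podAntiAffinity on kubernetes.io/hostname (which I assume comes from the separate flag) means every runner needs its own matching node. The nodes that actually carry the label can be listed with:

kubectl get nodes -l node=k6 --show-labels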
Any ideas?