I have the following Helm values for my Loki deployment.
I am running on AWS EKS with two nodes: an untainted node, and a node with a NoExecute taint that I intend to reserve for my production apps.
My Loki Helm installation keeps trying to schedule the -write and -read pods onto the tainted node, whose taint they do not tolerate.
yaml
serviceAccount:
  create: true
  name: loki-s3-storage-policy
  imagePullSecrets: []
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::1234567:role/AmazonEKS_EBS_CSI_DriverRole
  automountServiceAccountToken: true
loki:
  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  storage_config:
    aws:
      region: eu-west-1
      bucketnames: testing.api.logs.chunks
      s3forcepathstyle: false
  pattern_ingester:
    enabled: true
  limits_config:
    allow_structured_metadata: true
    volume_enabled: true
    retention_period: 672h
  querier:
    max_concurrent: 4
  storage:
    type: s3
    bucketNames:
      chunks: testing.api.logs.chunks
      ruler: testing.api.logs.ruler
      admin: testing.api.logs.admin
    s3:
      endpoint: null
      region: eu-west-1
      secretAccessKey: null
      accessKeyId: null
      s3ForcePathStyle: false
      insecure: false
deploymentMode: SimpleScalable
backend:
  replicas: 1
read:
  replicas: 1
write:
  replicas: 1
minio:
  enabled: false
nodeSelector:
  nodeGroup: testing-node-group-t3-large
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "production"
    effect: "NoExecute"
I ran the following command:
helm install --values .\values.yml loki grafana/loki -n loki
I also tried:
helm install loki grafana/loki-simple-scalable --set read.replicas=2 --set write.replicas=2 --set loki.auth_enabled=false --values .\values.yml -n loki
However, I constantly see the following warning in my pods:
bash
kubectl describe pod/loki-backend-0 -n loki
sql
Warning FailedScheduling 4m18s (x2 over 9m25s) default-scheduler
0/2 nodes are available: 1 Too many pods, 1 node(s) had untolerated
taint {dedicated: production}. preemption: 0/2 nodes are available: 1
No preemption victims found for incoming pod, 1 Preemption is not
helpful for scheduling
I added a taint to one of my nodes to allow scheduling only for production apps. My production app has the following toleration:
yaml
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "production"
    effect: "NoExecute"
I’ve tried adding the same toleration to my Loki values (see the bottom of the values), but it’s not working. Preferably, I would like the Loki pods to schedule on the non-production node, but at this stage, I just need them to run.
Can someone help me with the following?
How can I add the toleration to my Loki deployment?
How can I ensure Loki pods only schedule on non-tainted nodes?
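For reference, here is a minimal sketch of what per-component scheduling settings might look like. It assumes the grafana/loki chart reads `tolerations` and `nodeSelector` under each of the `backend`, `read`, and `write` keys rather than at the top level of the values file, which should be checked against the chart version in use. It also assumes the `nodeGroup` label shown is the one carried by the untainted node. The toleration is the same one used by the production app above.

```yaml
# Sketch only: scheduling settings placed under each SimpleScalable component
# instead of at the top level of the values file.
backend:
  replicas: 1
  # Lets the backend pod run on the node tainted dedicated=production:NoExecute.
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "production"
      effect: "NoExecute"
read:
  replicas: 1
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "production"
      effect: "NoExecute"
write:
  replicas: 1
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "production"
      effect: "NoExecute"
  # Alternative, to keep a component off the tainted node: drop its
  # tolerations block and pin it with a nodeSelector instead.
  # Assumption: this label exists on the untainted node.
  # nodeSelector:
  #   nodeGroup: testing-node-group-t3-large
```

Note that the scheduling event also reports "Too many pods" for the other node, so keeping Loki on the untainted node would additionally require free pod capacity there. A pod with no matching toleration is already kept off the tainted node by the taint itself.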