Grafana Loki Helm Installation, Pending Pods with: 0/2 nodes are available: 1 Too many pods, 1 node(s) had untolerated taint

harrytalkstech · November 1, 2024, 3:34am

I have the following Helm values for my Loki deployment.
I am running on AWS EKS with two nodes: one untainted node, and one node with a NoExecute taint that I intend to reserve for my production apps.

My Loki Helm installation keeps trying to schedule the -write and -read pods onto the tainted node, whose taint they do not tolerate.

yaml


serviceAccount:
  create: true
  name: loki-s3-storage-policy
  imagePullSecrets: []
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::1234567:role/AmazonEKS_EBS_CSI_DriverRole
  automountServiceAccountToken: true

loki:
  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  storage_config:
    aws:
      region: eu-west-1
      bucketnames: testing.api.logs.chunks
      s3forcepathstyle: false
  pattern_ingester:
    enabled: true
  limits_config:
    allow_structured_metadata: true
    volume_enabled: true
    retention_period: 672h
  querier:
    max_concurrent: 4

  storage:
    type: s3
    bucketNames:
      chunks: testing.api.logs.chunks
      ruler: testing.api.logs.ruler
      admin: testing.api.logs.admin
    s3:
      endpoint: null
      region: eu-west-1
      secretAccessKey: null
      accessKeyId: null
      s3ForcePathStyle: false
      insecure: false

deploymentMode: SimpleScalable

backend:
  replicas: 1
read:
  replicas: 1
write:
  replicas: 1

minio:
  enabled: false

nodeSelector:
  nodeGroup: testing-node-group-t3-large

tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "production"
    effect: "NoExecute"

I ran the following command:

helm install --values .\values.yml loki grafana/loki -n loki

I also tried:

helm install loki grafana/loki-simple-scalable --set read.replicas=2 --set write.replicas=2 --set loki.auth_enabled=false --values .\values.yml -n loki

However, I constantly see the following warning in my pods:

bash

kubectl describe pod/loki-backend-0 -n loki

Warning FailedScheduling 4m18s (x2 over 9m25s) default-scheduler
0/2 nodes are available: 1 Too many pods, 1 node(s) had untolerated
taint {dedicated: production}. preemption: 0/2 nodes are available: 1
No preemption victims found for incoming pod, 1 Preemption is not
helpful for scheduling

I added a taint to one of my nodes to allow scheduling only for production apps. My production app has the following toleration:

yaml

tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "production"
    effect: "NoExecute"

I’ve tried adding the same toleration to my Loki values (see the bottom of the values), but it’s not working. Preferably, I would like the Loki pods to schedule on the non-production node, but at this stage, I just need them to run.

Can someone help me with the following?

How can I add the toleration to my Loki deployment?
How can I ensure Loki pods only schedule on non-tainted nodes?

jangaraj · November 1, 2024, 7:44am

One node has a taint, and the other has reached its pod limit and cannot start more pods. This is not a problem with your Loki config but with your EKS cluster: you don’t have capacity for more pods. Ask your EKS admin for more resources so you can run more pods.
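
On the two questions themselves, here is a minimal sketch of how the toleration might be passed to the chart. It assumes the grafana/loki chart reads scheduling settings per component (under backend, read and write) and does not apply the top-level nodeSelector and tolerations keys from the values above. That is an assumption to check against the values reference for your chart version, not something confirmed in this thread. The taint key, value and effect are copied from the question.

```yaml
# Sketch only: assumes scheduling keys are set per component in the grafana/loki chart.
# Each toleration matches the taint {dedicated: production} with effect NoExecute,
# which allows (but does not force) the pods to run on the production node.
backend:
  replicas: 1
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "production"
      effect: "NoExecute"
read:
  replicas: 1
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "production"
      effect: "NoExecute"
write:
  replicas: 1
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "production"
      effect: "NoExecute"
```

To keep Loki off the production node instead, leave the tolerations out and, under the same assumption, put the nodeSelector from the question (nodeGroup: testing-node-group-t3-large) under each of backend, read and write. That only helps once the untainted node has free pod capacity, since the scheduler currently rejects it with "Too many pods". Other pods the chart creates may need their own equivalent settings.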
