Configuring Loki Alerting with Alertmanager in Kubernetes is not working as expected

I am using Loki Helm chart v5.10.0 (app version 2.8.3) to deploy Loki to Kubernetes.

I would like to add a log-based alert, and I followed this helpful blog post: https://dbadbadba.com/blog/finocchiaro-loki

My understanding is that the alerting rules that I define should only be applied to the loki-read pods.

I have set up rulerConfig as follows:

        rulerConfig:
          storage:
            type: local
            local:
              directory: /var/loki/rules
          rule_path: "/var/loki/rules-temp"
          ring:
            kvstore:
              store: inmemory
          alertmanager_url: http://_http-web._tcp.alertmanager-operated.monitoring.svc.cluster.local:9093
          enable_alertmanager_v2: true
          enable_api: true

For loki-read, I have added a volume mount that refers to a ConfigMap:

      read:
        replicas: 1
        extraArgs:
          - -config.expand-env=true
        extraEnvFrom:
          - configMapRef:
                name: loki-s3-storage
        extraVolumeMounts:
          - name: rules
            mountPath: /var/loki/rules/fake
        extraVolumes:
          - name: rules
            configMap:
              name: loki-alerting-rules

I have a ConfigMap which contains the rules YAML:

apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-alerting-rules
data:
  rules.yaml: |
    groups:
      - name: test_alert
        rules:
          - alert: AlertManagerHealthCheck
            expr: 1 + 1
            for: 1m
            labels:
                severity: critical
            annotations:
                summary: Not an alert! Just for checking AlertManager Pipeline
          - alert: TestAlert
            expr: sum(rate({app="loki"} | logfmt | level="info"[1m])) by (container) > 0
            for: 1m
            labels:
                severity: warning
            annotations:
                summary: Loki info warning per minute rate > 0
                message: 'Loki warning per minute rate > 0 container:"{{`{{`}} $labels.container {{`}}`}}"'

When I deploy, the only log messages relating to rules are:

stderr F level=info ts=2023-08-14T10:45:35.197751973Z caller=ruler.go:499 msg="ruler up and running"
stderr F level=info ts=2023-08-14T10:45:35.197560291Z caller=client.go:255 msg="value is nil" key=rulers/rulers index=1
stderr F level=info ts=2023-08-14T10:45:35.197546357Z caller=basic_lifecycler.go:261 msg="instance not found in the ring" instance=loki-backend-0 ring=ruler
stderr F level=info ts=2023-08-14T10:45:35.197098711Z caller=module_service.go:82 msg=initialising module=ruler
stderr F level=info ts=2023-08-14T10:45:34.181637383Z caller=mapper.go:47 msg="cleaning up mapped rules directory" path=/var/loki/rules-temp

I can see that the rules.yaml file is correctly found in /var/loki/rules/fake. However, there is no /var/loki/rules-temp folder.

No alert is triggered, and no error messages appear in the logs.

Why is my alerting not working?

I am a couple of minor versions behind, and I am not using the new backend architecture yet, so I may be a bit off. But here are a couple of things I'd recommend you try:

  1. The link you were following is also a bit behind. In the latest version of Loki, the ruler now runs in the backend target.

  2. The ruler can pull rules from either local storage or S3, so make sure you place your rules file accordingly.

  3. The directory for the rule files is tenant-specific. The files should be placed at <CONFIGURED_DIRECTORY>/<ORG_ID>/rules.yml, where ORG_ID is fake if auth_enabled is set to false.
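
To make point 3 concrete, here is a rough sketch of how the configured local directory and the tenant ID combine into the path the ruler reads from. It reuses the directory from the question; the top-level `loki:` key and the `auth_enabled: false` setting are assumptions about how the Helm values are laid out, so adjust them to match your own values file.

```yaml
# Sketch only, not a complete values file.
# Assumption: these settings live under the chart's top-level `loki:` key.
loki:
  auth_enabled: false            # assumed; with auth disabled the tenant (ORG_ID) is "fake"
  rulerConfig:
    storage:
      type: local
      local:
        directory: /var/loki/rules   # this is <CONFIGURED_DIRECTORY>

# Expected layout inside the pod that runs the ruler:
#   /var/loki/rules/fake/rules.yaml
#   ^ <CONFIGURED_DIRECTORY>/<ORG_ID>/<rules file>
```

Whichever target runs the ruler is the one that needs the rules file mounted at that path.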


Thank you!

This was exactly the problem: the ruler now runs in the backend target. After switching the volume mounts from read to backend, I can see the alerts appear in Alertmanager.

      backend:
        replicas: 1
        extraVolumeMounts:
          - name: rules
            mountPath: "/var/loki/rules/fake"
        extraVolumes:
          - name: rules
            configMap:
              name: loki-alerting-rules
