Hi there, we’re using Grafana v10.0.2 (b2bbe10fbc) with unified alerting, running as 2 pods on our Kubernetes cluster. Since we added 500+ alert rules, Grafana has been struggling with CPU and memory. Scaling out to more replicas doesn’t help, so all we could do was increase the memory. Before the alert rules, CPU usage was around 0.2 cores and memory around 200MiB; now CPU usage is around 3 cores and memory is around 1.2GiB. Based on this code, I suppose each Grafana instance fetches all alert rules from the DB and evaluates them in parallel, so adding instances wouldn’t help? Is this the expected resource usage for unified alerting, and is scaling up the pod our only option?
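To make the reasoning above concrete, here is a toy model of why adding replicas would not lower per-pod load if every instance evaluates every rule. It is only a sketch of the hypothesis in this post, not Grafana's actual scheduler code; the function name and the `sharded` flag are made up for illustration, and the sharded case is a hypothetical contrast, not a feature this post claims Grafana has.

```python
def evaluations_per_replica(num_rules: int, num_replicas: int, sharded: bool) -> int:
    """Rule evaluations a single replica performs per evaluation interval.

    Hypothetical model for illustration only:
    - sharded=False: every replica evaluates every rule (the behaviour
      suspected in the post), so the replica count has no effect.
    - sharded=True: rules are split evenly across replicas (a contrast
      case, assumed here purely for comparison).
    """
    if num_replicas < 1:
        raise ValueError("num_replicas must be at least 1")
    if sharded:
        # Ceiling division: the busiest replica gets the rounded-up share.
        return -(-num_rules // num_replicas)
    return num_rules


if __name__ == "__main__":
    for replicas in (1, 2, 4):
        unsharded = evaluations_per_replica(500, replicas, sharded=False)
        split = evaluations_per_replica(500, replicas, sharded=True)
        print(f"replicas={replicas} every-replica-evaluates-all={unsharded} evenly-split={split}")
```

Under the first model, each pod still does the work for all 500 rules no matter how many pods there are, which would match what we see when scaling out.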
Are you using SQLite as Grafana’s database?
No, we’re actually using Postgres.