Grafana out of memory issue

Hi team, I’m new to Grafana. I’m trying to create alert rules in Grafana with a TimescaleDB data source. My alert is a basic one that takes the last 12 hours of data from a TimescaleDB hypertable, groups on one column to find the sum (see attachment), and raises an alert if the sum goes beyond 4.
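The query is roughly like the following (a simplified sketch; the table and column names here are placeholders, not my actual schema):

```sql
-- Simplified sketch of the alert query (placeholder table/column names)
SELECT
  device_id,
  SUM(value) AS total
FROM metrics
WHERE $__timeFilter("time")   -- alert evaluates over the last 12 hours
GROUP BY device_id;
```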
But when I create this rule and data flows into the source table, Grafana starts consuming memory without any limit or garbage collection, goes beyond 16 GB, and at some point fails and restarts the pod. The error is an out-of-memory error.
Please let me know what I’m doing wrong.




Welcome @krishnaprasadas1

How much RAM do you have on the server running Grafana? What happens if you run the same query outside of Grafana, and how much memory does it consume there?

It has ~32 GB of RAM I guess (I don’t have the exact number with me now since it’s handled by another team), but I can see the usage goes beyond 20 GB in the attached image. The data in TimescaleDB totals ~400 MB (2 million rows). I just tried the query in psql and it uses minimal memory, less than a GB. What confuses me is why the memory keeps increasing and never comes down until it throws the OOM error. I tried the alert rule both with and without a reduce expression, but no luck.
One more thing: data of the same size is used in another Grafana instance with an InfluxDB data source, and there it runs perfectly with a small memory footprint (~200 MB only). Please find attached the alert rule created for InfluxDB.


OK, can you now please try the same query in a basic bar chart panel, not within an alert, and see if it pegs the memory?

I’m trying to narrow down whether the issue is with alerts + TimescaleDB.

I tried the same query in a dashboard, but it only uses around 350 MB of memory.


But did you also set it up to run every minute for 1 hour and do a reduce on the last sum > 4?


What are your Grafana pod memory requests/limits?

I believe the Grafana pod is able to see 32 GB of memory, so there is no pressure for the Go runtime to run garbage collection. But in the meantime you reach the K8s memory limit and the pod is terminated because of OOM.

You may try setting the env variable GOGC=10 for the Grafana pod to modify the default GOGC=100 behaviour. Even better is to set the GOMEMLIMIT env variable, so the Go app (Grafana in this case) is aware of how much memory it has available. You can use the K8s Downward API for that.
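A minimal sketch of what that could look like in the container spec, assuming the container is named grafana (the Downward API injects the memory limit as a plain byte count, which GOMEMLIMIT accepts):

```yaml
# Container spec fragment (sketch) - derive GOMEMLIMIT from the pod's own memory limit
containers:
  - name: grafana
    image: grafana/grafana
    resources:
      limits:
        memory: 4Gi
    env:
      - name: GOMEMLIMIT
        valueFrom:
          resourceFieldRef:
            containerName: grafana
            resource: limits.memory   # injected in bytes, e.g. 4294967296
```

In practice you may want GOMEMLIMIT somewhat below the container limit, so there is headroom for non-heap memory.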


I set GOMEMLIMIT to 3 GB and the pod limit to 4 GB, and now there are no restarts happening, thank you. But it still uses the maximum memory available and runs GC at regular intervals. I’m still not sure why Grafana uses this much memory for a simple alert rule. Are there any optimizations to be done on my alert configuration? The reason I ask is that the same alert runs with 350 MB of memory on InfluxDB (with the same data), and once I changed it to run on TimescaleDB the memory use became very high. I expect that optimizing the alert query in the rule may improve the memory usage, but I’m not sure how to do that. The options in the query builder work differently for different data sources.
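For anyone following along, the settings that stopped the restarts look roughly like this (a sketch of my deployment fragment, using the fixed values mentioned above rather than the Downward API):

```yaml
# Deployment container fragment (sketch) - fixed values
resources:
  limits:
    memory: 4Gi
env:
  - name: GOMEMLIMIT
    value: "3GiB"   # keep the Go heap target below the 4Gi container limit
```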

:man_shrugging: You can try profiling and analyzing the memory/heap usage, but that is very low-level access to a Go app.
You need to decide whether it is worth it: a few hours of your profiling vs. the price of 4 GB of memory.

Thank you guys for the support. The issue got resolved when I changed the $__timeFilter(time) WHERE condition to an explicit time condition like time > NOW() - interval '3 days'. Sharing in case it helps someone.
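In other words, the change was roughly the following (simplified; table and column names are placeholders for my actual schema):

```sql
-- Before: Grafana macro expanded to the alert's evaluation time range
-- WHERE $__timeFilter("time")

-- After: explicit time predicate
SELECT device_id, SUM(value) AS total
FROM metrics
WHERE "time" > NOW() - interval '3 days'
GROUP BY device_id;
```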
