Querier memory leak

During query execution the queriers' RAM usage goes up as expected; however, after the results are returned to Grafana, the RAM doesn't go down. When running the exact same query again, the RAM usage stays at the same level.

I’m running Loki on a VM deployment.

I’m assuming (but not sure) that queriers keep chunks for queries in memory and never get rid of them.

Loki was meant to be deployed in Kubernetes, where a memory limit is much easier to enforce on a container. When a container shuts down due to an OOM, another one quickly takes its place, so never getting rid of chunks in memory makes sense, I guess. However, I don’t have the ability to deploy Loki in Kubernetes and am looking for a solution that fits a VM deployment.

Any help would be greatly appreciated.

You can use systemd parameters in the unit file:

  • for restarts => Restart=always or Restart=on-failure
  • for the memory limit => MemoryMax=
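A minimal sketch of what that could look like as a systemd drop-in override. The service name (loki.service) and the 4G cap are assumptions; adjust them to your deployment:

```ini
# /etc/systemd/system/loki.service.d/override.conf  (hypothetical path)
[Service]
# Restart the querier automatically if it is OOM-killed or crashes
Restart=on-failure
# Hard memory cap; systemd/cgroups kills the service if it exceeds this
MemoryMax=4G
```

After adding the file, reload systemd and restart the service (`systemctl daemon-reload`, then `systemctl restart loki`) for the limits to take effect.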

Thanks for taking the time to answer.

This will probably be our go-to temporary solution.

We did, however, discover that we had misunderstood some querier and frontend configurations; changing them made Loki perform much better. I’ll update here if the problem reoccurs with the new configuration once we get some data from challenging queries.

We are facing the same issue on our k8s setup. Did you make any specific changes to the querier and frontend configurations? And by roughly what percentage did Loki’s performance improve?

Thanks


The problem was a configuration one; it’s actually kind of silly.

frontend_worker:
  parallelism: This should be at most the number of CPU cores in your querier.

This fixed the problem I mentioned in this post: we started getting consistent results, the RAM went down after executing queries, and results came back faster. We had misunderstood what this configuration meant, which was quite foolish on our end.

However, our querying times (for metric queries) are still extremely slow. Without caching, presenting dashboards built on Loki metric queries in Grafana is nearly unusable. We are still investigating this issue.

Moreover, the recommended production configuration suggests setting:

parallelism = (CPU cores per querier) / (number of frontend instances)

We have 2 frontends. I’m not sure exactly how the worker–frontend relationship works, but when we set parallelism to more than half of the querier CPUs (half because we have 2 frontends), we weren’t getting faster results despite CPU usage going up, which is weird. For now we’ve decided to stick with half the querier CPUs for this configuration.
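As a worked example of the formula above, under assumed numbers (8 CPU cores per querier and 2 query frontends, both hypothetical), each querier would run 4 workers per frontend:

```yaml
# Hypothetical sizing: 8 cores per querier / 2 frontend instances = 4
frontend_worker:
  parallelism: 4
```

Since each querier connects this many workers to every frontend, the total concurrency per querier is parallelism × frontends (4 × 2 = 8 here), which is why the formula divides by the frontend count.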

Thank you.
I guess you are using distributed Loki. We were using the Loki monolith, hence the parallelism configs did not help much. So we are trying out distributed Loki now.


Yes, I am running distributed Loki.
If you are running an all-in-one deployment, this memory behavior might be caused by the ingester.
I would suggest taking a look at the following configurations:
  • chunk_retain_period
  • chunk_target_size
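For reference, those two settings live under the ingester block. A sketch with placeholder values (the values below are illustrative assumptions, not recommendations; check the defaults for your Loki version):

```yaml
ingester:
  # How long flushed chunks are retained in memory before being released
  chunk_retain_period: 30s
  # Target compressed chunk size in bytes; larger chunks mean fewer,
  # bigger objects but more RAM held per stream before flushing
  chunk_target_size: 1572864
```

A long retain period or a large target size will both keep more chunk data resident in the ingester, which can look like a leak in an all-in-one deployment.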