Hi everyone,
I’m running into a recurring issue with Loki (v3.3.0) in a multi-tenant setup. We store over 600GB of logs across multiple tenants, with 6 months of retention.
When running the following query on one of the tenants:
```
count_over_time({env="prod-1"}[15m])
```
…the loki-read pod gets OOM-killed, even though we have an 8GB memory limit set on that container.
This happens fairly consistently, even with short time windows (e.g., 15 minutes), and particularly when using count_over_time.
Our setup:
- Deployed via Helm in simple scalable mode
- 3x loki-read, 3x loki-write, 3x loki-backend (chunk cache)
- Storage: TSDB index, chunks in Azure Blob Storage
- Index period: 24h
- Total log volume: ~600GB over 6 months
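For context, the relevant parts of our config look roughly like this (account and container names are placeholders, and I’ve trimmed everything unrelated):

```yaml
schema_config:
  configs:
    - from: "2024-06-01"
      store: tsdb
      object_store: azure
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  azure:
    account_name: <storage-account>   # placeholder
    container_name: loki-chunks       # placeholder
    use_managed_identity: true
```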
My questions:
- What memory-related settings should I tune so this type of query can complete successfully? (There’s a sketch of what I’ve been trying after this list.)
- Is there any way to limit the amount of data count_over_time pulls into memory?
- Would a narrower stream selector (more label matchers) help, or does count_over_time still pull too much into memory regardless? (Example of what I mean below.)
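On the first question, here’s the kind of tuning I’ve been experimenting with so far. The values are guesses rather than recommendations, so please correct me if any of these are counterproductive:

```yaml
limits_config:
  split_queries_by_interval: 30m  # smaller splits should mean less memory per subquery
  max_query_parallelism: 16       # cap fan-out across the read pods
  max_query_series: 5000          # fail fast instead of OOMing on high-cardinality results
  query_timeout: 2m

querier:
  max_concurrent: 4               # fewer concurrent subqueries per loki-read pod
```

And on the third question, by “narrower” I mean adding more matchers to the stream selector, e.g.:

```
count_over_time({env="prod-1", app="checkout"} |= "error" [15m])
```

where `app="checkout"` and the `|= "error"` line filter are just examples of how I’d try to cut down the matched streams.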
Any best practices for query optimization or memory tuning in long-retention, multi-tenant setups would be much appreciated.
Thanks in advance!