My goal is to display a histogram (or table) of log counts, with each bar (or row) representing 1 day.
I’ve tried the following query:
```logql
sum by(filename) (count_over_time({env="development"} [$__auto]))
```
Q1. In a local test instance with only one log line, if I change $__auto to 24h, I see a count of “1” for every 15 minutes over a 24h period. Is this because it does a rolling average?
Ideally I would have 1 row for each day (not every 15m), with a count of 1 for the day the log happened and 0 for the other days in my lookback window. For those familiar with Splunk, I’m trying to do bucketing.
Example output as table:

| Time | Filename=x |
| --- | --- |
| 2024-09-25 01:30:00 | 1 |
| 2024-09-25 01:45:00 | 1 |
| 2024-09-25 02:00:00 | 1 |
| 2024-09-25 02:15:00 | 1 |
| 2024-09-25 02:30:00 | 1 |
| … | |
Q2. If I change $__auto to 1m, I get no data, despite having a large lookback. Why is this? In my case I think auto is 10m. Is it a resolution issue, i.e. related to the number of points you can put on a graph when you have a long lookback?
I did see this article, which says it should be at least 4x the scrape interval, but I assume that’s for ‘rate’: New in Grafana 7.2: $__rate_interval for Prometheus rate queries that just work | Grafana Labs.
If that’s the issue, why not display an error?
A log range aggregation is a query followed by a duration. A function is applied to aggregate the query over the duration. The duration can be placed after the log stream selector or at the end of the log pipeline.
See Metric queries | Grafana Loki documentation.
This means that when you use [24h], your query aggregates metrics over the past 24 hours. The other aspect is: how often is this done? That is outside of the query itself and is usually calculated automatically. For example, if you are using Grafana you can adjust this under Query options. Say your dashboard time window is now-1h and your maximum number of data points is 60: your query would be executed 60 times (technically 61, if I remember correctly) with a 1-minute step between executions. Each execution aggregates the past 24 hours of metrics, stepping forward 1 minute at a time until the 1h window is covered.
So, what you want is:
- Use [24h] in the query (see the sketch below)
- Set the minimum interval under Query options to 24h
- And perhaps set the dashboard time window to something like 7 days
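Putting that together with the query from your first post, a sketch (assuming a table panel and the same stream selector):

```logql
# Each data point counts logs per filename over the preceding 24h.
# With the minimum interval under Query options set to 24h, Grafana
# steps the evaluation forward a full day at a time, so the windows
# do not overlap: one bucket per day.
sum by (filename) (count_over_time({env="development"} [24h]))
```

One caveat: as far as I know, count_over_time emits no sample for a window with no matching logs, so days with zero logs will likely show as gaps rather than explicit 0s unless the panel fills them in.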
Thanks! Why not display an error if the log range or minimum interval is invalid (more points than max points)? It seems to just say ‘no data’.
At a glance it looks like specifying the query option will work. How come the minimum interval doesn’t default to the log range interval [24h]? It seems like that would be the more common use case.
Not sure if I understand what you mean.
Because they are not the same thing.
A metric query can be performed once or multiple times, regardless of the range specified. You can perform a [24h] query once, or you can perform it 60 times, each time 1 minute apart. The result is you either get 1 set of metrics, or 60 sets of metrics, but the query itself is identical.
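To illustrate with the numbers from my earlier example (now-1h window, 60 max data points, hypothetical timestamps):

```logql
# The [24h] range fixes how far back each evaluation looks; the step
# fixes how often it is evaluated. With a 1m step:
#   evaluation at 10:00 counts logs from 10:00 yesterday to 10:00 today
#   evaluation at 10:01 counts logs from 10:01 yesterday to 10:01 today
#   ... and so on: 60 heavily overlapping 24h windows in total
sum by (filename) (count_over_time({env="development"} [24h]))
```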
For the first reply, I mean: if I have a 7d lookback, query options set the interval to 15m due to max data points. If I then force the query to run with [5m], it doesn’t give an error; it says ‘no data’. There should be data, it just can’t be resolved and displayed properly, from what I understand.
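To sketch my understanding with those numbers (15m step, [5m] range, hypothetical timestamps):

```logql
# The step is 15m but each evaluation only looks back 5m, so only
# the last 5 of every 15 minutes are ever covered:
#   evaluation at 02:00 covers 01:55-02:00
#   evaluation at 02:15 covers 02:10-02:15
# Logs between 02:00 and 02:10 fall in a gap and are never counted;
# if every log lands in such a gap, the panel shows 'no data'.
sum by (filename) (count_over_time({env="development"} [5m]))
```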
For your second comment, you explained the difference well in the initial reply. My point was just that I see the more common use case as not having overlapping intervals, so I’m surprised the default is a minimum interval of ‘no limit’ rather than one equal to the log range interval. I don’t see much of a use case for a 24h lookback at 1m intervals (other than maybe rolling averages).