My goal is to display a histogram (or table) of log counts, with each bar (or row) representing 1 day.
I’ve tried the following query:
```logql
sum by(filename) (count_over_time({env="development"} [$__auto]))
```
Q1. In a local test instance with only one log line, if I change $__auto to 24h, I see a count of “1” for every 15 minutes over a 24h period. Is this because it does a rolling average?
Ideally I would have 1 row for each day (not every 15m), with a count of 1 for the day the log happened and 0 for the other days in my lookback window. For those familiar with Splunk, I’m trying to do bucketing.
Example output as table:

| Time | Filename=x |
| --- | --- |
| 2024-09-25 01:30:00 | 1 |
| 2024-09-25 01:45:00 | 1 |
| 2024-09-25 02:00:00 | 1 |
| 2024-09-25 02:15:00 | 1 |
| 2024-09-25 02:30:00 | 1 |
| … | |
Q2. If I change $__auto to 1m, I get no data, despite having a large lookback. Why is this? In my case I think auto is 10m. Is it a resolution issue, i.e. related to the number of points you can put on a graph when you have a long lookback?
I did see this article, which says it should be at least 4x the scrape interval, but I assume that’s for ‘rate’: New in Grafana 7.2: $__rate_interval for Prometheus rate queries that just work | Grafana Labs.
If that’s the issue, why not display an error?
A log range aggregation is a query followed by a duration. A function is applied to aggregate the query over the duration. The duration can be placed after the log stream selector or at the end of the log pipeline.
See Metric queries | Grafana Loki documentation.
This means that when you use [24h], your query aggregates metrics over the past 24 hours. The other aspect is: how often is this done? That is outside of the query itself and is usually calculated automatically. For example, if you are using Grafana you can adjust this under Query options. Say your dashboard time window is now-1h and your maximum number of data points is 60: your query would be executed 60 times (technically 61, if I remember correctly) with a 1-minute step between executions. Each execution aggregates the past 24 hours of metrics, stepping forward 1 minute at a time until the 1h window is covered.
So, what you want is:
- Use [24h] in the query (see the sketch below)
- Set the minimum interval under Query options to 24h
- And perhaps set the dashboard time window to something like 7 days
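Putting that together with the query from your first post, a sketch (assuming a table panel and the same stream selector):

```logql
# Each data point counts logs per filename over the preceding 24h.
# With the minimum interval under Query options set to 24h, Grafana
# steps the evaluation forward a full day at a time, so the windows
# do not overlap: one bucket per day.
sum by (filename) (count_over_time({env="development"} [24h]))
```

One caveat: as far as I know, count_over_time emits no sample for a window with no matching logs, so days with zero logs will likely show as gaps rather than explicit 0s unless the panel fills them in.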
Thanks! Why not display an error if the log range or minimum interval is invalid (more points than max points)? It seems to just say ‘no data’.
At a glance it looks like specifying the query option will work. How come the minimum interval doesn’t default to the log range interval [24h]? It seems like that would be the more common use case.
Not sure if I understand what you mean.
Because they are not the same thing.
A metric query can be performed once or multiple times, regardless of the range specified. You can perform a [24h] query once, or you can perform it 60 times, each time 1 minute apart. The result is you either get 1 set of metrics, or 60 sets of metrics, but the query itself is identical.
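To illustrate with the numbers from my earlier example (now-1h window, 60 max data points, hypothetical timestamps):

```logql
# The [24h] range fixes how far back each evaluation looks; the step
# fixes how often it is evaluated. With a 1m step:
#   evaluation at 10:00 counts logs from 10:00 yesterday to 10:00 today
#   evaluation at 10:01 counts logs from 10:01 yesterday to 10:01 today
#   ... and so on: 60 heavily overlapping 24h windows in total
sum by (filename) (count_over_time({env="development"} [24h]))
```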
For the first reply, I mean: if I have a 7d lookback, query options set the interval to 15m due to max data points. If I then force the query to run with [5m], it doesn’t give an error; it says ‘no data’. There should be data, it just can’t be resolved and displayed properly, from what I understand.
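To sketch my understanding with those numbers (15m step, [5m] range, hypothetical timestamps):

```logql
# The step is 15m but each evaluation only looks back 5m, so only
# the last 5 of every 15 minutes are ever covered:
#   evaluation at 02:00 covers 01:55-02:00
#   evaluation at 02:15 covers 02:10-02:15
# Logs between 02:00 and 02:10 fall in a gap and are never counted;
# if every log lands in such a gap, the panel shows 'no data'.
sum by (filename) (count_over_time({env="development"} [5m]))
```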
For your second comment, you explained the difference well in the initial reply. My point was just that I see the more common use case as not having overlapping intervals, so I’m surprised the default is a minimum interval of ‘no limit’ rather than one equal to the log range interval. I don’t see much of a use case for a 24h lookback at 1m intervals (other than maybe rolling averages).