Help with Prometheus rollup queries

Hi all,

I could use some help with what seems like a simple use case against a Prometheus data source.
I have a memory usage gauge that tracks bytes used:
pod:meter_memory_usage:joined{...}[3m]
2284568576 @1598513884.967 2284568576 @1598513914.967 ...

I would like to display a bar graph of hourly, daily, or weekly rollups of those values. For an hourly interval, for example, the values displayed should be:
avg_over_time(pod:meter_memory_usage:joined{...}[1h]) between 9am and 10am
avg_over_time(pod:meter_memory_usage:joined{...}[1h]) between 10am and 11am

I have been experimenting with steps and intervals, but I can’t figure out exactly how those work against the Prometheus datasource. If I use
avg_over_time(pod:meter_memory_usage:joined{...}[$interval])
as a query, I get a smoothed-out graph with too many steps.
($interval is an Interval variable I defined on the dashboard)

If I additionally set “Min step” in the query editor to $interval, the graph kind of looks right, but I can’t convince myself that it is accurate, mainly because I am not sure I understand how interval, step, resolution, and related variables interact.

Any guidance would be greatly appreciated!
Thanks

Edit: I just confirmed that the query above doesn’t yield the right results. With interval = 1 hour I get hourly totals between 50G and 100G for a given day, but if I change the interval to 1 day I get a single 250G value.

The query_range Prometheus query made by Grafana looks correct:

sum by(...) (avg_over_time(pod:meter_memory_usage:joined{...}[1d]))&start=1598392800&end=1598392800&step=86400

=> 48.26Gi displayed in Grafana

… however it doesn’t match the result of this PromQL query:
sum by(...) (avg_over_time(pod:meter_memory_usage:joined{...}[1d] offset 122863s))/(1024*1024*1024)
=> 4.06Gi

with 122863s = now() - start

What am I missing??

I figured this out: Prometheus applies the [$interval] range backwards from each evaluation timestamp, starting with the start query parameter.
So vector[1d]&start=Jun 26th 00:00 returns the stats for Jun 25th as the first sample.
With [1h] it returns the stats from Jun 25th 23:00 to Jun 26th 00:00.
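
To convince myself of this, here is a minimal Python sketch of which time window each query_range sample covers. It is only my mental model of the behaviour described above, not Prometheus code; the function name evaluation_windows is made up, and it assumes every sample is evaluated at start, start + step, ... up to end, looking back by the range in brackets.

```python
def evaluation_windows(start, end, step, lookback):
    """Return the (window_start, window_end) pair behind each sample.

    All arguments are in seconds. Assumption: a range selector such as
    [1d] covers the `lookback` seconds that END at the evaluation
    timestamp, so the window reaches back before `start`.
    """
    windows = []
    t = start
    while t <= end:
        windows.append((t - lookback, t))
        t += step
    return windows


# Parameters taken from the query_range call quoted earlier in the post.
for lo, hi in evaluation_windows(1598392800, 1598392800, 86400, 86400):
    print(lo, "->", hi)
```

With those parameters the only window runs from 1598306400 to 1598392800, that is, the 24 hours before start rather than the 24 hours after it.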

So I’ll need to find a way to dynamically shift the range/offset in Grafana based on the interval, ouch.
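
For anyone calling query_range from a script rather than from Grafana, the shift itself is simple arithmetic. The sketch below is only an illustration under the same assumption (each sample covers the interval ending at its timestamp); aligned_query_params and bucket_start are hypothetical helper names, and it does not show how to do this inside Grafana.

```python
def aligned_query_params(first_bucket_start, last_bucket_end, interval):
    """Build start/end/step (in seconds) so that each sample covers
    exactly one bucket of length `interval`.

    Because the window ends at the evaluation timestamp, the first
    evaluation has to happen at the END of the first bucket.
    """
    if (last_bucket_end - first_bucket_start) % interval != 0:
        raise ValueError("time span must be a whole number of intervals")
    return {
        "start": first_bucket_start + interval,
        "end": last_bucket_end,
        "step": interval,
    }


def bucket_start(sample_timestamp, interval):
    """Map a returned sample timestamp back to the start of its bucket,
    e.g. to label the bar with the hour it describes."""
    return sample_timestamp - interval


# Three hourly buckets covering 0..10800 seconds.
params = aligned_query_params(0, 10800, 3600)
print(params)
print(bucket_start(params["start"], 3600))
```

The query would then use [1h] (or whatever the interval is) as the range, and each bar would be labelled with bucket_start rather than with the raw sample timestamp.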
