How to always show sparse data from Prometheus

Hi,

I have been driven crazy by this problem today. Here it is.

First of all, the Grafana version is 5.1.2.

My Prometheus data is scraped once every hour, so I think it counts as sparse data. E.g., a sample is collected at 9/12 13:50:15, and the next one at 9/12 14:50:15.

On Grafana with the Prometheus datasource, I can see the data if I select the time range “last 6 hours”, which makes the interval on the graph, according to my screen resolution, 30s. (To my understanding, 30s is exactly the query interval that the Grafana Prometheus datasource sends to the Prometheus server, right?)

The data looks like the following picture.

The data is discrete. Every hour, when it appears, there are 10 points spanning about 5 minutes. The 5-minute range starts at around 9/12 13:50:07 and ends at around 9/12 13:55:37 (the times are approximate). That means the data can be found between xx:50 and xx:56 in every hour.

It makes sense so far, and I have no questions. But if I change the time range to something like “last 7 days”, the data sometimes appears and sometimes doesn’t. The refresh interval is 1m.

If my local time is within xx:50-xx:56, the data is shown; otherwise it is not.

Actually, I understand why there is no data if my local time is not within xx:50-xx:56. Following is my understanding.

Changing to “last 7 days” makes the graph interval (i.e. the query interval) 15m. So if my local time is, say, 19:15 when Grafana starts refreshing, the start time of the query would be 19:15 seven days ago. Today is 9/12, so that would be 9/5 19:15. The subsequent query times would be 9/5 19:30, 9/5 19:45, 9/5 20:00, …, 9/12 13:15, 9/12 13:30, 9/12 13:45, 9/12 14:00, … You can see that no query time falls between xx:50 and xx:56, so no data can be retrieved by the Grafana Prometheus datasource. It misses every single data point.
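To make the reasoning above concrete, here is a minimal Python sketch that walks a 15m grid over the last 7 days and checks whether any query time lands in the xx:50 to xx:56 window. It only models the description in this post, not Grafana's actual implementation; the function names and the year are placeholders chosen for illustration.

```python
from datetime import datetime, timedelta


def evaluation_times(start, end, step):
    """Yield start, start + step, ... up to and including end."""
    ts = start
    while ts <= end:
        yield ts
        ts += step


def in_sample_window(ts):
    """True if ts falls in the xx:50 to xx:56 window where the data is visible."""
    return 50 <= ts.minute < 56


# The year is an arbitrary placeholder; the post only gives month and day.
end = datetime(2018, 9, 12, 19, 15)
start = end - timedelta(days=7)
step = timedelta(minutes=15)

times = list(evaluation_times(start, end, step))
minutes_visited = sorted({ts.minute for ts in times})
hits = [ts for ts in times if in_sample_window(ts)]

print(minutes_visited)  # minute-of-hour values the grid lands on
print(len(hits))        # number of query times inside the window
```

With these inputs the grid only ever lands on minutes 0, 15, 30 and 45, so the list of hits is empty.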

I can prove my understanding. Here it is.

If I set the time range to “9/5 00:00 to now” and set the time shift of the graph panel to 7m, I am also able to see the sparse data. This time range makes the graph (query) interval 20m, and the 7m time shift makes the query start at 9/4 23:53 (which is fixed, not related to my local time). So the queries occur at 9/4 23:53, 9/5 00:13, …, 9/12 13:13, 9/12 13:33, 9/12 13:53, …, 9/12 14:53, …, 9/12 15:53. As you can see, there is a query at xx:53 in every hour, which is between xx:50 and xx:56. That’s why I am able to see the sparse data with these settings.
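The same idea can be turned around to ask which time shifts would work for a given interval. The sketch below is only an illustration of the arithmetic in this post: `minutes_hit` and `shifts_that_hit` are hypothetical helper names, the window is taken as minutes 50 to 55, and the result is only meaningful for intervals that divide an hour evenly (such as 15m or 20m), because only then does the pattern repeat every hour.

```python
def minutes_hit(start_minute, step_minutes):
    """Minute-of-hour values visited by a grid that starts at start_minute."""
    return {(start_minute + k * step_minutes) % 60 for k in range(60)}


def shifts_that_hit(step_minutes, window=range(50, 56), range_start_minute=0):
    """Time shifts (in minutes) for which the grid lands inside the window.

    The shift moves the query start backwards, so a range starting at xx:00
    with a 7m shift starts at xx:53 of the previous hour.
    """
    good = []
    for shift in range(step_minutes):
        start = (range_start_minute - shift) % 60
        if minutes_hit(start, step_minutes) & set(window):
            good.append(shift)
    return good


print(sorted(minutes_hit(53, 20)))  # grid for the 7m shift with a 20m interval
print(shifts_that_hit(20))          # shifts that reach the window at 20m
```

For a 20m interval and a range starting on the hour, this gives shifts of 5 to 10 minutes, which is consistent with the 7m shift working here.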

So now, if I understand it right, my question is: how can I get the data shown whatever time range I select? Or, put another way, how can I make Grafana query data at certain times, such as between xx:50 and xx:56, no matter what the query interval and query start time are?

Thank you very very much in advance!

Have you configured scrape interval to 1 hour?

Hi, thank you for replying.

If you are asking about Prometheus, yes, I set the scrape interval to 1 hour in the Prometheus config.

I'm talking about the configuration of the datasource in Grafana; see the documentation. You should set Scrape interval there to 1h as well.

Hi @mefraimsson

Yes, I have also set the scrape interval to 1h in the Prometheus datasource configuration.

To my understanding, the scrape interval in the Prometheus datasource configuration ultimately determines the “Min step” in the “Metrics” tab of the graph panel configuration.

Whether or not “Scrape interval” and “Min step” are the same thing, I tried both of them anyway before you asked.

Here is a screen recording. I hope it shows you what happens when I change the “Scrape interval” or the “Min step”.


Some explanations of the gif.

  1. First, the time range is “last 6 hours”, and the “Scrape interval” of the Prometheus datasource I was using is 15s. Together these settings make the “Min step” 30s automatically. (The actual interval on the graph panel is also 30s. I can see that if I move the mouse cursor over the data points, and I think it is the actual request interval.) In this case, the sparse data, which occurs from xx:50 to xx:55 in every hour, always appears in the panel, because the interval is small enough.
  2. I manually changed the “Min step” to “1h” (the time range is unchanged), and the data disappeared.
  3. I removed the “1h” and let Grafana determine the “Min step” itself again. The “Min step” went back to “30s”, and the data appeared again.
  4. Then I selected another Prometheus datasource, which is almost the same as the previous one except for the “Scrape interval”, which is “1h” for the new datasource. (You can tell from its name, which is “Prometheus-1h”.) After selecting “Prometheus-1h”, the “Min step” changed to “1h” automatically (as expected), and the data disappeared (as expected).
  5. I changed back to the previous Prometheus datasource, so the “Min step” went back to “30s”, and the data appeared again.

At the end of the gif, I showed you my local time, which was xx:43. So when the “Min step” or “Scrape interval” was “1h”, and the time range was “last 6 hours”, the request would always occur around xx:43 in every hour, none of which falls in xx:50-xx:55, the time range when the sparse data occurs, so the data disappeared.

I can see that you’ve filtered on the series named “Total” in the legend - only showing that series - does it make any difference showing all series?

Would it make any difference to display the data as points instead of lines (Display tab -> enable Points)?

“Total” is just my data metric name; it is not a filter or an aggregation. The same goes for “Revenue” and “Online”. They are three kinds of data metrics collected by Prometheus every hour. I hid the other two just to simplify the problem description.

Here are the complete Prometheus metric expressions.

fcldmgr_device_stats{instance="...",job="stats",type="Total"}
fcldmgr_device_stats{instance="...",job="stats",type="Revenue"}
fcldmgr_device_stats{instance="...",job="stats",type="Online"}

And what you saw in the previous gif was already displayed as points, not lines.

Here is a new screenshot, in which all three metrics are shown, and you can see I already set Grafana to display “points”.

Hi!

You got pretty close. The crucial implementation detail here is that, for each evaluation timestamp ts, Prometheus only looks back 5 minutes (from ts-5m to ts) for available data. So when your query range gets divided into intervals by step, those interval borders (ts) can be too far away from the stored datapoints, and nothing is returned for them.

In the example where you entered 1h as the min step, you forced a step of 1h, which can only show data if the interval borders happen to fall within the 5 minutes after a scrape. So not setting a high min step fixed this problem for you on short time ranges.
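As a rough illustration of that 5-minute lookback, here is a small Python sketch. It is a simplified model and not Prometheus code: `instant_value` and `range_query` are made-up names, the year is an arbitrary placeholder, and the sample timestamps follow the hourly xx:50:15 scrapes described earlier in the thread.

```python
from datetime import datetime, timedelta

LOOKBACK = timedelta(minutes=5)  # models the 5-minute lookback described above


def instant_value(samples, ts, lookback=LOOKBACK):
    """Return the newest sample value with ts - lookback <= sample_ts <= ts."""
    candidates = [(t, v) for t, v in samples if ts - lookback <= t <= ts]
    if not candidates:
        return None
    return max(candidates, key=lambda tv: tv[0])[1]


def range_query(samples, start, end, step):
    """Evaluate instant_value at start, start + step, ...; keep non-empty results."""
    points = []
    ts = start
    while ts <= end:
        value = instant_value(samples, ts)
        if value is not None:
            points.append((ts, value))
        ts += step
    return points


# Hourly scrapes at xx:50:15; year and values are placeholders.
samples = [(datetime(2018, 9, 12, hour, 50, 15), 100 + hour) for hour in range(8, 14)]

fine = range_query(samples, datetime(2018, 9, 12, 13, 0, 7),
                   datetime(2018, 9, 12, 14, 0, 7), timedelta(seconds=30))
coarse = range_query(samples, datetime(2018, 9, 12, 13, 15),
                     datetime(2018, 9, 12, 14, 15), timedelta(minutes=15))

print(len(fine))    # points returned with a 30s step
print(len(coarse))  # points returned with a 15m step
```

In this model the 30s step returns 10 points for the 13:50:15 sample, matching the 10 points per hour seen in the graph, while the 15m step returns none.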

However, when you look at a long time range, Grafana calculates a high step for you to reduce the number of datapoints that Prometheus returns. This is expected behavior. Depending on your window width, you will get to a point where this breaks down with sparse data.

To make your use case work, Grafana would need to support a max step, so that if a high step is calculated and it is higher than the max step, the max step will be used. (In your case it should be less than 5 minutes.)
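The max step idea could look something like the sketch below. This is purely hypothetical: the `max_step_s` parameter and the `effective_step` function illustrate the suggestion and do not correspond to an existing Grafana setting or API.

```python
def effective_step(auto_step_s, min_step_s=None, max_step_s=None):
    """Clamp the automatically calculated step (in seconds).

    min_step_s models the existing "Min step" option; max_step_s is the
    hypothetical "max step" proposed above. If both are given and they
    conflict, the max step wins in this sketch.
    """
    step = auto_step_s
    if min_step_s is not None:
        step = max(step, min_step_s)
    if max_step_s is not None:
        step = min(step, max_step_s)
    return step


# "Last 7 days" gave an automatic step of 15m (900s) in this thread.
print(effective_step(900, max_step_s=240))  # capped below 5 minutes
print(effective_step(30, max_step_s=240))   # short ranges are unaffected
```

The trade-off is that a 7-day range with a 240s step asks Prometheus for far more datapoints than the automatic step would, which is exactly what the automatic step is meant to avoid.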

I suggest you open an issue in the Grafana repo to describe this exact case; perhaps others need this too. Until then, your only option is to use a wide enough display to allow for a low automatic step.

Hope this helps.
