Should the query time range and the alert evaluation interval match, or otherwise be related, for Grafana alerts?

I am trying to create my first alert in Grafana with an AWS CloudWatch data source. There are two time-related settings that need to be configured.

The query section has the “Time Range” setting.

The “Set evaluation behaviour” section also has an evaluation interval to configure.

Are there any rules we need to follow when setting these values to ensure that there will be no duplicate alerts? For instance, should the “Time range” be equal to or less than the evaluation interval? Please advise.

Hi @jamessmithtech15

That’s a good question!

To ensure there are no duplicate alerts, Grafana uses a process called de-duplication based on a unique fingerprint. This fingerprint is essentially a hash of all the alert’s labels. When an alert notification is sent, it gets logged in the notification log.

During the next evaluation, Alertmanager checks this log. If it finds an entry for the same alert, it skips sending a new notification unless the alert’s labels have changed. This means the query itself doesn’t influence the de-duplication process: alerts are only sent again if their labels change or after the specified repeat_interval has passed.
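To make the de-duplication idea above concrete, here is a minimal Python sketch. It is only an illustration, not Grafana's or Alertmanager's actual code: the `fingerprint` function, the `NotificationLog` class, the SHA-256 hash, and the example labels are all assumptions made for this example.

```python
import hashlib
from typing import Dict


def fingerprint(labels: Dict[str, str]) -> str:
    """Hash the sorted label set (illustrative stand-in, not Grafana's real hash)."""
    canonical = "\x00".join(f"{k}\x01{v}" for k, v in sorted(labels.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class NotificationLog:
    """Hypothetical notification log: remembers when each fingerprint was last notified."""

    def __init__(self, repeat_interval_s: float) -> None:
        self.repeat_interval_s = repeat_interval_s
        self._last_sent: Dict[str, float] = {}

    def should_notify(self, labels: Dict[str, str], now_s: float) -> bool:
        fp = fingerprint(labels)
        last = self._last_sent.get(fp)
        # Same labels and repeat_interval not yet elapsed: skip the notification.
        if last is not None and now_s - last < self.repeat_interval_s:
            return False
        # New fingerprint (labels changed) or repeat_interval elapsed: notify and record it.
        self._last_sent[fp] = now_s
        return True


log = NotificationLog(repeat_interval_s=4 * 3600)
labels = {"alertname": "HighCPU", "instance": "i-123"}  # made-up labels
print(log.should_notify(labels, now_s=0))    # first notification is sent
print(log.should_notify(labels, now_s=300))  # same labels 5 minutes later: skipped
```

Note that the query time range never appears in this logic; only the labels and the repeat interval do.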

In theory yes, but it depends.
I would keep your config in the CloudWatch case, because CloudWatch can be slow: it may need a few minutes to publish a metric. If that delay is 5 minutes or more, you will get a NoData error when both settings are 5 minutes. So an overlap makes sense in this particular case.
Your alert logic may also need a longer query period. So it depends on the use case.
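A rough way to reason about this is to list which data points are already published inside the query window at evaluation time. The sketch below is a simplified model, not how Grafana or CloudWatch work internally: `visible_points` is a hypothetical helper, and it assumes one data point per metric period that becomes queryable exactly `publish_delay_s` seconds after its timestamp.

```python
from typing import List


def visible_points(eval_time_s: int, time_range_s: int,
                   publish_delay_s: int, metric_period_s: int) -> List[int]:
    """Timestamps inside [eval_time - time_range, eval_time] that are already published."""
    start = eval_time_s - time_range_s
    latest_published = eval_time_s - publish_delay_s
    # Round the window start up to the next multiple of the metric period.
    first = -(-start // metric_period_s) * metric_period_s
    return list(range(first, latest_published + 1, metric_period_s))


# 5-minute time range, 6-minute publish delay: nothing is visible yet (NoData).
print(visible_points(3600, 300, 360, 60))  # []
# 10-minute time range with the same delay: older points are still in the window.
print(visible_points(3600, 600, 360, 60))  # [3000, 3060, 3120, 3180, 3240]
```

Under these assumptions, a time range longer than the publishing delay is what keeps the evaluation from seeing an empty window.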

This is indeed a good topic! But I would like to take this further as I am confused :see_no_evil:

Let’s say we are generating a metric every 2 minutes as below (ignore that it goes from 34 straight to 40; I will get this investigated):


In the alert we are using a 10-minute time range and evaluating every 5 minutes. However, I am confused about the interval setting and the query that we are using. My questions are:

  • What should the interval be, knowing that the metric is generated every 2 minutes? My guess is that every second graph point is just a duplicate of the previous metric value, because a new one hasn’t been generated yet, so the interval should match how often the metric is generated (the same values repeating from 34 to 39 seem to support my guess).

  • Does the interval only control how often the source is queried for a new metric value, or does it actually perform any evaluation?

  • Is the query correct, and does it work at all? We are trying to compare the last 6 minutes of the baseline with the last 6 minutes of the upper band (which is 2x the baseline). If the current baseline is higher, the alert goes into the pending state, and after X minutes of pending period it fires. How does the 10-minute time range work here, then? What if we select a time range of 1h, for example?
    P.S. For this example I’ve added “0.001 *” so that the range query type would show some values in the graph. Normally we use the instant type.
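To picture the guess in the first bullet, here is a small Python sketch. It assumes that each graph point simply carries forward the latest known sample, which may not be how the actual data source fills steps; `sample_with_carry_forward` and the sample values are made up for illustration.

```python
from typing import Dict, List, Tuple


def sample_with_carry_forward(samples: Dict[int, float], start_s: int,
                              end_s: int, step_s: int) -> List[Tuple[int, float]]:
    """For each step, return the latest sample at or before that time (if any)."""
    times = sorted(samples)
    points = []
    for t in range(start_s, end_s + 1, step_s):
        known = [ts for ts in times if ts <= t]
        if known:
            points.append((t, samples[known[-1]]))
    return points


# Hypothetical metric written every 2 minutes (120 s).
samples = {0: 34, 120: 35, 240: 36}

# 1-minute interval: every second point repeats the previous value.
print(sample_with_carry_forward(samples, 0, 240, 60))
# 2-minute interval: one point per generated value.
print(sample_with_carry_forward(samples, 0, 240, 120))
```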

I hope the above is clear. If not, I will do my best to explain it again in more detail, as understanding this is very important to me :slight_smile: