Grafana not sending PagerDuty alerts at the desired frequency on detection

Hi - my observation is that Grafana is not sending a PagerDuty notification for every alert that is raised.

I have a rule that fires on every test success and triggers PagerDuty. The test succeeds every 10 minutes. The setup and observed behaviour are as follows:

Alert

folder: abc
Evaluation Group: abc-success
Rule group evaluation interval: 1m
Evaluate for: 5m

policies:
  - orgId: 1
    receiver: slack
    group_by:
      - grafana_folder
      - alertname
    group_wait: 30s
    group_interval: 3h
    repeat_interval: 8h
    routes:
      - receiver: slack
        object_matchers:
          - ['notificationRoute', '=', 'slack']
      - receiver: pd
        object_matchers:
          - ['notificationRoute', '=', 'pagerduty']

There is only one rule in the Grafana folder, and its state flip-flops between Alerting and Normal every so often within an hour.
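For reference, a simplified sketch (not Grafana's actual code) of how the rule settings above interact: the rule is evaluated every 1m, and the condition has to keep breaching for the whole 5m "Evaluate for" pending period before the instance moves from Pending to Alerting; as soon as an evaluation stops breaching, it drops back to Normal. The 7-minutes-on / 3-minutes-off signal below is purely hypothetical, just to show how an intermittent condition produces this kind of flip-flopping.

```python
FOR_MIN = 5  # "Evaluate for" / pending period; each tick below is one 1m evaluation

def simulate(condition_by_minute):
    """condition_by_minute: one boolean per 1-minute rule evaluation."""
    state, pending_since, history = "Normal", None, []
    for minute, breaching in enumerate(condition_by_minute):
        if breaching:
            if state == "Normal":
                state, pending_since = "Pending", minute
            elif state == "Pending" and minute - pending_since >= FOR_MIN:
                state = "Alerting"
        else:
            state, pending_since = "Normal", None
        history.append((minute, state))
    return history

# Hypothetical signal: condition true for ~7 minutes around each 10-minute test
# run, then clear again -> the instance keeps cycling
# Normal -> Pending -> Alerting -> Normal within the hour.
signal = [(m % 10) < 7 for m in range(60)]
for minute, state in simulate(signal):
    print(f"t={minute:02d}m  {state}")
```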

In this case my expectation was that the first alert would be sent out after 30s, that a re-alert would be sent at around 8h 15min if the condition persisted (although there is the flip-flopping), and then a re-alert every 3h 15min (due to the flip-flopping).

In my case, PD alerts were being sent out anywhere from 7 to 9.5 hours apart. What could be causing this issue?

I’m afraid I don’t quite follow the expectations as written, but this is how it should work:

  1. The Group Wait is 30 seconds, which means the first notification will be delivered after 30 seconds.
  2. The Group Interval is 3 hours, so if the alert is resolved within 3 hours, a resolved notification will be sent at that point. However, if the alert is still firing after 3 hours, no notification will happen until the Repeat Interval, which is 5 hours later.
  3. If the alert is still firing in 9 hours' time, then a repeat notification will be sent. While the Repeat Interval is 8 hours, it is not a multiple of the Group Interval, so the repeat happens at the next Group Interval tick instead (3 hours x 3 intervals = 9 hours); see the sketch below.
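To make point 3 concrete, here is a rough model of the flush timing. The assumptions are mine, this is not Alertmanager's actual implementation, and it ignores resolved notifications and the flip-flopping: the group is flushed after group_wait, then only on group_interval ticks, and a still-firing alert is only re-sent at the first tick where at least repeat_interval has elapsed since the last notification.

```python
GROUP_WAIT_S      = 30           # group_wait: 30s
GROUP_INTERVAL_S  = 3 * 3600     # group_interval: 3h
REPEAT_INTERVAL_S = 8 * 3600     # repeat_interval: 8h

def notification_times(horizon_s):
    """Times (in seconds) at which a continuously-firing alert is notified."""
    tick = GROUP_WAIT_S                 # first flush happens after group_wait
    times, last_sent = [tick], tick
    while tick <= horizon_s:
        tick += GROUP_INTERVAL_S        # later flushes only on group_interval ticks
        if tick - last_sent >= REPEAT_INTERVAL_S:
            times.append(tick)
            last_sent = tick
    return times

for t in notification_times(12 * 3600):
    print(f"notification at ~{t / 3600:.2f} h")
# -> ~0.01 h (the 30s group_wait) and ~9.01 h: the 8h repeat_interval is not a
#    multiple of the 3h group_interval, so the repeat lands on the 9h tick.
```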