I have created alarms on 8.X versions that work properly, but since we updated Grafana to version 9.4.2 last week, the alarms I create on the new version do not work at all. I’ve confirmed that the contact points (e-mail and Discord) work, because the test notifications arrive. The alarm also fires properly, but no notification is sent to e-mail or Discord.
I spent some hours looking for PRs about this and for similar problems in Grafana’s changelog and forum, but found nothing. I therefore suppose I’m doing something wrong, and I would appreciate some help, please.
The alarm I’m trying to create is a PostgreSQL query like this one (query A):
select count(*) from tablename t where t."createdAt" between now() - interval '30 minutes' and now()
I have a query B reducing the last input from A in strict mode. I also have a query C evaluating if the input from B is above 0.
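To make the A → B → C chain concrete, here is a minimal Python sketch of what the two expressions compute. It is a simplified model, not Grafana's implementation: the function names and the sample value are hypothetical, and treating a missing or non-numeric last point as NaN is an assumption about how strict mode behaves.

```python
import math
from typing import Optional, Sequence


def reduce_last(values: Sequence[Optional[float]]) -> float:
    """Model of expression B: take the last point of query A's series.

    Assumption: nothing is filtered or replaced, so an empty series or a
    missing/non-numeric last point comes out as NaN.
    """
    if not values:
        return math.nan
    last = values[-1]
    if last is None or not math.isfinite(last):
        return math.nan
    return float(last)


def is_above(value: float, threshold: float = 0.0) -> bool:
    """Model of expression C: the condition holds when B is above the threshold."""
    # Any comparison with NaN is False, so in this model NaN never satisfies it.
    return value > threshold


rows_last_30_min = [3.0]  # hypothetical result of query A (a single count)
b = reduce_last(rows_last_30_min)
print(b, is_above(b))
```

In this model, any count greater than 0 in the last 30 minutes makes the condition true.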
Folder selected for the alarm is “Test”. Evaluation group is a new one created only for this alarm, which is being evaluated every 2m for 4m. I also added some description and summary test text.
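For reference, "every 2m for 4m" means the condition has to stay true for 4 minutes of evaluations before the alarm fires. A rough sketch of that timing, assuming the rule goes to Pending on the first breach and to Firing once the breach has lasted at least the "for" duration (the function and state names are illustrative, not Grafana's code):

```python
from datetime import timedelta
from typing import List, Optional


def simulate_states(
    results: List[bool], interval: timedelta, pending_for: timedelta
) -> List[str]:
    """Return the modelled rule state after each evaluation in `results`."""
    states: List[str] = []
    breach_started: Optional[timedelta] = None  # elapsed time of the first breach
    for i, breached in enumerate(results):
        now = interval * i  # elapsed time since the first evaluation
        if not breached:
            breach_started = None
            states.append("Normal")
            continue
        if breach_started is None:
            breach_started = now
        # Assumption: the rule fires once the breach has lasted >= pending_for.
        if now - breach_started >= pending_for:
            states.append("Firing")
        else:
            states.append("Pending")
    return states


print(simulate_states([True, True, True], timedelta(minutes=2), timedelta(minutes=4)))
```

Under these assumptions the alarm is Pending for the first two evaluations and Firing on the third, so a notification cannot be expected before about 4 minutes after the first breach.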
My match label is test=active.
My configuration for the notification policy test=active has the option “Continue matching subsequent sibling nodes” enabled and the other ones disabled. No mute timing is selected.
What am I doing wrong?
Thanks a lot,
Have you configured a notification policy for your alerts? See: Manage notification policies | Grafana documentation
Your default policy (I assume you don’t have other policies) needs a valid contact point. Check that the contact point you configured is properly connected to the notification policy.
Notification policies are responsible for routing firing alerts to the appropriate contact points.
Hi, @konradlalik, thanks for your answer!
I have configured the notification policies for my alerts. I also have other policies created before updating to version 9.4.2.
I’ve created a mock for this discussion:
Do you notice anything in the logs? Do those show you any errors? Can you check in the contact points page if there are any errors reported there for the Discord contact point you chose?
Perhaps also check the timing options of the default policy (or the parent policy when using nested policies) as those may also affect when the alert is delivered.
Hi @gillesdemey1, thanks for the help!
I did not find any errors on the contact points page. Also, the result is the same if I choose my e-mail as a contact point: the test notification works, but not the notification for the alarm. I took some screenshots of the contact points page:
About the logs, I found these two:
Are these of any help?
Thanks in advance,
I dug a little deeper into the logs and found that some requests return 404.
logger=context userId=53 orgId=1 uname="Raphael de Taranto" t=2023-03-20T17:22:19.510770844Z level=info msg="Request Completed" method=GET path=/api/v1/ngalert/admin_config status=404 remote_addr=188.8.131.52 time_ms=173 duration=173.807527ms size=59 referer=https://grafana.bgcbrasil.com.br/alerting/admin handler=/api/v1/ngalert/admin_config
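In case it helps anyone searching their own logs the same way, here is a small Python sketch for filtering lines like the one above by status. It assumes the log lines are space-separated key=value pairs with double-quoted values (as in this example); `parse_logfmt` and `is_404` are hypothetical helper names, and the sample is a shortened copy of the line above.

```python
import shlex
from typing import Dict


def parse_logfmt(line: str) -> Dict[str, str]:
    """Split a key=value log line into a dict, honouring double-quoted values."""
    fields: Dict[str, str] = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            fields[key] = value
    return fields


def is_404(line: str) -> bool:
    """True when the line's status field is 404."""
    return parse_logfmt(line).get("status") == "404"


sample = (
    'logger=context userId=53 orgId=1 t=2023-03-20T17:22:19.510770844Z '
    'level=info msg="Request Completed" method=GET '
    'path=/api/v1/ngalert/admin_config status=404'
)
fields = parse_logfmt(sample)
print(fields["path"], fields["status"], is_404(sample))
```

Note that `shlex.split` raises `ValueError` on unbalanced quotes, so lines containing a stray apostrophe would need extra handling.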
I noticed a notification policy configured as “A != B” that was redirecting all the new alarms I’ve created since migrating to 9.4.2 to its contact point (e-mail). In the e-mail sent by this contact point, I noticed two messages: the first one was from the original alarm (with the matcher “A != B”) and the second one was from the new alarm, which shouldn’t have been there.
I simply deleted the “A != B” notification policy and its alarms, and now the problem is solved. I’m now recreating the “A != B” alarms and everything seems to work.
I know this was not the best way to deal with it, but I had a deadline to find a solution and I didn’t want to roll back to an 8.X version.
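For anyone who hits the same thing, here is a minimal Python sketch of why a negative matcher can swallow unrelated alarms. It is a simplified model of sibling policy matching, not Grafana's code. It assumes that a label missing from an alarm is compared as an empty string (so “A != B” matches every alarm that has no label A), that the “A != B” policy was listed before test=active, and that it did not have “continue matching” enabled. The `Policy` class and the contact point names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Matcher = Tuple[str, str, str]  # (label, operator, value); operator is "=" or "!="


@dataclass
class Policy:
    matchers: List[Matcher]
    contact_point: str
    continue_matching: bool = False


def matches(policy: Policy, labels: Dict[str, str]) -> bool:
    """True when every matcher of the policy accepts the alarm's labels."""
    for name, op, value in policy.matchers:
        actual = labels.get(name, "")  # assumption: missing label == empty string
        if op == "=" and actual != value:
            return False
        if op == "!=" and actual == value:
            return False
    return True


def route(labels: Dict[str, str], policies: List[Policy], default: str) -> List[str]:
    """Walk sibling policies in order; stop at the first match without 'continue'."""
    receivers: List[str] = []
    for policy in policies:
        if not matches(policy, labels):
            continue
        receivers.append(policy.contact_point)
        if not policy.continue_matching:
            break
    return receivers or [default]


policies = [
    Policy([("A", "!=", "B")], "email"),  # assumed to come first
    Policy([("test", "=", "active")], "discord", continue_matching=True),
]
new_alarm = {"test": "active"}  # has no label A at all
print(route(new_alarm, policies, "default"))      # with the "A != B" policy
print(route(new_alarm, policies[1:], "default"))  # after deleting it
```

In this model the new alarm only reaches the e-mail contact point while the “A != B” policy exists, and reaches the test=active contact point once that policy is removed. A less drastic fix under the same assumptions would be to reorder the policies or to make the negative matcher more specific.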
Thanks, Konrad and Gilles, for trying to help.