My pain
In Grafana, I have one alert configured that can go into firing, No Data, and Error states. I need to have:
firing sent to a common Telegram channel (important for decision making);
No Data and Error sent to another channel, visible only to me (needed for technical analysis).
Right now all states go to the same channel, since a single label and contact point are used.
My attempt
I tried a variant that uses templates to distribute the alerts. I created two templates: one that lets through only the firing state, and one that lets through only No Data and Error. This partially worked, but there was a problem: because No Data and Error are handled by one template and firing by the other, I get the following error: Telegram webhook response status 400 Bad Request (presumably because the rendered message ends up empty when every alert in the notification is filtered out by the template).
Here is an example template:
{{ define "tgbody_crit" -}}{{ range . }}
{{ if eq .Labels.alertname "DatasourceNoData" }}
{{ else if eq .Labels.alertname "DatasourceError" }}
{{ else }}<b>{{ .Labels.alertname }}</b>
{{ with .ValueString }}{{ reReplaceAll "[[][^]]*metric='{?([^}']*)}?'[^]]*labels={([^}]*)}[^]]*value=(-?[0-9]*[.]?[0-9]+([eE][+-]?[0-9]+)?)[^]]*](, )?" "$2\n$1: <b>$3</b>\n\n" . }}{{ end }}
{{ range .Annotations.SortedPairs }}{{ .Name }}: {{ .Value }}{{ end }}
{{ with .GeneratorURL }}✏️<a href="{{ . }}">Edit</a> × {{ end }}{{ with .PanelURL }}📉<a href="{{ . }}">View</a> × {{ end }}{{ with .SilenceURL }}🔕<a href="{{ . }}">Mute</a>{{ end }}
{{ end }}
{{ end }}{{ end }}
Please help me configure routing by alert state so that the states are distributed to different channels.
DatasourceError and DatasourceNoData alerts generate special alertnames, which you can use in your notification policy (order is important, and of course "Continue matching subsequent sibling nodes" must also be disabled on those routes). In my case I'm dropping all of these noisy alerts, but I use the alert state history to monitor these problems, so I do care about the health of my alerts.
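In file-provisioning terms, the routing could look roughly like this (a minimal sketch; the contact point names team-telegram and admin-telegram are placeholders for whatever you actually have configured). The same thing can be built in the UI under Alerting → Notification policies by adding child policies with alertname matchers and "Continue matching subsequent sibling nodes" unchecked.

apiVersion: 1
policies:
  - orgId: 1
    receiver: team-telegram            # default route: firing alerts go to the common channel
    group_by: ['grafana_folder', 'alertname']
    routes:
      # The special alertnames are matched first; continue: false keeps them
      # from also reaching the default (team) receiver.
      - receiver: admin-telegram
        object_matchers:
          - ['alertname', '=', 'DatasourceNoData']
        continue: false
      - receiver: admin-telegram
        object_matchers:
          - ['alertname', '=', 'DatasourceError']
        continue: false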
Maybe it just means that it isn't doing anything. So, in contrast to "Keep Last State", it won't stay firing if it had already started firing when a problem with the data source occurred. I'm not sure.
I have another question though. It seems to me that your solution doesn't actually solve the spam issue if you still want to receive alerts about datasource issues. Those matchers wouldn't limit the number of alerts; it's not as if you get just one alert per datasource issue, right?
And in your case you're going to see the spammy information somewhere else anyway (in the alert history). It's not necessarily a bad solution, but I would really want to know about datasource issues in a channel that I normally follow, and I wouldn't want to check additional things on a regular basis. And, of course, I don't want the channel to be spammed.
My case: I have a managed instance where I, as the admin, define the data sources and the users define their own alerts.
What's the point of sending DatasourceError and DatasourceNoData to the users? It makes no sense, because they have no idea about the infrastructure behind them. So I hijack all these DatasourceError and DatasourceNoData alerts away from the users; they aren't annoyed by the noise and, from their perspective, they receive only genuine alerts.
But I, as the Grafana admin, do want to know about DatasourceError and DatasourceNoData. I just don't want to receive millions of notifications about them, so I send all these alert notifications to "/dev/null" and completely ignore them. (Feel free to forward them to your own channel if you want. It depends on the number of alerts, but I bet you will develop notification blindness at some point.)
I have my own alerts on the alerting metrics. It's normal to have occasional error/no-data issues (because of network glitches, …), so I only get alerted when it's serious (e.g. when it lasts for 10+ minutes).
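As an illustration, such a rule could look something like this, assuming Grafana's own /metrics endpoint is scraped by Prometheus and exposes an evaluation-failure counter (the exact metric name varies between versions, so check what your instance actually exports):

groups:
  - name: alerting-health
    rules:
      - alert: AlertRuleEvaluationFailing
        # grafana_alerting_rule_evaluation_failures_total is an assumed metric name;
        # verify it against your Grafana /metrics output before relying on it.
        expr: increase(grafana_alerting_rule_evaluation_failures_total[5m]) > 0
        for: 10m    # tolerate short glitches; alert only when failures persist for 10+ minutes
        labels:
          severity: warning
        annotations:
          summary: Grafana alert rule evaluations have been failing for more than 10 minutes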