Working configuration example for alerts templating - Telegram and Slack

Hi Grafana community!

I spent a lot of time configuring my alert templates, and lost much of it to the changes introduced since Grafana 8, so I wanted to share my templates here in the hope that they help other members.

Important: this configuration is running on Grafana v9.3.2 and may not work with an earlier version.

Alert rules

The following alert rules apply to a Prometheus datasource where I have added a custom label “node” to each target.
If you are using a different datasource, you have to adapt these.

Alerts are set up with the latest Grafana defaults:

  • A = query
  • B = Reduce last of A
  • C = Threshold of B - set as Alert condition
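
To make the A/B/C chain concrete, here is a minimal sketch in plain Python (not Grafana code). The sample values, the threshold of 80 and the "is above" comparison are assumptions for illustration only; your rule may use a different reducer or condition.

```python
def reduce_last(series):
    """B: reduce the result of query A to its last value."""
    return series[-1]


def is_above(value, threshold):
    """C: threshold check on B, used as the alert condition."""
    return value > threshold


# Hypothetical CPU usage samples returned by query A.
cpu_usage = [42.0, 57.5, 91.3]

b = reduce_last(cpu_usage)   # value exposed to annotations as $values.B
firing = is_above(b, 80.0)   # 80 is an assumed threshold

print(b, firing)
```

The value computed in step B is what the AlertValues annotation reads through {{ $values.B }}.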

Summary and annotations

  • Summary: summary of the alert
    example: CPU consumption is high
  • AlertValues: the value of expression B, shown in the alert
    example: CPU usage: {{ $values.B }} %
  • Node: value coming from a custom label to get the instance concerned
    example from my Prometheus setup: {{ $labels.node }}

Templates

In the following templates, you have to replace the placeholders {grafana_URL} and {dashboard_id/name} with your own values.

Slack message templates

slack_title

{{ define "slack_title" }}
  {{ if gt (len .Alerts.Firing) 0 }}
  🔥 {{ len .Alerts.Firing }} alert(s) firing
  {{ end }}
  {{ if gt (len .Alerts.Resolved) 0 }}
  ✅ {{ len .Alerts.Resolved }} alert(s) resolved
  {{ end }}
{{ end }}

slack_message

{{ define "slack_message" }}
  {{ if gt (len .Alerts.Firing) 0 }}
    {{ range .Alerts.Firing }} {{ template "slack_alert_firing" .}} {{ end }} {{ end }}
  {{ if gt (len .Alerts.Resolved) 0 }}
    {{ range .Alerts.Resolved }} {{ template "slack_alert_resolved" .}} {{ end }} {{ end }}
{{ end }}

slack_alert_firing

{{ define "slack_alert_firing" }}
  Severity: *{{ .Labels.severity }}*
  *{{ .Labels.alertname }}*
  {{ .Annotations.summary }}

  Node: *{{ .Annotations.Node }}*
  {{ .Annotations.AlertValues }}
  <https://{grafana_URL}/d/{dashboard_id/name}|Access dashboard> - <https://{grafana_URL}/alerting/silence/new?matcher=node%3D{{ .Annotations.Node }}|Silence all alerts for this node> - <https://{grafana_URL}/alerting/silence/new?matcher=alertname%3D{{ .Labels.alertname }}|Silence this alert for all nodes>{{ end }}

slack_alert_resolved

{{ define "slack_alert_resolved" }}
  *{{ .Labels.alertname }}*
  
  Node: *{{ .Annotations.Node }}*
  {{ .Annotations.AlertValues }}
  <https://{grafana_URL}/d/{dashboard_id/name}|Access dashboard>
{{ end }}

Telegram message templates

telegram_message

{{ define "telegram_message" }}
  {{ if gt (len .Alerts.Firing) 0 }}
  <b>🔥 {{ len .Alerts.Firing }} alert(s) firing:</b>
    {{ range .Alerts.Firing }} {{ template "telegram_alert_firing" .}} {{ end }} {{ end }}
  {{ if gt (len .Alerts.Resolved) 0 }}
  <b>✅ {{ len .Alerts.Resolved }} alert(s) resolved:</b>
    {{ range .Alerts.Resolved }} {{ template "telegram_alert_resolved" .}} {{ end }} {{ end }}
{{ end }}

telegram_alert_firing

{{ define "telegram_alert_firing" }}
  Severity: <b>{{ .Labels.severity }}</b>
  <b>{{ .Labels.alertname }}</b>
  {{ .Annotations.summary }}

  Node: <b>{{ .Annotations.Node }}</b>
  {{ .Annotations.AlertValues }}
  <a href="https://{grafana_URL}/alerting/silence/new?matcher=node%3D{{ .Annotations.Node }}">Silence all alerts for this node</a> - <a href="https://{grafana_URL}/alerting/silence/new?matcher=alertname%3D{{ .Labels.alertname }}">Silence this alert for all nodes</a>
  <a href="https://{grafana_URL}/d/{dashboard_id/name}">Access dashboard</a>{{ end }}

telegram_alert_resolved

{{ define "telegram_alert_resolved" }}
  <b>{{ .Labels.alertname }}</b>
  
  Node: <b>{{ .Annotations.Node }}</b>
  {{ .Annotations.AlertValues }}
  <a href="https://{grafana_URL}/d/{dashboard_id/name}">Access dashboard</a>{{ end }}
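
The silence links in the templates above pass the matcher as node%3D<value>, which is the URL-encoded form of node=<value>. As a minimal sketch in Python (run outside Grafana), this shows how such a link is built and why a label value containing spaces or other special characters would need the same encoding. The host grafana.example.com and the helper name silence_url are made up for illustration.

```python
from urllib.parse import quote


def silence_url(grafana_url: str, label: str, value: str) -> str:
    """Build a 'new silence' link with a URL-encoded label matcher."""
    # safe="" forces '=' to be encoded as %3D, as in the templates.
    matcher = quote(f"{label}={value}", safe="")
    return f"https://{grafana_url}/alerting/silence/new?matcher={matcher}"


# Hypothetical host and label values.
print(silence_url("grafana.example.com", "node", "node-01"))
print(silence_url("grafana.example.com", "alertname", "CPU high"))
```

The first call prints https://grafana.example.com/alerting/silence/new?matcher=node%3Dnode-01. In the second, the space in the alert name becomes %20, which the raw template substitution would not do.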

Contact points

Slack
Title = {{ template "slack_title" . }}
Text Body = {{ template "slack_message" . }}

Telegram
Message = {{ template "telegram_message" . }}

Notification policies

Set the contact points with the criteria you need, and tick “Continue matching subsequent sibling nodes”.

I hope these examples can help builders struggling to set up alert templating. Please let me know in the replies.

Great job @bLd759 ! These templates are great and should be very useful to others.

This is awesome!! My idea is that this should be part of Grafana as a step-by-step wizard that fills in this code for you based on the parameters you select, such as the messaging platform: Teams, Slack, etc.

Hey @bLd759,

This is really great :sunglasses: and many community users keep asking for this. I will add more tags so that it gets more visibility while searching.

Thanks, and if you have more examples, please keep sharing them!! :pray:

See the initial message posted by @bLd759 which contains the actual and complete solution.

I would very much like to implement these Telegram templates, however being completely new at this, could someone please explain how to use these within Grafana Cloud?
I presume they are inserted under “Optional Telegram Settings” in the contact point, but how do you add all three?
Sorry for the basic question…

Thank you very much for your help @bLd759 ! This example has helped me to understand the use and application of the templates in Grafana V9.

really thank you very much

Good day.
I can’t figure out where to enter telegram_message, telegram_alert_firing and telegram_alert_resolved.

Sorry, I don’t know about Grafana Cloud; I don’t use it.

This is just the template name when creating a new template.

Figured it out. But I don’t understand why it shows only the first message from all notifications

Breaking change: the latest Grafana version introduced a new Parse Mode parameter.
This causes error 400 - Bad Request on Telegram Webhook.
For the Telegram contact point, you need to set Parse Mode to HTML.

Thanks @lexpec for identifying the bug and solution.

@usman.ahmad can you please give me back edit rights on the original post so I can update it with that?

@bLd759

I moved it back to the Alerting category, as the “How To” category is only available to admins and moderators.

Please check as now you should have the permission to edit your article (otherwise let me know) :slight_smile:

I will keep it in the Alerting category, as I am happy that you are keeping up with the changes as they happen :+1: So I will only create a post in the How To section that links to your original post.

Hello @bLd759, @usman.ahmad

An important UPDATE on my issue:
Digging a little deeper, I found that my problem was not only the Parse Mode. Changing anything on the edit page of that contact point fixed the error, but only for a while or only partly (some alerts squeezed through the limit, others did not). Sending a test notification, changing the parse mode, or changing a template: any of these actions made the contact point “alive” again.

After some days of searching I came across this issue, in which a user facing a similar problem (Webhook response status 400 Bad Request) provided a detailed, structured write-up with logs and analysis.
The point is that too many alerts produce a text message that is too large for Telegram’s API to handle. The limit is 4096 characters.

I verified this on my alerts by removing the .ValueString part from the alert template and leaving just .Labels.alertname; slimming the template down this way solved the problem permanently.

It turns out that changing the Parse Mode may help, but first of all you need to check that your Grafana is not sending too many alerts to Telegram at once (20+ in my case), so that the message does not exceed the limit.
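
As a rough way to reason about that limit, here is a minimal Python sketch (run outside Grafana) that estimates how many alerts fit into one message. The helper names and the sample sizes are made up for illustration. Python's len() counts code points, and as far as I understand Telegram applies its limit after parsing the formatting, so treat the result as an approximation.

```python
TELEGRAM_LIMIT = 4096  # message size limit quoted in this thread


def fits(text: str, limit: int = TELEGRAM_LIMIT) -> bool:
    """Return True if the rendered message is within the limit."""
    return len(text) <= limit


def max_alerts(header: str, alert_block: str, limit: int = TELEGRAM_LIMIT) -> int:
    """Estimate how many copies of one rendered alert fit after the header."""
    remaining = limit - len(header)
    if remaining < 0 or not alert_block:
        return 0
    return remaining // len(alert_block)


# Hypothetical sizes: a 96-character header and 200 characters per alert.
header = "h" * 96
alert_block = "a" * 200

print(max_alerts(header, alert_block))
print(fits(header + alert_block * 21))
```

With these assumed sizes, 20 alerts fit and the 21st pushes the message over the limit, which is in line with the 20+ alerts mentioned above.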

This is now my template for the contact points where a large batch of alerts is sent to Telegram:

{{ define "tgbodydd" -}}
  {{ range . -}} 
    {{ with .Labels.alertname }}<b>{{ . }}</b>{{ end }}
  {{ end }}
{{ end }}

{{ define "tgdd" -}}
  {{ with .Alerts.Firing }}⚠️{{ template "tgbodydd" . }}{{ end }}
  {{ with .Alerts.Resolved }}✅OK{{ template "tgbodydd" . }}{{ end }}
{{ end }}

Unfortunately, the alerts using that template won’t be as informative as they were, but I am sure I won’t miss anything.

Hi @bLd759 ,

You mentioned that you have added a custom label called “node” to each target. Could you please tell me how and where you created that label?

Thanks!

Thank you @usman.ahmad unfortunately I’m still not able to edit original post.

Thank you @lexpec, this is very useful. I may actually have run into this case myself.
I’ll edit the original post with a warning as soon as I get edit rights on it!

Sure, the {{ $labels.node }} value comes from the Prometheus config file; you can add custom labels using the syntax below.
I’ve set the label node but it can actually be anything.

    static_configs:
      - targets: ["xxx.xxx.xxx.xxx:xxxx"]
        labels:
          node: 'node-01'

Can you please record a screencast and share it with me? I am not sure why this is happening to you. I would like to help make it work again.

A quick question on replacing {dashboard_id/name}: how can I do that?

I’ve tried replacing it with the Dashboard ID annotation as {{ .Annotations.dashboardId }} but that doesn’t work. Any ideas what I can try?