Using templates in Webhook Alert URLs

I am posting this in case someone finds it helpful, as I was unable to find good documentation on how to do it.

I want to have Grafana use a webhook to forward alerts to healthchecks.io, as well as to send a heartbeat to healthchecks.io every 5 minutes.

I realize this isn’t a great fit (healthchecks.io is designed for deadman’s switch type alerting to track if a service is alive, not so much to track if an individual alert is firing)…but it turns out to work well for my use-case, and provides a single place to configure how alerts are directed.

Because healthchecks.io uses a unique URL for each check (and a different URL for the passing and failing cases), and I didn’t want to have to configure dozens of contact points, I wanted to use templating to build the webhook URL.

This is what I ended up with:
Heartbeat
Based on: Alerting: Add support for heartbeats in Grafana's internal alertmanager · Issue #76384 · grafana/grafana · GitHub

  1. Create an ‘always firing’ alert. I use a Prometheus alert with a query of ‘vector(1)’, set it up to fire every 5 minutes, and set a label ‘Heartbeat=1’. I configure it to have a state of ‘OK’ if there is no data or a timeout occurs (really only possible if the Prometheus DB goes down). Note that this is an inverted alert: when it is firing, things are OK; when it is not firing (or the state is OK), things are bad. This is annoying because it means Grafana will always show 1 alert firing, but I don’t have a better solution right now.
  2. Set up a healthchecks.io check and grab the URL.
  3. Configure a contact point as a Webhook. Using healthchecks.io, the URL looks something like: https://hc-ping.com/<healthchecks.io unique id>{{if ne .Status "firing"}}/fail{{end}} (use plain straight quotes inside the template; curly quotes pasted from a web page will not parse).
  4. Set a notification policy matching the label Heartbeat=1 to use the contact point from (3).
    Now a firing alert will send a ping, and a recovery (or loss of signal) will result in a failure.
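
To sanity-check the heartbeat template before pasting it into the contact point, here is a small Python sketch that mirrors its logic. It is not how Grafana evaluates the template (Grafana uses Go templates); the function name `heartbeat_url` and the base URL are placeholders I made up for illustration, and I am assuming the status is either "firing" or "resolved".

```python
def heartbeat_url(base_url: str, status: str) -> str:
    """Mimic: <base_url>{{if ne .Status "firing"}}/fail{{end}}

    base_url is the ping URL copied from the healthchecks.io check
    (hypothetical value below). Any status other than "firing"
    gets the /fail suffix, matching the inverted heartbeat alert.
    """
    suffix = "" if status == "firing" else "/fail"
    return base_url + suffix


if __name__ == "__main__":
    base = "https://hc-ping.com/your-check-id"  # placeholder, not a real check
    for status in ("firing", "resolved"):
        print(status, "->", heartbeat_url(base, status))
```

The point of the inversion is visible here: the "firing" state maps to the plain success ping, and everything else maps to /fail.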

Alerts
For alerts, I want to be able to specify the slug (or unique id) for each healthchecks.io check as a label on the Grafana alert, and have Grafana send the notification to the proper URL. The ‘ExtendedData’ template data is available for Webhook URLs: Reference | Grafana documentation

  1. Set up a healthchecks.io check and grab the URL. I set the ping time to 365 days and the wait time to 365 days.
  2. Create an alert. Set a label ‘Healthchecks=<healthchecks.io unique id>’.
  3. Configure a contact point as a Webhook. Using healthchecks.io, the URL looks something like: https://hc-ping.com/{{index .CommonLabels "Healthchecks"}}{{if eq .Status "firing"}}/fail{{end}}. Note this URL will be different depending on whether you send a slug or the identifier.
  4. Set a notification policy matching label Healthchecks is not "" to use the contact point from (3).
    Now a firing alert will send a request to the /fail endpoint, and a resolved alert will send a normal ping to the check’s base URL. This works fine except that if you don’t get an alert for more than 2 years (365 days of ping time plus 365 days of wait time), healthchecks.io will falsely flag the check as failing, and you need to go in and manually ping it. You could also set each check to ‘pause’ (it will still trigger properly), but then you need to go back and re-pause it every time it fires.
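
The same kind of check works for the per-alert template. This Python sketch mirrors what the URL template should produce for a given set of common labels and status. The function name `alert_url` and the sample label value are hypothetical; the label name `Healthchecks` is the one used in the steps above. I am assuming a missing label renders as an empty string, which is how Go's `index` behaves on a string map.

```python
def alert_url(common_labels: dict, status: str,
              label: str = "Healthchecks") -> str:
    """Mimic: https://hc-ping.com/{{index .CommonLabels "Healthchecks"}}
              {{if eq .Status "firing"}}/fail{{end}}
    """
    # A label that is absent from the common labels becomes "".
    check_id = common_labels.get(label, "")
    suffix = "/fail" if status == "firing" else ""
    return f"https://hc-ping.com/{check_id}{suffix}"


if __name__ == "__main__":
    labels = {"Healthchecks": "your-check-id"}  # placeholder id
    print(alert_url(labels, "firing"))    # failure signal
    print(alert_url(labels, "resolved"))  # success ping
    print(alert_url({}, "firing"))        # label missing: malformed URL
```

The last case is worth watching for: CommonLabels only holds labels shared by every alert in a notification group, so if alerts with different Healthchecks values get grouped together the label drops out and the URL loses its id. Grouping notifications by the Healthchecks label should avoid that, though I have not tested every grouping setup.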