I’m using Grafana 10 and alerting on metrics coming from Prometheus. Is there any way to get the duration that a metric has been violating its threshold, or the duration that an alert has been firing, into the notification template?
I’m trying to create a custom notification template for some super simple notifications about real-world monitoring, intended for non-technical folks. For example, I’ve got a temperature sensor monitoring a storage room. I’d like to be able to send a notification that says (and only says, nothing else) “The temperature in room 123A has been too high for 3 hours” and then on re-notification it will be “4 hours”, “5 hours”, etc. I just have one metric, the temperature.
I know that Grafana notification templates have StartsAt to template the starting time of the alert, but that’s a time, not a duration, and it’s also the time that the alert started, not the time that the metric first crossed the threshold.
I’ve also looked into some suggestions for similar things using changes() in PromQL, but all of those seem to suffer from either a 24-hour maximum duration or trouble handling multiple state changes in a given timeframe.
Any advice would be greatly appreciated, thanks!