I have a bunch of alerts based on Prometheus metrics (in this case for a NAS) that work just how I’d want - if, e.g., volume utilization is over X%, I get an alert.
However, once a month there’s an automated volume rebuild that causes utilization to spike very high, and I don’t want to alert while that’s happening. The rebuild is indicated by a second metric.
My current alert rule evaluates the volume utilization query with a mean/strict reducer and an alert condition of threshold > 75. I would like to keep doing that, but only fire if a second metric is also in bounds - i.e. only alert if the above query is > 75 AND synology_volume_is_rebuilding is 0, so the alert is suppressed while a rebuild is running.
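Expressed as a single PromQL query just to illustrate the intent (the utilization metric name and the volume label here are placeholders for whatever my actual query uses), it would be roughly:

```
# high utilization, unless a rebuild is in progress on the same volume
synology_volume_used_percent > 75
  unless on(volume)
synology_volume_is_rebuilding > 0
```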
Is it possible to do this? If not, is there any other built-in way that I can automatically suppress alerts when a given (unrelated) query meets some threshold?
Just for what it’s worth, it appears that if using Prometheus’s Alertmanager, there’s something called “inhibit rules” that could be leveraged to get my desired end-result, but those obviously don’t work with Grafana Alerting’s built-in notification system. It looks like the feature request for something like that in Grafana has been closed.
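For reference, the Alertmanager config for that would look roughly like the sketch below, assuming a separate “rebuild in progress” alert exists; the alert names and the volume label are placeholders:

```yaml
inhibit_rules:
  # While the rebuild alert is firing, suppress the utilization alert
  # for the same volume (both alert names here are hypothetical).
  - source_matchers:
      - alertname = VolumeRebuildInProgress
    target_matchers:
      - alertname = VolumeUtilizationHigh
    equal: ["volume"]
```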
As far as I can tell, the only way to achieve this outcome in Grafana right now is to run a completely separate custom daemon that receives webhook notifications for one alert and then creates and deletes silences based on that. That seems feasible, but it’s a LOT of work to achieve something that’s been present in many other monitoring/notification systems for decades.
It should be possible with the PromQL set operators - and, or, and unless. See the example below: show vector(1) unless the rate of some metric is more than 0.02 (totally random data).
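In text form the query looks roughly like this; some_metric is just a stand-in, and the on() is what lets the label-less vector(1) match the labeled series on the right-hand side:

```
# returns 1 except while rate(some_metric[5m]) exceeds 0.02
vector(1) unless on() (rate(some_metric[5m]) > 0.02)
```

The same pattern applies to the original question: put the utilization expression on the left of unless and the rebuilding condition on the right.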