I’m trying to create an alert that monitors containers on a server. If one or more containers stop responding for ~3 minutes, I want to send an email notification indicating an issue. However, I’ve encountered a problem with the Node Exporter configuration—it only sends data for 5 minutes. After that, the value changes to “No Data,” and I receive an email stating that the issue has been resolved.
My question is: Is there a way to modify the expression so that the alert continues instead of resolving when the value becomes “No Data”?
Additionally, I want the email notification to include the name(s) of the non-responsive containers. For this, I use {{ $labels.name }}. However, when using the absent() function or a classic condition, labels stop working, so this is not a viable solution.
For reference, I’m using Grafana version 10.2.4.
The query I’m using is:
time() - container_last_seen{instance="server1", name!=""}
(The name!="" condition ensures that container_last_seen only applies to containers and not other services.)
I’ve tried various expressions, but after a few minutes, I always receive an email stating that the issue has been resolved.
Additionally, I’d like to ask if there’s a way to edit alerts that have been released via a pipeline. Currently, they are tagged as provisioning, and I can’t modify them. (I’m not using the API.)
Thanks in advance for your help!
Hi,
Can you check whether you have Keep Last State as a choice for the "Alert state if no data or all values are null" option? If so, I think that would solve your problem (I don’t remember when it appeared, maybe in 10.3; is updating an option?). On second thought, it wouldn’t, since the query would still return data, just not the series that disappeared.
Also, do you use Prometheus or VictoriaMetrics?
After some testing:
My question is: Is there a way to modify the expression so that the alert continues instead of resolving when the value becomes “No Data”?
In Prometheus it’s hard to supply a default for a missing time series (in VictoriaMetrics it’s just expression default 0, which is why I asked). Anyway, if your goal is only to keep the alert from resolving, you can do what I crossed out above: keep your query and set the "Alert state if no data or all values are null" option to Keep Last State (I checked with a query of my own that the alert stays active even after the series is no longer present in the query result). You may have to update to Grafana 10.3, if that’s possible in your case. If not, the only (easy) option is to disable the resolved message in your contact point, which I guess is far from ideal.
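If updating isn’t possible, one expression-level workaround that might be worth trying is to wrap the metric in last_over_time(), so that Prometheus keeps returning the last known sample (with its labels) for a while after the series stops being scraped. This is only a sketch that I haven’t tested against your setup; the 1h window and the 180-second threshold are assumed values, not something taken from your configuration.

```promql
# Sketch only: the 1h window and the 180-second threshold are assumed values.
# last_over_time() returns the most recent sample inside the window, so a
# container that has stopped reporting still yields a series (including its
# name label) until its last sample falls out of the window.
time() - last_over_time(container_last_seen{instance="server1", name!=""}[1h]) > 180
```

If the threshold is applied in a Grafana expression rather than in the query itself, drop the trailing > 180. Keep in mind that the alert would still resolve once the window has elapsed, and that a container removed on purpose would also fire until then.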
Additionally, I want the email notification to include the name(s) of the non-responsive containers. For this, I use {{ $labels.name }}
Do you need help with that if the query doesn’t change from time() - container_last_seen{instance="server1", name!=""}?
Additionally, I’d like to ask if there’s a way to edit alerts that have been released via a pipeline.
What do you mean by that? What pipeline?