Hello. Please help me with the following problem. I use a Telegraf + Prometheus + Grafana stack. In Telegraf, I use several plugins to collect metrics from software such as PostgreSQL, Redis, etc.
Now I am setting up alerts for service unavailability, and I cannot work out how to do it correctly. The metric collector does not return availability metrics like “postgresql_up”, so I am trying to use the existing metrics for this task. The idea is this: when a metric stops updating, the alert should fire. But I ran into a problem with labels: it is extremely important for me to see in the notification which host the alert is for. The simplest solution is to take some metric and wrap it in the absent() function. But in that case I lose the labels, and I cannot tell the source host from the notification. There is also the standard notification for missing data, but it has the same problem: the data generated by the alert has no host label. I have been searching the Internet and asking AI for two days, with no result so far.
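To make the behaviour I am after concrete, here is a minimal Python sketch. It is not PromQL and not a solution, only an illustration of the logic; the host names, timestamps, and the stale_hosts helper are all made up for the example. The point is that the result is keyed by host, which is exactly what I lose with absent().

```python
from typing import Dict, List


def stale_hosts(last_seen: Dict[str, float], now: float, max_age: float) -> List[str]:
    """Return the hosts whose metric has not been updated within max_age seconds."""
    return sorted(host for host, ts in last_seen.items() if now - ts > max_age)


# Hypothetical data: host -> Unix time of the latest sample of some PostgreSQL metric.
last_seen = {"db-1": 1000.0, "db-2": 1290.0, "db-3": 700.0}

# Every alert carries the host it belongs to.
for host in stale_hosts(last_seen, now=1300.0, max_age=120.0):
    print(f"ALERT: no fresh postgresql metrics from host={host}")
```

In other words, I need the alert to keep a per-host view of "when did this metric last update" instead of a single label-less "the metric is missing" result.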
I would appreciate any advice: how can I solve this problem?