Container Exit alert

Hello,
I’m interested in sending myself an email when one of my containers exits. I’ve integrated cAdvisor and Prometheus to get metrics like container_last_seen, but I’m stuck on actually calculating whether there are fewer containers than there were, and on how to pass the container name in the notification policy. Could you please help?
I’m trying this in the query (note I’ve tried to exclude a few that are almost always closed):
count(container_last_seen{container_name!~“nextcloud-aio-watchtower|nextcloud-aio-borgbackup|fgc”}) by (container_name)
The problem here is that the container name field isn’t correct. With the annotations below


I get some gobbledygook in the alert.

I see in the dashboard they got the name doing sum(rate(container_network_transmit_bytes_total{instance=~"$host",name=~"$container",name=~".+"}[5m])) by (name), but I can’t seem to replicate that with the dollar sign.

  1. The annotation template context does not have .Annotations. It looks like you are trying to refer to one annotation from another, which does not work. You need to fix the template to use $labels.container_name.
  2. Judging by the content of the email, the rule query does not seem to return any labels at all, which is strange unless you use a Classic Condition, which aggregates all dimensions and then checks the result against the threshold. If you do not use a Classic Condition, what do you see when you click the “Preview” button on the rule edit page?
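
For checking in Preview, here is a minimal sketch of a query that keeps one series per container, so that a label is available to the template. It assumes the container name is exposed in a label called name (as the dashboard query quoted in the question suggests) rather than container_name; please verify that against your own metrics.

```promql
# Assumption: cAdvisor puts the container name in the "name" label.
# name=~".+" drops series that have no name (for example raw cgroup entries).
container_last_seen{name=~".+", name!~"nextcloud-aio-watchtower|nextcloud-aio-borgbackup|fgc"}
```

If the label is present, each row in Preview should carry name=..., and the annotation can then reference it as {{ $labels.name }}.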

Okay, apparently I’d somehow got curly ‘“’ in there instead of ‘"’. So now the query is

count(container_last_seen{container_name!~"nextcloud-aio-watchtower|nextcloud-aio-borgbackup|fgc"}) by (container_name)

When I run Preview I somehow get 137, which makes the alert fire incorrectly; I think that would mean it’s saying 137 containers have exited (I only ever have 53 running).
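
One possible explanation for the 137, offered as a guess rather than a diagnosis: count() counts the time series that currently match the selector, not containers that have exited. If the container_name label does not exist on these series, the !~ matcher excludes nothing and by (container_name) collapses everything into a single group with no labels. The sketch queries below, which assume a label called name, can be run one at a time in Explore to compare the numbers:

```promql
# Every container_last_seen series, including ones without a container name
count(container_last_seen)

# Only series that carry a non-empty "name" label (assumed label name)
count(container_last_seen{name=~".+"})

# Series per container name; a value above 1 means several series share a name
count by (name) (container_last_seen{name=~".+"})
```

If the second query comes out close to the number of running containers, the difference from the first is series that do not belong to a named container.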

Since it sounded like I don’t need the annotation section, I removed it.

It was in the Normal state, so I stopped a container, and the alert went to Pending and then fired


but the email variable failed again

I started the container back up, but unfortunately the alert continued to fire through more evaluation cycles.
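
A common pattern for this kind of alert compares the current time with the timestamp stored in container_last_seen, so that the condition stops being true once the container is seen again. This is a sketch, not a verified fix for the rule above: the name label and the 60-second threshold are assumptions, and the exclusion list is copied from the earlier query.

```promql
# Seconds since each named container was last seen by cAdvisor.
# max by (name) keeps the freshest timestamp if several series share a name.
time() - max by (name) (
  container_last_seen{name=~".+", name!~"nextcloud-aio-watchtower|nextcloud-aio-borgbackup|fgc"}
) > 60
```

In Grafana the > 60 can instead be left out of the query and set as the threshold on the alert condition. Depending on how long cAdvisor keeps exporting a stopped container, the series may eventually disappear, in which case the rule’s no-data handling decides what happens.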