How to get alerted when an Azure VM is shut down

I need to get alerted when a VM is shut down.

  1. First I tried using Windows event logs (event ID 1074 or 1076). Unfortunately, not all servers manage to send this event to Grafana. Most of the time, the process that ships the logs shuts off when the server is stopped and never pushes the event to Grafana at all. Hence checking for this event ID does not serve the purpose.
    Sample query:
    sum by (sub_env, application_acronym, computer) (count_over_time({sub_env=~"pd8|tr1", channel="System", application_acronym=~"testapp"} | json | event_id = 1074 or event_id = 1076 [$__auto]))

  2. Next I tried Prometheus metrics:
    absent_over_time(windows_system_system_up_time{application_acronym="testapp", sub_env="lt8"}[$__interval])
    Though this returns a value of 1, it does not tell me which servers are shut down, because the series and all of its labels stop being emitted once the server is stopped. For the same reason I cannot pick any other labels to identify the server.

  3. I cannot use Azure Monitor as a data source, because our org does not want to tap into Azure Monitor; they are planning to move away from it.
    So any other option that could help would be great. If anyone has another working option, please let me know. TIA.

I guess there will be an up metric for the job that is collecting those Windows metrics. But the up metric only indicates whether the metric scrape was successful or not. It is not a real indicator that the VM is down (there can be, for example, a network problem; absent has the same problem), but people usually use it for that anyway.
You can use it and group by instance, and you will get almost the desired result.
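
A minimal sketch of the up-based check grouped by instance. The job name `windows_exporter` is an assumption; substitute whatever job scrapes the Windows metrics in your setup. As noted above, this fires on any failed scrape (agent crash, network problem), not only on a real VM shutdown.

```promql
# Assumption: the scrape job is called "windows_exporter" (hypothetical name).
# up is 1 when the last scrape succeeded and 0 when it failed.
# Returns one series per instance whose scrape is currently failing,
# so the instance label identifies the affected server.
max by (instance) (up{job="windows_exporter"}) == 0
```

In a Grafana alert rule you would typically add a pending period of a few minutes so that a single missed scrape does not trigger a notification.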

If you want to use absent, then:

Yeah, I already tried the up metric and that wasn't accurate either. It was misleading at times: the agent running on the machine went down and up notified us, but it was not the actual VM shutdown that we wanted to detect.

And for the 1 PromQL query, do you have an example handy that I can refer to, @jangaraj? I am relatively new to PromQL, so I am wondering if there is a sample I can use to help build this. Thanks again for your valuable suggestion.
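
For reference, one commonly used pattern that keeps the labels of a vanished series is to compare the metric against its own past with `unless`. This is only a sketch, not necessarily the query referred to above: the selector reuses the labels from the original post, and the 1h lookback is an arbitrary assumption.

```promql
# Series that existed 1h ago (assumed lookback) but have no sample now.
# Because the left-hand side comes from the past, its labels (for example
# the instance or computer label) are preserved in the result,
# unlike with absent().
windows_system_system_up_time{application_acronym="testapp", sub_env="lt8"} offset 1h
unless
windows_system_system_up_time{application_acronym="testapp", sub_env="lt8"}
```

A server stops being reported once it has been gone for longer than the lookback, and the result still cannot distinguish a stopped agent from a stopped VM.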

But absent will have the same problem: the “agent” will go down while the VM is up and running, the metric will be missing, and absent will alert. I believe you need a metric from the hypervisor saying that a VM is down if you must be precise.

Hmm, yeah, I get it. OK, let me do some more research on that and try other options, but thanks for your help in this discussion.