I want to learn about the alert mechanism and the best way to set alerts up. I know that I cannot use alerts if I use variables. Yet when you have 500+ nodes to monitor, for something as simple as CPU usage above 90% for 1 minute, how do you set alerts? Of course, there are many other metrics as well. Should I create separate dashboards for alerts, kept completely apart from the main monitoring dashboards?
I would really like to know how you manage alerts and incidents in your environments.