A bit of background:
We have set up a new Grafana install and added AWS CloudWatch and Prometheus data sources. I have imported a few dashboards and customised them to suit our needs.
On one of these dashboards, the query I have set up is as follows:
probe_success{instance=~"$target", job="$App"}
Our environment consists of various production, staging and test servers; their host names indicate which environment they belong to.
For example:
srv01-staging
srv01-production
I’m trying to create an alert to monitor the HTTP response for ONLY the production servers.
My alert code is as below:
probe_success{job="nameofjob"}
My issue is that this will alert on ALL failures, including those on our staging/test environments, which I do not want.
I don’t believe we can use variables in alerts, or if we can, I haven’t been able to get it working.
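To illustrate what I am after, here is a rough sketch of the kind of hard-coded filter I imagine could replace the dashboard variable. It assumes the host name (for example srv01-production) appears somewhere in the instance label, which may not hold for every setup, and I have not confirmed that it behaves correctly in an alert rule:

```promql
# Assumption: the instance label contains the host name, e.g. "srv01-production".
# PromQL regex matchers are fully anchored, so the leading and trailing .* are
# needed to match "-production" anywhere in the label value.
# probe_success is 0 when the probe failed, so this returns only failing
# production targets.
probe_success{job="nameofjob", instance=~".*-production.*"} == 0
```

If the environment were exposed as its own label instead, an exact match on that label would presumably be cleaner than a regex on the host name.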
TLDR:
What is the best way to segment alerts so that I am not notified of issues with our staging/test environments?
Have I worded my question in a confusing way? I would think this is quite widely done by other users; if not, can anyone suggest an alternative way to achieve what I am attempting?