Hello,
Is it possible to use one alert for multiple servers?
Thank you.
What do you mean by “multiple servers”? If you mean datasources (say you have two Prometheus instances), then probably not. But if you have e.g. two Linux servers and want to monitor the CPU usage of both, then you certainly can. You just need a Prometheus query (assuming you haven’t switched to something else) that returns two series, e.g. sum(<query>) by (instance) will return as many series as there are instances.
Then, by using Grafana’s expressions (namely Reduce and/or Threshold), you’ll get multiple series (click the blue Preview button under the expressions on the alert creation screen), and each series will create a separate alert.
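To make the “one alert per series” idea concrete, here is a minimal Python sketch of what a Reduce (last value) followed by a Threshold does to a multi-series result. The series data, the function names, and the 80% limit are all hypothetical and only illustrate the behaviour; this is not Grafana’s actual implementation.

```python
# Hypothetical query result: one series per instance, values are CPU usage in percent.
series_by_instance = {
    "server-a:9100": [42.0, 55.0, 91.0],
    "server-b:9100": [30.0, 35.0, 33.0],
}


def reduce_last(points):
    """Reduce a series to a single number by taking its last value."""
    return points[-1]


def is_above(value, limit):
    """Threshold step: True when the reduced value exceeds the limit."""
    return value > limit


def evaluate(series, limit=80.0):
    """Evaluate every series independently, yielding one alert state per label value."""
    return {label: is_above(reduce_last(points), limit) for label, points in series.items()}


alert_states = evaluate(series_by_instance)
for label, firing in alert_states.items():
    print(label, "firing" if firing else "normal")
```

Each key in the result stands for a separate alert instance, which is why a single rule can cover every server the query returns.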
Hi,
Thank you so much for your reply.
I mean job. Suppose I have 10 jobs and I want to write an alert that will warn me when the CPU usage goes above 80%. Do I need to write a separate alert for each of these jobs?
I guess job is a label on one metric from one datasource, right?
Then a query like sum(<CPU query>) by (job) will do the same as the example with instance. As long as you return multiple series and do NOT use Classic Condition, you’ll get one alert per series.
Thanks again.
So, does this cause the same heart symbol to appear in the section for each Job?
The heart symbol (I guess you’re referring to the one on a plot) is visible only on the panel that is “attached” to the alert. While creating an alert, you can pick a panel to link to the alert (step 5 - add annotations). (If you’ve created an alert from a panel, it is automatically linked). There’s no way to link multiple panels to one alert.
Thanks.
Suppose I have two jobs. Job1 and Job2. I create an alert for Job1 about CPU usage. How can I push this alert to Job2 as well? Do I have to manually create this alert for Job2 as well?
No, you should do something like this: drop the job filter. E.g. now you have cpu_usage{job="job1"}, so do avg(cpu_usage{}) by (job)
Thank you so much.
My alert rule is:
100 - avg by (instance) (irate(windows_cpu_time_total{mode="idle", instance=~"192.168.1.3:9182"}[1m])) * 70
What is the correct form of this query?
I guess you should be fine with
100 - avg by (instance, job) (irate(windows_cpu_time_total{mode="idle", instance=~"192.168.1.3:9182"}[1m])) * 70
But notice that you’re already passing a set instance and I’m not so sure the same instance will be in two jobs, so I’d do:
100 - avg by (instance, job) (irate(windows_cpu_time_total{mode="idle"}[1m])) * 70
The question is - do you have two different jobs for windows metrics?
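As a rough illustration of what avg by (instance, job) computes, here is a small Python sketch that groups per-core idle rates by those two labels and turns the average into a usage percentage. The sample values, job names, and the idle_rate field are hypothetical. Note that the sketch multiplies by 100, the usual factor for converting an idle fraction into a percentage, whereas the query in this thread uses 70; treating 100 as the intended factor is an assumption.

```python
from collections import defaultdict

# Hypothetical samples: one entry per CPU core, idle_rate is the fraction of time spent idle (0..1).
samples = [
    {"instance": "192.168.1.3:9182", "job": "win-1", "core": "0,0", "idle_rate": 0.25},
    {"instance": "192.168.1.3:9182", "job": "win-1", "core": "0,1", "idle_rate": 0.25},
    {"instance": "192.168.1.4:9182", "job": "win-2", "core": "0,0", "idle_rate": 0.50},
    {"instance": "192.168.1.4:9182", "job": "win-2", "core": "0,1", "idle_rate": 0.75},
]


def cpu_usage_by(rows, labels=("instance", "job")):
    """Average the idle rate per label combination and convert it to CPU usage in percent."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in labels)
        groups[key].append(row["idle_rate"])
    return {key: 100 - (sum(rates) / len(rates)) * 100 for key, rates in groups.items()}


usage = cpu_usage_by(samples)
for key, percent in usage.items():
    print(key, percent)
```

Because no instance filter is applied, every (instance, job) pair present in the data gets its own result, and therefore its own alert instance.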
Hi,
Thanks again.
I have 10 jobs and each of these jobs is a Windows server. I don’t want to write an alert for each of these jobs separately.
Where should I write the following alert and how do I apply it to all jobs?
100 - avg by (instance, job) (irate(windows_cpu_time_total{mode="idle"}[1m])) * 70
You just create an alert rule.
how do I apply it to all jobs
There’s no such thing as “applying” an alert - you can either:
That being said, I can think of those two meanings of “applying”. If you meant something else, let me know
Thanks again.
I want to write an alert for the CPU Utilization panel and have this alert work for all jobs.
How do I do this, @dawiddebowski?
Just curious, which version of Grafana are you using @hack3rcon ?
Not that it really matters: you could achieve this with both the “old” style panel alerts and the new Grafana alerts, which are basically Prometheus alerts. Judging by the alerting rule, you are doing Prometheus alerting. I don’t know if those show up as hearts on panels. I really only use Prometheus for alerting, so I don’t know…
Anyway, here is an example alerting rule expression
As long as the metric name is the same for all jobs, which it should be, the one alerting rule will work for all jobs. Grafana should send out an alert for every series that triggers the rule. In Alertmanager you then configure the alert grouping, which controls how many alerts you will receive. Do you want one alert no matter how many series triggered the rule, or e.g. one alert per instance label?
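The effect of grouping can be sketched in a few lines of Python. This only mimics the idea behind Alertmanager’s group_by setting (alerts sharing the same values for the chosen labels end up in one notification); the alert labels and the group_alerts helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical firing alerts, each represented by its label set.
firing = [
    {"alertname": "HighCPU", "instance": "192.168.1.3:9182", "job": "win-1"},
    {"alertname": "HighCPU", "instance": "192.168.1.4:9182", "job": "win-2"},
    {"alertname": "HighCPU", "instance": "192.168.1.5:9182", "job": "win-2"},
]


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the given label names; one bucket is one notification."""
    groups = defaultdict(list)
    for labels in alerts:
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


single = group_alerts(firing, ["alertname"])
per_instance = group_alerts(firing, ["alertname", "instance"])
print(len(single), "notification when grouping by alertname")
print(len(per_instance), "notifications when grouping by alertname and instance")
```

Grouping only by alertname gives one notification covering all servers, while adding instance to the grouping gives one notification per server.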
Hello,
Thank you so much for your reply.
I am using Grafana 11 on Debian. When I create an alert for a panel, a heart symbol appears at the top of the panel.
I want to have an alert for each instance. Is it possible for you to tell me step by step how I can achieve my goal?
I think this question will be useful for many new users like me. You have 100 jobs and you want to monitor the CPU usage for each of these jobs. One way is to write a CPU alert for each of these jobs, which is time consuming. Another method is to write an alert for CPU usage and apply it to all jobs.
Since Grafana 8, alerts and panels are unrelated: one can exist without the other. You can link them so that a specific alert points to a specific panel (AFAIK it’s a one-to-one relation, not many-to-many). You can create an alert that will monitor every job / instance / whatever you like with the query I provided earlier. Look at the screen below:
One query will create multiple alerts, one for each time series.
BUT YOU CANNOT LINK THIS ALERT TO MULTIPLE PANELS (so e.g. if you have a panel for every instance, you cannot get the heart on every panel from one alert), even though it will monitor each of your instances. This causes problems only if you have set up the image renderer and want to get screenshots of a specific panel; you cannot do that yet (AFAIK).
Hi,
Thanks again.
If you have 100 Windows servers to monitor (100 jobs) and now you want to write an alert for the CPU usage of each of these jobs, what method would you choose? Do you write an alert for each job?
Exactly the way I described in the previous message: one alert query that creates an alert for each instance. If you want to link it to a panel, I’d make a single panel for all those instances (the same query as in the alert, but in the panel I would add > <threshold> to avoid having too many series and making the chart unreadable).