One alert rule with multiple severity thresholds: warning, critical, etc.

Hi,
This use case is so common in alerting that I am embarrassed to ask, but I could not find a solution.
Grafana 9.1.2
Data source: InfluxDB (Flux)
Use case:
Field: CPU usage.
Above 80% → fire an alert with severity warning.
Above 90% → fire an alert with severity critical.
I thought I could solve this with a multidimensional alert.
My strategy was to create a dynamic label based on the cpuUsage value. Unfortunately, labels are created only from the tags; there seems to be no way to calculate them.
Has anybody found a smart workaround?
I am really surprised that an alerting manager forces the user to create two alert rules, one for warning severity and another for critical.

Regards,

Hi @lucazago
In your case, a multidimensional alert would mean creating one rule that governs, for example, 5 different CPUs.

As for the warning when something breaches 80%: that would be one alert. If the same thing then breached 90%, that would be a separate alert. I agree this is not ideal. If my understanding is not correct, perhaps @georgerobinson can correct me.

In case you are using Flux, this may be helpful.

@grant2 yes, I studied that article attentively; it is a gold mine of information.
Unfortunately, the multiple-threshold example doesn’t apply here.
My idea was to have alert instances labelled dynamically, based on a calculated severity label. That way (as in many alert managers) I would have one rule with one query managing several severities; why should I run two queries on the same field in two rules?
I would calculate the severity with the map function in Flux, exactly as you do in the quoted article with the ‘difference’ field.
But a mapped field, or any kind of field in general, cannot be turned into a label.

Hi @lucazago! You should be able to do this using a single alert rule and a templated severity label in v9.1.2.

Assuming a pretty standard setup of A: Query, B: Reduce expression, C: Math expression

then the label would look something like this:

severity={{if gt $values.B.Value 2.0 }}critical{{else}}warning{{end}}

Notes:

  • Pay particular attention to the decimal point in 2.0: Go is strict about types, and comparing the value against a plain 2 could cause an error.
  • If using a classic condition in v9.1.2, the variable would instead be $values.B0.Value

Hi @mjacobson, yes, this is a great solution. I didn’t know I could template the custom labels field. It should work. I think I will have to add a notification policy matching the severity, because the expected behavior is that the user receives a warning notification and then a further critical notification when the critical threshold is passed.

Hi @mjacobson

This is also new to me. So in my case (still using v9.2), I currently have this…

which fires alerts when $B is above 25.0.

If I wanted to apply the severity label as you outlined, so that the alert fires when the value goes above 20, 25 and 30, how would I do that? Like this? How should I modify Expression C?

severity={{if gt $values.B.Value 20.0 }}critical{{else}}warning{{end}}
severity={{if gt $values.B.Value 25.0 }}critical{{else}}warning{{end}}
severity={{if gt $values.B.Value 30.0 }}critical{{else}}warning{{end}}

@grant2 I am not trying to give an answer, just to understand better, although I have just tested the solution above.
There is a single alert condition, in your case > 25, so a notification is only sent when that condition is met. If the value is 22, for instance, the label
severity={{if gt $values.B.Value 20.0 }}critical{{else}}warning{{end}} doesn’t trigger any notification.
And why do you need
severity={{if gt $values.B.Value 30.0 }}critical{{else}}warning{{end}} ? It always gives the same severity label as the > 25 case.

The above was a copy/paste error on my part. In reality, it would be something like this:

severity={{if gt $values.B.Value 20.0 }}low{{else}}warning{{end}}
severity={{if gt $values.B.Value 25.0 }}medium{{else}}warning{{end}}
severity={{if gt $values.B.Value 30.0 }}high{{else}}warning{{end}}

But I still do not know what I would do with expression C (which is $B > 25.0), and how that relates to the custom severity labels.

Now it makes more sense. This is how I implemented my solution; I tested it and it works, in case it helps.
Use case: I need to receive two notifications, one when the CPU usage is > 80 (warning) and another when it is > 90 (critical).
So my expression $C is $B > 80. When the alert goes into the alerting state, a first notification is sent and the label is set correctly by the magic template:

severity={{if gt $values.B.Value 90.0}}critical{{else}}warning{{end}}

The first notification will have the severity label set to warning.
Now the alert is already in the alerting state, so to get the second notification I created an additional notification policy matching severity=critical. When the alert is evaluated again and the CPU usage is > 90, I get the correct alert with the correct severity.
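
For anyone who provisions from files rather than the UI, the extra policy might look roughly like the sketch below. The contact point names team-email and oncall-pager are hypothetical, and the field names follow the alerting provisioning format as I understand it, so check them against the documentation for your Grafana version.

```yaml
apiVersion: 1
policies:
  - orgId: 1
    # Default contact point (hypothetical name); warning alerts fall through to it.
    receiver: team-email
    routes:
      # Child policy: instances labelled severity=critical go to a
      # second, hypothetical contact point.
      - receiver: oncall-pager
        object_matchers:
          - ['severity', '=', 'critical']
```

The root policy handles everything, including severity=warning, and the nested route only picks up instances whose templated label has become critical.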
In your case, I don’t think you are notified when the severity is low, because the alert condition is > 25:
severity={{if gt $values.B.Value 20.0 }}low{{else}}warning{{end}}

@lucazago

Thanks for the detailed explanation. I’ll test on my end with my alerts, but I am pretty sure it will work just as you outlined.

I did not see this specific use case mentioned in the recent changes to the Alerting documentation (here and here), so I am glad it’s now documented here.

This works really well for me if I construct the alert in the UI.
However, if I do the same in YAML for file provisioning, $values.B gets trimmed to .B and I cannot figure out why. Everything looks like it should!

labels:
  severity: '{{if lt $values.B.Value 14.0 }}low{{else}}warning{{end}}'

Any bright ideas?

Hi @andyking,

This is because $values is being interpreted as an environment variable to be interpolated. You should be able to fix this by using $$ instead of $. See the “Provision Grafana” page in the Grafana documentation.
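
For illustration, the label from the question with the dollar sign escaped might look like this. It is a sketch of the labels fragment only; the rest of the rule definition is omitted, and the threshold is the one from the question.

```yaml
labels:
  # "$$" is written so that provisioning does not treat "$values" as an
  # environment variable; the saved rule should end up with "$values.B.Value".
  severity: '{{if lt $$values.B.Value 14.0 }}low{{else}}warning{{end}}'
```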

Thank you for the reply. That was it.

Hi @mjacobson,

Are you aware of any other variables, like $value, that we can use in custom labels?

Thanks

So where do notification policies come into this?

This works great up to the point where only the last ‘severity’ label is accepted. The alert rule drops the first two and saves only the last entry. Running v10.1.4.
The workaround is to have 3 separate labels (severity1, severity2, severity3), which defeats the purpose and adds another layer of complexity for alerting when integrating with OpsGenie (as an example).
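
A rule keeps only one value per label key, which is why the first two entries are dropped. A possible alternative is to fold all tiers into a single severity label, checking the highest threshold first. This is a sketch that assumes Grafana's label templates accept `else if` chains, which are standard Go text/template syntax. The thresholds and tier names are the ones from the earlier example in this thread; expression C would then use the lowest threshold ($B > 20.0) so that the rule fires for every tier.

```
severity={{ if gt $values.B.Value 30.0 }}high{{ else if gt $values.B.Value 25.0 }}medium{{ else }}low{{ end }}
```

With this, a value of 27 would yield severity=medium and a value of 35 would yield severity=high, all under one label key.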
