I want to send a notification if the status (based on business logic) of a device metric is higher than 1, and I also want the exact metric value in the notification.
Example:
I have device categories A, B, and C; each category has its own thresholds for statuses.
Then I have devices AA and AB in category A, BA and BB in category B, and CA and CB in category C.
From Postgres I can get a table with time, device, device category, metric, status, and value.
If the status is higher than 1, I want to send a notification with the status and the original value.
| time | device | category | metric | status | value |
|---|---|---|---|---|---|
| 2024-10-15 15:56:00+02:00 | AA | A | m1 | 1 | 10 |
| 2024-10-15 15:56:00+02:00 | AB | A | m1 | 1 | 20 |
| 2024-10-15 15:56:00+02:00 | BA | B | m1 | 2 | 10 |
| 2024-10-15 15:56:00+02:00 | BB | B | m1 | 3 | 20 |
| 2024-10-15 15:56:00+02:00 | CA | C | m1 | 1 | 10 |
| 2024-10-15 15:56:00+02:00 | CB | C | m1 | 4 | 20 |
| 2024-10-15 15:56:00+02:00 | AA | A | m2 | 1 | 100 |
| 2024-10-15 15:56:00+02:00 | AB | A | m2 | 1 | 200 |
| 2024-10-15 15:56:00+02:00 | BA | B | m2 | 2 | 10 |
| 2024-10-15 15:56:00+02:00 | BB | B | m2 | 3 | 20 |
| 2024-10-15 15:56:00+02:00 | CA | C | m2 | 1 | 50 |
| 2024-10-15 15:56:00+02:00 | CB | C | m2 | 4 | 100 |
Is it possible to do it with one alert rule?
Each numeric status has its own textual representation. It would be nice to send the textual representation in the notification, but it is not necessary.
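To make the desired behaviour concrete, here is a minimal Python sketch of the rule described above, outside of Grafana. It only models the intent (notify when status is higher than 1, and carry the original value along). The `Row` class, the `notifications` function, and the status names in `STATUS_TEXT` are hypothetical and invented for illustration; the sample rows are a subset of the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    device: str
    category: str
    metric: str
    status: int
    value: int


# Hypothetical textual names for the numeric statuses (not from the original post).
STATUS_TEXT = {1: "ok", 2: "warning", 3: "critical", 4: "down"}

# A few rows copied from the table above.
ROWS = [
    Row("AA", "A", "m1", 1, 10),
    Row("BA", "B", "m1", 2, 10),
    Row("BB", "B", "m1", 3, 20),
    Row("CB", "C", "m2", 4, 100),
]


def notifications(rows, threshold=1):
    """Build one message per row whose status is above the threshold.

    The alert condition looks only at `status`; `value` is carried
    along purely as information for the message.
    """
    return [
        f"{r.device}/{r.metric}: status {r.status} "
        f"({STATUS_TEXT.get(r.status, 'unknown')}), value {r.value}"
        for r in rows
        if r.status > threshold
    ]


for message in notifications(ROWS):
    print(message)
```

The point of the sketch is that the condition and the reported value come from two different columns of the same row, which is what the alert rule would need to preserve.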
As far as I know, you can do that. You create a threshold on your numerical value, and then you can use templates in your annotations (description / summary) to insert the exact value. (I've seen that it is possible, but I've never played around with it myself, so it might take some trial and error on your end.)
What do you mean by numerical value? Is it the status value or the exact metric value?
If it is the status value, how can I pass the exact metric value? If I pass it as a string, it creates a label for each value and no resolve message is sent, because each combination of labels is always either firing or OK, since the value is part of the label set.
If it is the exact metric value, then the threshold is not defined globally: each device category has a different threshold (sometimes it depends on another metric). And I think there is the same problem with the resolve message here, because of the status label.
It means that in the alert notification you will receive the exact metric value as well as any other info you are capturing (incl. string labels), using a template like this:
CPU usage for {{ index $labels "instance" }} has exceeded 80% ({{ index $values "A" }}) for the last 5 minutes.
which would look like this when triggered:
CPU usage for Instance 1 has exceeded 80% (81.2345) for the last 5 minutes.
For the above, it sounds like multi-dimensional alerts would work.
Thank you for your response. I think I understand this, but the problem is in the step before. What should the query result look like so that the alert is decided by the status column while the original value is kept?
Maybe I did not explain it well. Here is my actual setup with only the status. I am not able to add the exact metric value: if I add it to the query, it creates new time series that have the same labels but are independent of the status, so Reduce and Math are applied to the exact value too, which is not what I need. How can I add the exact metric value to the query and deal with more than one numeric value in Reduce and Math?
When it is converted to time series it is quite sparse:
(I cannot use table mode because it fails with an error I am not able to get rid of: [sse.readDataError] [A] got error: input data must be a wide series but got type long (input refid))
But the value is the status, and I am missing the original value.
When I add the original value as a number to the query result, the 24 time series become 48, and Reduce and Math are applied to the status and the original value independently.
When I add the original value as a varchar to the query, it creates new labels every time (because of the new original value), so the resolve message never comes: there is a new group of labels every time, and the number of dimensions grows without bound.
So what should the query and calculations look like, if the alert is based on the status column and the message should contain the original value too?
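The label problem described above can be illustrated with a small Python model. This is a simplified sketch, not Grafana's actual implementation: it assumes only that an alert instance is identified by its full label set, so a changing value inside a label produces a different instance on every evaluation. The functions `instance_key`, `evaluate`, and `resolved` are hypothetical names made up for this example.

```python
def instance_key(labels):
    """Identify an alert instance by its complete label set (assumed model)."""
    return frozenset(labels.items())


def evaluate(samples, value_as_label):
    """Return {instance_key: is_firing} for one evaluation round."""
    states = {}
    for s in samples:
        labels = {"device": s["device"], "metric": s["metric"]}
        if value_as_label:
            # Putting the raw value into a label changes the instance identity.
            labels["value"] = str(s["value"])
        states[instance_key(labels)] = s["status"] > 1
    return states


def resolved(before, after):
    """Instances that fired before and are seen again, not firing, afterwards."""
    return [
        key
        for key, firing in before.items()
        if firing and key in after and not after[key]
    ]


# Two consecutive evaluations of the same device: first firing, then back to status 1.
first = [{"device": "BB", "metric": "m1", "status": 3, "value": 20}]
second = [{"device": "BB", "metric": "m1", "status": 1, "value": 12}]

# Value kept out of the labels: the same instance is seen again and can resolve.
print(len(resolved(evaluate(first, False), evaluate(second, False))))

# Value inside the labels: the second evaluation is a brand-new instance,
# so the firing one is never matched by a "not firing" result.
print(len(resolved(evaluate(first, True), evaluate(second, True))))
```

Under this model, the original value has to travel as a numeric result alongside the status rather than as part of the label set, otherwise the firing and OK evaluations never refer to the same instance.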