How to migrate to multi-dimensional alerts? There is no documentation

Hello,

I’m trying to migrate from classic alerts to multi-dimensional alerts. I have reviewed all the existing documentation and videos but still don’t know how to create them. Only theory and very basic examples are available, with no real-life examples or tutorials. It’s frustrating and a waste of time.

How can I rewrite the following queries so that they can be used in multi-dimensional alerts?

Containers have restarted more than 10 times in the last 24 hours or more than 4 times in the last 30 minutes
"expr": "count(rate(container_start_time_seconds{job=\"kubernetes-nodes-cadvisor\", container=\"${{SERVICE_NAME}}\"}[24h])) - sum(kube_deployment_spec_replicas{deployment=\"${{SERVICE_NAME}}\"})",

"expr": "count(rate(container_start_time_seconds{job=\"kubernetes-nodes-cadvisor\", container=\"${{SERVICE_NAME}}\"}[30m])) - sum(kube_deployment_spec_replicas{deployment=\"${{SERVICE_NAME}}\"})",


"The percentage of allocated CPU resources being used (where 100 % is the maximum) 5 minute rolling average",

"sum(rate(container_cpu_usage_seconds_total{job=\"kubernetes-nodes-cadvisor\", container=\"${{SERVICE_NAME}}\"}[5m])) by (pod) /\nsum(container_spec_cpu_shares{job=\"kubernetes-nodes-cadvisor\", container=\"${{SERVICE_NAME}}\"} / container_spec_cpu_period{job=\"kubernetes-nodes-cadvisor\", container=\"${{SERVICE_NAME}}\"}) by (pod)",

"${{SERVICE_NAME}} Memory Usage went above 80% in the past 5min",
"sum(container_memory_working_set_bytes{job=\"kubernetes-nodes-cadvisor\", container=\"${{SERVICE_NAME}}\"}) by (pod) / min(container_spec_memory_limit_bytes{job=\"kubernetes-nodes-cadvisor\", container=\"${{SERVICE_NAME}}\"}) by (pod)",

Thanks in advance.

Can you post screenshots of your query and reduce & math expressions, and what you see when you click Preview alerts?

Thanks for your reply. I don’t use Reduce and Math expressions yet, because I’m not sure how to rewrite a query used in a classic alert.
I’m posting an alert with a classic condition, used for one specific service. I would like to replace it with a multi-dimensional alert that handles all services (containers).

Did you read/watch this? Grafana Alerting: Explore our latest updates in Grafana 9 | Grafana Labs

Yes, I did. The information provided there is not sufficient for me.

You need to reduce expression A and expression B to single values. For example:

C: [screenshot: Reduce expression on A]

D: [screenshot: Reduce expression on B]

Then do a math expression (instead of Classic Condition):

E: [screenshot: Math expression]

Set alert condition to be on E - Expression.

Set the Alert Evaluation Behavior to your liking, then run the queries by clicking the blue button and click Preview alerts. Paste the output here, and I or someone else can guide you through the alert templating.

My main problem was how to adjust a query to make it usable in a multi-dimensional alert. I have finally figured it out. You can see the outcome in the screenshot. It’s a bit slow because it checks 220 pods; perhaps I could split it into two groups of 110 pods. Thank you for your support.
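
For anyone finding this thread later: below is a rough sketch of how the three classic queries from the first post could be adjusted so that each returns one series per container (or per container and pod) instead of one value for a single service. This is not the exact query from the screenshot. It assumes that `container!=""` is an acceptable way to select all containers, that the container name matches the deployment name (as the original `${{SERVICE_NAME}}` usage suggests), and that containers without a memory limit should be skipped with a `> 0` filter.

```promql
# Query A: restarts per container over 24h.
# Same logic as the original (series count minus desired replicas), but grouped
# by container. Use [30m] in a second query for the short window.
# label_replace copies the "deployment" label into "container" so both sides
# can be matched; this relies on the assumption that the two names are equal.
count by (container) (
  rate(container_start_time_seconds{job="kubernetes-nodes-cadvisor", container!=""}[24h])
)
- on (container)
label_replace(
  sum by (deployment) (kube_deployment_spec_replicas),
  "container", "$1", "deployment", "(.+)"
)

# Query B: CPU usage as a fraction of allocated CPU, per container and pod.
sum by (container, pod) (
  rate(container_cpu_usage_seconds_total{job="kubernetes-nodes-cadvisor", container!=""}[5m])
)
/
sum by (container, pod) (
  container_spec_cpu_shares{job="kubernetes-nodes-cadvisor", container!=""}
  / container_spec_cpu_period{job="kubernetes-nodes-cadvisor", container!=""}
)

# Query C: memory usage as a fraction of the limit, per container and pod.
# The "> 0" filter is an addition, not part of the original query: it drops
# containers that report no limit, which would otherwise divide by zero.
sum by (container, pod) (
  container_memory_working_set_bytes{job="kubernetes-nodes-cadvisor", container!=""}
)
/
min by (container, pod) (
  container_spec_memory_limit_bytes{job="kubernetes-nodes-cadvisor", container!=""} > 0
)
```

Each query goes into its own query field. Because the label sets are kept, a Reduce expression (for example Last) on each query followed by a Math expression such as `$D > 0.8` for the memory case (where `D` is whatever RefID the Reduce expression has) should then give one alert instance per container/pod. The thresholds come from the alert descriptions in the first post.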
