I want to analyze the top 10 alert durations every week.
Our alert history is written to Loki, and I can view the detailed alert history from Loki in a Grafana dashboard.
But how can I get the top 10 alert durations out of 200 alerts? I would prefer to use a Loki query to get the data and build a dashboard.
If this is hard to implement with a Loki query, can I get the alert details through an API and do the calculation myself? Thanks.
Hey @zpzkit, this is a great use case, and we aim to document this in the future but haven’t found the time yet.
Grafana Cloud also uses a Loki instance to store alert state history. Based on this, you can leverage similar queries to those used in Grafana Cloud Alerting Insights to analyze alert data.
Top 10 Most Fired Alert Instances - MostFiredInstanceTable
topk(10, sum by(labels_alertname, ruleUID) (count_over_time({from="state-history"} | json | current = `Alerting` [1w])))
Percentage of Firing Alerts - FiringAlertsPercentage
sum(count_over_time({from="state-history"} | json | current="Alerting"[1w]))
Top 5 Most Fired Alert Rules by Group and Folder - MostFiredRulesTable
topk(5, sum by(group, labels_grafana_folder) (count_over_time({from="state-history"} | json | current = `Alerting` [1w])))
You might need to modify these queries to calculate durations instead of counts. Unfortunately, calculating precise durations with Loki alone might be tricky.
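If you end up pulling the state-history entries out of Loki yourself (for example through Loki's HTTP query API) and doing the calculation on your side, the duration logic could look roughly like the sketch below. It is only an illustration under a few assumptions: each entry has already been reduced to a (timestamp in seconds, instance key, current state) tuple, the firing state is named `Alerting` as in the queries above, and the function name `top_alert_durations` and the key format are made up for this example. An instance that is still firing at the end of the window is counted up to `window_end`.

```python
from collections import defaultdict

# Assumption: the firing state is named "Alerting", as in the queries above.
FIRING = "Alerting"


def top_alert_durations(transitions, window_end, n=10):
    """Return the n instances with the longest total firing time.

    transitions: iterable of (timestamp_seconds, instance_key, current_state)
                 tuples, e.g. built from the parsed state-history log lines.
    window_end:  end of the analysis window, in the same time unit.
    """
    firing_since = {}            # instance_key -> time it started firing
    totals = defaultdict(float)  # instance_key -> accumulated firing seconds

    for ts, key, state in sorted(transitions, key=lambda t: t[0]):
        if state == FIRING:
            # Keep the earliest start if several firing entries arrive in a row.
            firing_since.setdefault(key, ts)
        elif key in firing_since:
            # Any other state closes the currently open firing interval.
            totals[key] += ts - firing_since.pop(key)

    # Instances still firing at the end of the window count up to window_end.
    for key, start in firing_since.items():
        totals[key] += window_end - start

    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Running this once per week with `n=10` over that week's entries would give a top 10 by total firing time; the instance key could be something like the rule UID combined with the alert's labels.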
Hope this helps!