I have a Promtail + Loki setup ingesting syslog input from a network of nodes. I would like to display the nodes that send in the most logs. My query currently is:
sum by (host) (count_over_time({job="syslog"}[5m]))
If I wrap it in a topk statement, the results make no sense: there are still more than 10 results.
topk(10, sum by (host) (count_over_time({job="syslog"}[5m])))
topk returns more results because of how it works: it is evaluated at each step of a range query. The top 10 hosts can differ from one step to the next, so the result contains every host that made the top 10 at any step, which can add up to more than 10 series. I recommend checking out this detailed explanation of why this happens and how you can resolve it:
You can also check out the Loki dashboard on the Grafana play site. Under the Acquisition and Behaviour row we have a couple of topk queries that you can check out as well!
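As a rough sketch of one common workaround (an assumption, not necessarily what the explanation above or the play site dashboard does): if the goal is a single top 10 over the whole dashboard time range, the query can be run once over that range instead of at every step. This assumes Grafana's built-in `$__range` variable and the query type set to Instant in the query editor; the `host` label and the `{job="syslog"}` selector are taken from the question.

```logql
# Assumption: query type is set to "Instant" in the Grafana query editor,
# so topk is evaluated once rather than at every step.
# $__range is Grafana's built-in variable for the selected dashboard time range.
topk(10, sum by (host) (count_over_time({job="syslog"}[$__range])))
```

If the query type is left as Range, the same expression is still evaluated at every step and can again return more than 10 hosts.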
I have a new setup and a new Grafana (9.2.3). The queries found under Acquisition and Behaviour still work on the play site, but unfortunately they cannot be replicated on a fresh Grafana. Any ideas?