Maximise returns on a LogQL query


I have a Promtail + Loki digesting syslog input from a network of nodes. I would like to display the nodes that send in the most logs. My query currently is:

sum by (host) (count_over_time({job="syslog"}[5m]))

If I wrap it in a topk statement, the results make no sense: there are still more than 10 results.

topk(10, sum by (host) (count_over_time({job="syslog"}[5m])))

Does anyone have this working properly?

topk returns more results than expected because of how it works: it is evaluated at each step. With topk(5, ...), for example, you can end up with more than 5 series overall, because the top 5 are determined separately at every step and may differ from one step to the next. I recommend checking out this detailed explanation of why this happens and how you can resolve it:

You can also check out the Loki dashboard on the Grafana play site. Under the Acquisition and Behaviour row we have a couple of topk queries that you can look at as well!
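To make the per-step evaluation concrete, here is a small Python simulation. It is not LogQL and not how Loki is implemented, just a sketch of the idea. The host names, the counts, and the helper topk_per_step are all made up for illustration.

```python
def topk_per_step(series, k):
    """For each evaluation step, return the k series with the highest value."""
    steps = len(next(iter(series.values())))
    result = []
    for i in range(steps):
        ranked = sorted(series, key=lambda name: series[name][i], reverse=True)
        result.append(ranked[:k])
    return result


# Hypothetical per-host log counts at three evaluation steps.
counts = {
    "node-a": [90, 10, 50],
    "node-b": [80, 20, 40],
    "node-c": [10, 95, 30],
    "node-d": [5, 85, 55],
}

# Range-query behaviour: the top 2 are picked separately at every step.
per_step = topk_per_step(counts, k=2)
shown = sorted({host for step in per_step for host in step})
print(shown)  # ['node-a', 'node-b', 'node-c', 'node-d']

# Single evaluation over the whole range: one value per host, one ranking.
totals = {host: sum(values) for host, values in counts.items()}
overall = sorted(totals, key=totals.get, reverse=True)[:2]
print(overall)  # ['node-a', 'node-d']
```

With k=2 the graph still ends up showing four hosts, because a different pair wins at different steps. Collapsing the range into a single evaluation, as the last lines do, can never produce more than k series.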

Many thanks. The Loki dashboard on the Grafana site in particular is very impressive and helpful.

I know, I know, I'm comparing apples with oranges here. What I tried to create looks something like this.