Grafana Loki: top 5 syslog senders

Hello Team:
I was asked to build a Grafana panel showing the top 5 hosts sending syslog messages to Loki over a period of time.

The best I could achieve is shown in the following image: it lists ALL the hosts sending logs over a period of time, but I do not know whether I can (1) limit the display to the 5 hosts with the biggest counts and (2) show them in order.

My query to Loki in the Grafana panel is the following. I wonder if some of the options in the Transform section would help me to achieve the goal.

sum by(hostname) (count_over_time({job="syslog"} [$__interval]))

Hints will be greatly appreciated!!!
Thanks a lot

Rogelio

You should be able to do that with topk

Try something like
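
A minimal sketch, assuming the query from your first post, with the aggregation wrapped in topk so that only the 5 largest series are kept:

topk(5, sum by(hostname) (count_over_time({job="syslog"} [$__interval])))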

More info: Metric queries | Grafana Loki documentation

Hello b0b. Thanks for the hint! I lost connection to the site; as soon as I get access again I will modify the query and see what happens.

Best regards, Rogelio

Hi Bob:
I added the “topk” operator as suggested, but the output remains more or less the same.

All hosts still appear in the panel; I just noticed that the numeric results on the right-hand side of each bar changed.

Can you post a new screenshot of your graph? I just want to confirm whether topk actually works for you in terms of returning only the top X results.

In terms of sorting, I don’t actually know if LogQL supports sorting yet. But you can just let the Grafana Table panel take care of that for you by clicking on the column header:

[Screenshot: Table panel sorted by clicking a column header]

Hello Tony:
This is the output with the original query (without topk). I see host Q-001 then Q-002 then A-001 and so on.

This is the output with topk included. Now I see host Q-001, then A-001, then Q-012, and so on. The order of hosts has been altered, as well as the numbers on the right-hand side (in the original output Q-001 was 12; now it is 5):

In my panel I do not see a column header from which to order the results as you show in your image. Is that a feature I need to activate in the visualization options?

I am not entirely sure; what type of panel are you using? I am using the Table panel, and sorting seems to just come with it.

Hi Tony:
I am using a Bar Gauge. I will switch to the Table panel right now and see what happens. I will keep you posted.

A Table does not seem to help. In fact, I lose the representation based on hosts and results.

I’d encourage you to take a look at the Nginx dashboard on the Grafana playground. I copied some of my graphs from there, and there are many other outstanding dashboards there that you’ll no doubt get some ideas from.

That said, if you’d like, give me some of your actual log lines (doesn’t need to be real, just the structure). I’ll put it in my Loki cluster and see what I can do.

Hello Tony: Thanks for the offer, it is much appreciated.

I think the solution to my request, at least in Loki, is more complex than I expected.

I will record a set of logs during the weekend in my log collector and see if I can send them to you on Monday.

Hello Tony:
The GitHub issue “Support __range_s variable in the Loki query editor” (grafana/grafana #25561) gave me a hint to understand the nature of the problem.

See below: my original query uses “$__interval” as the range input. When I use it, the query returns more than 5 results:

If I change the time range to something constant, like 5 minutes, then the output is 5 results!
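
For reference, the fixed-window variant would look roughly like this (a sketch; only the range is changed with respect to the original query):

topk(5, sum by(hostname) (count_over_time({job="syslog"} [5m])))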

The aforementioned URL says that “… $__interval is supported, but this returns more results than the limit set in the topK, which is correct but counter-intuitive and not what is needed by the user. …”

I believe I will have to work on this for a while…

Best regards.
Rogelio

I do not understand the numbers I get in the output. None of these hosts are sending 100K logs in 5 minutes. At most one per second. Each bar should show something well below 300.

I do not know which operation is being carried out and shown on the right-hand side of each bar.

The “Calculation” parameter in the visualization options is influencing the number in the output. I was using “Total” and the output was those weird numbers. I changed to “Last” and now the numbers are more correlated with the pace of logs being sent by these devices…

Yes, it’s important to be mindful of the nature of the data you are trying to get.

In this case, you are trying to get an aggregated view of a count over a period of time, so you don’t actually need time series data. So you want to:

  1. Make sure that in the query options you set it to return only 1 data point.
  2. Change the query type to instant instead of range.
  3. Make sure to use $__auto as the interval in your query.

And if you have a table panel, use a transformation to hide the timestamp, and set the column that shows the number to a gauge cell type; then you’ll get a nice presentation like my screenshot above.
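
Putting those points together, a minimal sketch of the query (assuming the same job and hostname labels as earlier in this thread), run with the query type set to Instant:

topk(5, sum by(hostname) (count_over_time({job="syslog"} [$__auto])))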

I had a discussion with someone else on a similar topic, I have a reply there with more details that may be of help to you: Cannot distinct data in Grafana Loki - #5 by elabkevin

Thank you!
I will be working this out a bit more.

Best regards, Rogelio

Hi, do you know how to send JSON-formatted logs to Loki using Python?

Sorry, I forgot an example:

{"message": "this is a message", "New_key": {"key1": "value1", "key2": "value2"}}

I want to send the new field "New_key": {"key1": "value1", "key2": "value2"} so that Loki can use this JSON to create the field as a label.

Hello!
I regret I can't help with this.

In my case I use Promtail to get the logs from network devices (actually, I need to have Syslog-NG between the devices and Promtail), and it is within Promtail that I detect the patterns and create the labels.

This is a piece of Promtail's configuration that I managed to tune in order to add as many labels to my logs as needed for processing in Loki; please see the "regex" and "labels" stanzas:

scrape_configs:
  - job_name: syslog
    syslog:
      listen_address: 0.0.0.0:1514
      idle_timeout: 60s
      label_structured_data: false
      labels:
        job: "syslog"
    relabel_configs:
      - source_labels: ['__syslog_message_hostname']
        target_label: 'host'
    pipeline_stages:
      - match:
          selector: '{job="syslog"} |= "CISCO"'
          action: keep
          stages:
            - regex:
                expression: '.*hostname=(?P<hostname>[A-Z0-9_-]{16}).*devicetype=(?P<devicetype>[A-Z0-3]{2}).*country=(?P<country>[A-Z]{2}).*site=(?P<site>[A-Z_-]{9}).*vendor=(?P<vendor>[A-Z]+).*model=(?P<model>[A-Z0-9_-]+).*version=(?P<version>[a-zA-Z0-9().]+).*%(?P<loglevel>[A-Z0-9_-]+)-.*'
            - labels:
                hostname:
                devicetype:
                country:
                site:
                vendor:
                model:
                version:
                loglevel:
      - match:
          selector: '{job="syslog"} != "CISCO"'
          action: drop
According to the experts, the Grafana Agent from Grafana Labs can also tail/scrape logs and add labels, so you don't always need Promtail.
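
For reference, a rough sketch of how the same scrape config might be reused under the Agent's static-mode logs block (the Loki URL and positions path below are placeholders, not from any real setup):

logs:
  configs:
    - name: syslog
      positions:
        filename: /tmp/positions.yaml            # placeholder path
      clients:
        - url: http://loki:3100/loki/api/v1/push  # placeholder Loki endpoint
      scrape_configs:
        - job_name: syslog
          syslog:
            listen_address: 0.0.0.0:1514
            labels:
              job: "syslog"
          # the relabel_configs and pipeline_stages from the Promtail
          # example above can be carried over here unchanged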