I write a log to Loki every time the state of a systemd service changes. I would like to use these logs to build a Grafana dashboard that shows the current (latest) state of every service.
I can use a range query like this to retrieve all log entries within a certain time period:
Now the problem is filtering those log entries to only include the most recent line per unique (host_name,systemd_unit_name) label set.
I want to write something like this, but no such latest aggregation function exists in LogQL:
latest( {service_name="systemd-opentelemetry-monitor"}
| line_format `{{.host_name}}/{{.systemd_unit_name}}: {{.systemd_unit_active_state}}`
) by (host_name,systemd_unit_name)
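To pin down what that hypothetical latest(...) by (...) would have to compute, here is a minimal Python sketch of the reduction done client-side. It is not a Loki or Grafana feature. The entry shape (timestamp_ns, labels, line) and the function name latest_per_unit are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Tuple

# Assumed entry shape (not a Loki API): (timestamp_ns, labels, log line)
Entry = Tuple[int, Dict[str, str], str]


def latest_per_unit(entries: Iterable[Entry]) -> Dict[Tuple[str, str], str]:
    """Keep only the most recent active state per (host_name, systemd_unit_name)."""
    latest: Dict[Tuple[str, str], Tuple[int, str]] = {}
    for ts, labels, _line in entries:
        key = (labels["host_name"], labels["systemd_unit_name"])
        # Replace the stored state when this entry is at least as new.
        if key not in latest or ts >= latest[key][0]:
            latest[key] = (ts, labels["systemd_unit_active_state"])
    return {key: state for key, (_ts, state) in latest.items()}
```

The input does not need to be sorted, because each entry is compared against the newest timestamp seen so far for its key.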
I experimented with Grafana data transformations but couldn't find a way to do this there either. I also tried last_over_time, but it would need to unwrap the systemd_unit_active_state field, and unwrap only works on numeric values, not strings. Perhaps I could use a label_format expression to convert the status to a numeric value and then map it back again, but that is rather verbose and clunky.
The only "latest" function I can find is last_over_time, which only works in metric queries. I would recommend changing your log format (if possible) to include both a status string and a numeric status ID (you'd have to make the IDs up and keep them consistent in your code).
For example, say running = 0 and failed = 1 (you can add other statuses too, and even a catch-all such as unknown = 99). You'd then change your log to something like this (I generally recommend JSON because it's standard and structured, but that is up to you):
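As a rough sketch of that suggestion, the snippet below builds a JSON log line carrying both the status string and a numeric ID. The field names status and status_code, the helper names, and the sample host and unit are assumptions for illustration. Only the running = 0, failed = 1, unknown = 99 mapping comes from the example above.

```python
import json

# Mapping taken from the example above; extend it with other states as needed.
STATUS_CODES = {"running": 0, "failed": 1}
UNKNOWN_CODE = 99  # catch-all for any state not listed in STATUS_CODES
CODE_TO_STATUS = {code: name for name, code in STATUS_CODES.items()}


def build_log_line(host_name: str, unit_name: str, status: str) -> str:
    """Return one JSON log line with both the status string and its numeric ID."""
    return json.dumps({
        "host_name": host_name,
        "systemd_unit_name": unit_name,
        "status": status,                                        # assumed field name
        "status_code": STATUS_CODES.get(status, UNKNOWN_CODE),   # assumed field name
    })


def status_from_code(code: float) -> str:
    """Map a numeric ID (for example a query result value) back to its string."""
    return CODE_TO_STATUS.get(int(code), "unknown")


print(build_log_line("web-01", "nginx.service", "failed"))
```

With a numeric field like this in each line, a metric query along the lines of last_over_time({service_name="systemd-opentelemetry-monitor"} | json | unwrap status_code [1h]) by (host_name, systemd_unit_name) becomes possible, and Grafana value mappings can turn the number back into text. The [1h] range here is only a placeholder.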
I eventually found that this was a good fit for the state timeline visualization in Grafana, which nicely sidesteps the need to get the latest values (I was able to make it work with a few transformations).