Grafana Alert expression without functions

The following is an alert I wanted to use on Loki:

      - alert: myservice_error
        expr: |
          {systemd_unit="myservice.service", instance=~"server-x-.*"} |~ "error|Error" !~ "Error while fetching receipt | Error sending milestone"
        labels:
          alerting: myservice_error
          severity: critical
          channel: mychannel
        annotations:
          summary: "myservice error"
          description: "myservice has error! Reported by instance {{ $labels.instance }} of job {{ $labels.job }}. Log: {{ $value }}"

When I query exactly that expr in Loki, it returns results, but the alert never fires. I should mention that alerts are sent if I use functions like sum or count_over_time in that expression.
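For example, a variant like this (the same selector wrapped in count_over_time; the 1m interval is just what I tried) does trigger alerts:

 sum(count_over_time({systemd_unit="myservice.service", instance=~"server-x-.*"} |~ "error|Error" [1m])) > 0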

So, is it impossible to create alerts directly from logs? Or maybe the expression returns more than one result, and Loki's alerting doesn't handle that?

Thanks in advance!

You need some sort of condition. What would the condition be for creating an alert based on logs?

I would like to receive an alert right after a log appears in that stream, without any condition. The goal is to include the log (or JSON log) in the alert message, like Log: <logs here>, so I used {{ $value }}, which is the value of that expr. If I add a condition to the expr, {{ $value }} becomes the numeric value instead of the log itself.
Any ideas?

If you expect an alert to fire once the log is present, then your condition is count of said log > 0. I don't think you can include the log line as part of the alert, but you can work around it by turning the entire log line into a label and then aggregating on it.

For example, let’s say your alert expression is like this:

 sum (count_over_time({SOME_LOG} [1m])) > 0

You can do something like:

 sum by (logline, <OTHER_LABELS>) (count_over_time({SOME_LOG} | label_format logline="{{ __line__ }}" [1m])) > 0

Creative idea; it fixed the issue I had.
The new alert pattern is like:

      - alert: myservice_error
        expr: |
          sum by (logline) (count_over_time( {systemd_unit="myservice.service", instance=~"server-x-.*"} |~ "error|Error" !~ "Error while fetching receipt | Error sending milestone" | label_format logline="{{ __line__ }}"  [1m])) > 0
        labels:
          alerting: myservice_error
          severity: critical
          channel: mychannel
        annotations:
          summary: "myservice error"
          description: "myservice has error! Reported by instance {{ $labels.instance }} of job {{ $labels.job }}. Log: {{ $labels.logline }}"

Thanks!