LogQL query merging multiple result labels into one?

I am trying to write a Loki LogQL query that returns the log lines matching common error strings and extracts the offending log line, then groups the results by cluster and app, so that Grafana would not spam me with many alerts for each matching log line.

My current query returns one result for each matched log line:
sum(count_over_time({app =~ "(app.*)",cluster="contoso1"} |~ "(?i)Exception|Error" | pattern "<logSnippet>" [1m])) by (container,cluster,logSnippet)

The results from the query are like this, and I will get alerted by Grafana 3 times:

Cluster=contoso1, app=app1, logSnippet="Exception found:"
Cluster=contoso1, app=app1, logSnippet="Error 503"
Cluster=contoso1, app=app1, logSnippet="Server responded with an error."

I would like to group them and combine the individual logSnippets into one label that I can use in my Grafana alert template.

This is what I would like to achieve:

Cluster=contoso1, app=app1, logSnippet="Exception found: \n Error 503 \n Server responded with an error."

If I attempt to put my query inside of another sum() by (cluster, app) then I just lose the logSnippet labels…

Is there any solution to this?
Thanks!

You are essentially asking to merge multiple lines of logs into one, and I don’t think this is possible.

What you can do, however, and this depends on the alerting platform you use, is to generate alerts for each of them with the same ID. We use Opsgenie, and if you generate multiple alerts with the same alias (ID) but with a different message it’s “updated”, so you’d get a history in the alerting platform on each message when it’s updated.

If this doesn’t work for you, I’d say probably break your alerts into types, maybe one type is error 503, one type is exception, and perhaps a generic error when the log contains “error”.


Thanks for the input.
I was not able to find any way to do this in the loki query as well. I ended up using Grafana grouping to get alert group only for a specific container/pod. Then used Grafana templating to iterate and print all of those logSnippets in a for loop and then add remaining info at the end. So it looks like a single alert notification with all the logs and stack trace that caused the problem.
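The approach in the last reply, grouping the alert instances by cluster and app in Grafana and then looping over them in the notification template, can be shown end to end with Go's text/template, which is what Grafana notification templates are based on. This is an added example, not code from the thread or from Grafana: the Alert and Group types are reduced stand-ins for the template data Grafana passes (assumed to expose .GroupLabels and .Alerts with .Labels), GroupByClusterApp imitates the "group by (cluster, app)" step, and the four alert instances are constructed from the labels shown in the question plus one extra app to show that groups stay separate.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Alert is a reduced stand-in for one alert instance as a notification
// template sees it (assumption: the real data also carries Annotations,
// StartsAt, etc., which are not needed here).
type Alert struct {
	Labels map[string]string
}

// Group is the minimal shape passed to the template: the labels the
// alerts were grouped by plus every instance that fell into the group.
type Group struct {
	GroupLabels map[string]string
	Alerts      []Alert
}

// notifyTemplate follows the structure of a Grafana notification template:
// one header per group, then every logSnippet of the group on its own line,
// so a single notification carries all matching log lines.
const notifyTemplate = `{{ define "loki_errors" -}}
Errors in cluster={{ .GroupLabels.cluster }} app={{ .GroupLabels.app }} ({{ len .Alerts }} matching lines)
{{ range .Alerts -}}
- {{ .Labels.logSnippet }}
{{ end -}}
{{ end }}`

// SampleAlerts returns constructed alert instances: the three from the
// question (contoso1/app1) plus one for a second app on the same cluster.
func SampleAlerts() []Alert {
	mk := func(cluster, app, snippet string) Alert {
		return Alert{Labels: map[string]string{"cluster": cluster, "app": app, "logSnippet": snippet}}
	}
	return []Alert{
		mk("contoso1", "app1", "Exception found:"),
		mk("contoso1", "app1", "Error 503"),
		mk("contoso1", "app1", "Server responded with an error."),
		mk("contoso1", "app2", "Error 503"),
	}
}

// GroupByClusterApp imitates Grafana's "group by" for alert instances:
// one Group per (cluster, app) pair, preserving first-seen order.
func GroupByClusterApp(alerts []Alert) []Group {
	var groups []Group
	index := map[string]int{}
	for _, a := range alerts {
		key := a.Labels["cluster"] + "\x00" + a.Labels["app"]
		i, ok := index[key]
		if !ok {
			i = len(groups)
			index[key] = i
			groups = append(groups, Group{GroupLabels: map[string]string{
				"cluster": a.Labels["cluster"],
				"app":     a.Labels["app"],
			}})
		}
		groups[i].Alerts = append(groups[i].Alerts, a)
	}
	return groups
}

// Render executes the notification template for one group and returns
// the text that would be sent as a single notification.
func Render(g Group) (string, error) {
	tmpl, err := template.New("notify").Parse(notifyTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.ExecuteTemplate(&sb, "loki_errors", g); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, g := range GroupByClusterApp(SampleAlerts()) {
		out, err := Render(g)
		if err != nil {
			fmt.Println("render failed:", err)
			return
		}
		fmt.Print(out, "\n")
	}
}
```

Running it prints one block per (cluster, app) pair, with all three snippets for app1 listed under a single header, which is the "single alert notification with all the logs" outcome described above; the `-}}` trims in the template are what keep the range loop from emitting blank lines between snippets.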