Hello, dears!
Please help me understand some strange behaviour in Loki.
Describe the bug
A small query with the construction <regexp + metadata + regexp> in Grafana Explore overloads RAM and CPU await:
{application="app"} !~ "(?i)info" | filename = "E:/app/app.log" |~ "(?i)err"
Only the construction <regexp + metadata + regexp> triggers this problem.
In addition, the scenario below did not stop the overload:
Cancelling the query in Grafana (by timeout or manually), followed by an OOM kill of the Loki container. After restarting, the container gets overloaded again.
You’ll usually want to filter the logs first (with both labels and structured metadata), before you do any processing, so that you don’t process data that you are already not interested in.
In your example, {application="app"} | filename = "E:/app/app.log" !~ "(?i)info" |~ "(?i)err" is probably best in my opinion, assuming filename is structured metadata.
Yeah, I agree about the order of the query stages, but this query was generated automatically in Grafana Explore (builder mode) by an inexperienced engineer who just wanted to see part of the logs. As a result, prod froze badly.
Maybe you can help me to understand:
Is this really a bug? As I said before, only this particular stage order breaks Loki.
Is it possible to block queries like this (with this stage order, I mean) on the Loki side (or maybe in Grafana)? I just want to avoid this.
P.S. I am still a little disappointed that a query like {application="app"} !~ "(?i)info" |~ "(?i)err" | filename = "E:/app/app.log" does not hurt performance, but {application="app"} !~ "(?i)info" | filename = "E:/app/app.log" |~ "(?i)err" breaks Loki.
In my opinion, the first one should match much more data with its regexps and, as a result, consume more CPU/RAM.
A container using more memory isn’t necessarily a bug; it’s only a bug if you can show that it isn’t supposed to. If you are running in monolithic mode, you should expect it to run out of memory from time to time. One thing you can try is to increase the timeout and split the query (so it doesn’t use as much memory). It might take a bit longer to run on a monolithic Loki container, but there may be less chance of running into memory issues.
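To illustrate what splitting the query could look like on the client side, here is a minimal Python sketch that cuts one large time range into smaller consecutive windows. The function name split_time_range and the 15-minute window size are hypothetical choices for the example, not anything Loki or Grafana provides. Each window could then be used as the start/end of a separate, smaller query; the sketch itself only computes the windows and does not talk to Loki.

```python
from datetime import datetime, timedelta, timezone


def split_time_range(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering [start, end).

    Hypothetical helper: the last window is shortened so it never passes `end`.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


if __name__ == "__main__":
    # Assumed example values: a one-hour range split into 15-minute windows.
    range_start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
    range_end = range_start + timedelta(hours=1)
    for window_start, window_end in split_time_range(
        range_start, range_end, timedelta(minutes=15)
    ):
        # Each pair would be the time range of one smaller query.
        print(window_start.isoformat(), "->", window_end.isoformat())
```

Running the windows one after another rather than in parallel keeps only one small query in flight at a time, which is the point of splitting on a memory-constrained monolithic instance.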