Hello, everyone!
Please help me understand some strange behaviour in Loki.
Describe the bug
A small query of the form <regexp + metadata + regexp> in Grafana Explore overloads RAM and CPU await:
{application="app"} !~ "(?i)info" | filename = "E:/app/app.log" |~ "(?i)err"
RAM:
CPU:
However, it turned out that the constructions below:
- {application="app"} !~ "(?i)info"
- {application="app"} !~ "(?i)info" |~ "(?i)err"
- {application="app"} !~ "(?i)info" |~ "(?i)err" | filename = "E:/app/app.log"
- {application="app"} | filename = "E:/app/app.log" !~ "(?i)info" |~ "(?i)err"
work properly, without any RAM/CPU overload.
Only the construction <regexp + metadata + regexp> triggers this problem.
In addition, the overload does not stop in the scenario below:
The query is cancelled in Grafana (by timeout or manually), then the Loki container is OOM-killed; it restarts and gets overloaded again.
Only a manual restart of the Loki container helps.
To Reproduce
Steps to reproduce the behavior:
- Started Loki
- Started Grafana
- Started S3 - MinIO
- Query: {application="app"} !~ "(?i)info" | filename = "E:/app/app.log" |~ "(?i)err"
Expected behavior
Overload by RAM/CPU await
Environment:
- Monolithic Loki deployment (in cluster of 3 nodes: 8 CPU/12G RAM, 4G Swap)
- Running in containers (each Loki container is limited to 6 CPU/10G RAM, 4G Swap)
- Logs are stored in MinIO S3
You'll usually want to filter the logs first (with both labels and structured metadata) before you do any processing, so that you don't process data you are not interested in anyway.
In your example, {application="app"} | filename = "E:/app/app.log" !~ "(?i)info" |~ "(?i)err" is probably best in my opinion, assuming filename is structured metadata.
Yeah, I agree about the query order, but this query was generated automatically in Grafana Explore (builder mode) by a less experienced engineer who just wanted to see part of the logs. As a result, prod seriously froze.
Maybe you can help me to understand:
- Is that really a bug? As I said before, only one particular sequence breaks Loki.
- Is it possible to block queries like this (with this sequence, I mean) on the Loki (or maybe Grafana) side? I just want to avoid this.
P.S. I am still a little disappointed that a query like {application="app"} !~ "(?i)info" |~ "(?i)err" | filename = "E:/app/app.log" does not hurt performance, but {application="app"} !~ "(?i)info" | filename = "E:/app/app.log" |~ "(?i)err" breaks Loki.
In my opinion, the first one should match much more data by regexp and, as a result, consume more CPU/RAM.
I don't know why, but the screenshots in the first post are not loading now.
RAM:
CPU:
- I don’t normally use builder, so I can’t comment on that.
- You can block queries; see "Blocking Queries" in the Grafana Loki documentation.
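For the blocking route, here is a minimal sketch of what a per-tenant override could look like. Two things are assumptions on my part and not from this thread: the tenant ID `fake` (what Loki uses when `auth_enabled: false`), and the regex itself, which is hypothetical and only tries to match a line filter, then a `filename` label filter, then another line filter. Test it against your real queries before relying on it.

```yaml
# Runtime overrides file (assumed to be loaded via runtime_config)
overrides:
  "fake":   # assumption: single-tenant setup with auth_enabled: false
    blocked_queries:
      # hypothetical pattern: <line filter> | filename = ... <line filter>
      - pattern: '.*(\|~|!~).*\|\s*filename\s*=.*(\|~|!~).*'
        regex: true
```

This matches on query text only, so it can be bypassed by rewriting the query, and it may also catch harmless queries (for example a `!~` matcher inside the stream selector).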
A container using more memory isn't necessarily a bug; it's only a bug if you can prove that it's not supposed to. If you are using monolithic mode you should expect it to run out of memory from time to time. One thing you can try is to increase the timeout and split the query (so it doesn't use as much memory). It might take a bit longer to run on a monolithic Loki container, but it might be less likely to run into memory issues.
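A sketch of the settings that suggestion maps to, as far as I know. The option names are from Loki's `limits_config`; the values are illustrative assumptions, not tuned recommendations.

```yaml
limits_config:
  query_timeout: 3m                 # give a slow query more time before it is cancelled
  split_queries_by_interval: 15m    # split a query into smaller time slices
  max_query_parallelism: 4          # run fewer slices at once on a single node
```

Other timeouts (the Grafana data source timeout, Loki's HTTP server timeouts) may need raising as well for a longer `query_timeout` to take effect.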