Hi,
I’m displaying in Grafana some logs that are stored in Loki. When the time range is large, a month for example, it takes approximately 10 seconds to display the histogram.
I’ve checked Loki’s logs and I’ve seen that the query to get the histogram’s data makes 721 subqueries:
level=info ts=2022-10-26T16:25:16.873097279Z caller=metrics.go:133 component=frontend org_id=fake latency=slow query=“sum by (level) (count_over_time({XXXX="YYYY"}[1h]))” query_type=metric range_type=range length=720h0m0s step=1h0m0s duration=10.263488615s status=200 limit=1000 returned_lines=0 throughput=450kB total_bytes=4.6MB total_entries=1 queue_time=4m3.467393379s subqueries=721 source=logvolhist
So it might make sense that it takes so long, if the query is split into that many subqueries.
I’ve done the same query with LogCli:
./logcli-linux-amd64 query 'sum by (level) (count_over_time({XXXX="YYYY"}[1h]))' --since=720h --stats
In this case the query is faster, about 3 seconds instead of 10 seconds:
Ingester.TotalReached 4016
Ingester.TotalChunksMatched 0
Ingester.TotalBatches 2
Ingester.TotalLinesSent 2
Ingester.TotalChunksRef 2
Ingester.TotalChunksDownloaded 2
Ingester.ChunksDownloadTime 833.59µs
Ingester.HeadChunkBytes 0 B
Ingester.HeadChunkLines 0
Ingester.DecompressedBytes 13 kB
Ingester.DecompressedLines 2
Ingester.CompressedBytes 2.7 kB
Ingester.TotalDuplicates 0
Querier.TotalChunksRef 0
Querier.TotalChunksDownloaded 0
Querier.ChunksDownloadTime 0s
Querier.HeadChunkBytes 0 B
Querier.HeadChunkLines 0
Querier.DecompressedBytes 0 B
Querier.DecompressedLines 0
Querier.CompressedBytes 0 B
Querier.TotalDuplicates 0
Summary.BytesProcessedPerSecond 3.8 kB
Summary.LinesProcessedPerSecond 0
Summary.TotalBytesProcessed 13 kB
Summary.TotalLinesProcessed 2
Summary.ExecTime 3.346525198s
Summary.QueueTime 1m19.192162833s
One difference that I’ve noticed is that with logcli the step parameter’s value is higher:
level=info ts=2022-10-26T16:39:24.608248499Z caller=metrics.go:133 component=frontend org_id=fake latency=fast query=“sum by (level) (count_over_time({aplicacion="YYYY"}[1h]))” query_type=metric range_type=range length=720h0m0s step=2h52m48s duration=3.346525198s status=200 limit=30 returned_lines=0 throughput=3.8kB total_bytes=13kB total_entries=1 queue_time=1m19.192162833s subqueries=251
So it seems that with a larger step, fewer subqueries are made and the execution time improves.
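As a sanity check, in both log lines the subqueries value matches length / step + 1 (720h / 1h + 1 = 721, and 720h / 2h52m48s + 1 = 251). The small sketch below only reproduces that arithmetic; the function name is my own, and I’m assuming the relation holds in general, which I haven’t confirmed against Loki’s code.

```python
from datetime import timedelta


def expected_points(length: timedelta, step: timedelta) -> int:
    """Number of evaluation points in a range query, assuming
    one point per step plus one for the start of the range."""
    return length // step + 1


# Values taken from the two frontend log lines.
length = timedelta(hours=720)
grafana_step = timedelta(hours=1)
logcli_step = timedelta(hours=2, minutes=52, seconds=48)

print(expected_points(length, grafana_step))  # 721, same as subqueries=721
print(expected_points(length, logcli_step))   # 251, same as subqueries=251
```

If that relation is right, roughly tripling the step cuts the number of subqueries to about a third, which is in line with the drop from 10 seconds to 3 seconds.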
Bearing in mind that only 30MB of logs are stored in Loki, I wonder whether there is a way in Grafana to speed up the query for the histogram. Maybe by configuring a different value for the step? I haven’t found any way to do that, though. I’ve even considered disabling the histogram in Grafana, but that option is not available in the Grafana version I’m using (9.1.7).