Latency when visualizing data for multiple days with Grafana

Hello,

We’ve deployed a beefy test server (100+ cores, 124GB RAM, 1.8T of SSD storage) running Loki, MinIO and Grafana to evaluate Loki’s performance.
For now, we have only installed Loki in monolithic mode, from the RPM, alongside Grafana.

My issue is that data visualization in Grafana is quite slow. Looking at the rate of all logs over 7 days can take 30s to load in Grafana’s Explore view.

So I tried to test Loki’s performance with logcli, and got these results:

[bla]# time /usr/local/bin/logcli query --stats --since=168h 'rate({filename="/var/log/squid/access.log"}[1m])' > file.txt
http://<ip>:3100/loki/api/v1/query_range?direction=BACKWARD&end=1668017851935752943&limit=30&query=rate%28%7Bfilename%3D%22%2Fvar%2Flog%2Fsquid%2Faccess.log%22%7D%5B1m%5D%29&start=1667413051935752943
Ingester.TotalReached 		 80
Ingester.TotalChunksMatched 	 4
Ingester.TotalBatches 		 13067
Ingester.TotalLinesSent 	 6686238
Ingester.TotalChunksRef 	 162
Ingester.TotalChunksDownloaded 	 162
Ingester.ChunksDownloadTime 	 591.392749ms
Ingester.HeadChunkBytes 	 430 kB
Ingester.HeadChunkLines 	 2092
Ingester.DecompressedBytes 	 1.5 GB
Ingester.DecompressedLines 	 6697519
Ingester.CompressedBytes 	 201 MB
Ingester.TotalDuplicates 	 0
Querier.TotalChunksRef 		 4340
Querier.TotalChunksDownloaded 	 4340
Querier.ChunksDownloadTime 	 11.741086355s
Querier.HeadChunkBytes 		 0 B
Querier.HeadChunkLines 		 0
Querier.DecompressedBytes 	 57 GB
Querier.DecompressedLines 	 252310768
Querier.CompressedBytes 	 4.7 GB
Querier.TotalDuplicates 	 0
Summary.BytesProcessedPerSecond  4.1 GB
Summary.LinesProcessedPerSecond  18190820
Summary.TotalBytesProcessed 	 58 GB
Summary.TotalLinesProcessed 	 259010379
Summary.ExecTime 		 14.238520823s
Summary.QueueTime 		 5m10.55325166s

So apparently 4.1 GB processed per second is not that bad, and I’m not sure what to look at next to improve this.
Grafana takes about twice as long for the same request (~30s).
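
As a sanity check on those numbers, here is a small Python sketch that recomputes the summary figures from the stats above. The variable and function names are my own, and I’m assuming logcli reports decimal units (1 GB = 10^9 bytes), so treat the results as approximate.

```python
# Figures copied from the logcli --stats output above.
# Assumption: sizes are in decimal units (1 GB = 1e9 bytes).
total_bytes = 58e9            # Summary.TotalBytesProcessed
total_lines = 259_010_379     # Summary.TotalLinesProcessed
exec_seconds = 14.238520823   # Summary.ExecTime
compressed_bytes = 4.7e9      # Querier.CompressedBytes
decompressed_bytes = 57e9     # Querier.DecompressedBytes
days = 7                      # --since=168h


def summarize():
    """Recompute throughput, compression ratio and daily volume."""
    return {
        "gb_per_second": total_bytes / exec_seconds / 1e9,
        "lines_per_second": total_lines / exec_seconds,
        "compression_ratio": decompressed_bytes / compressed_bytes,
        "gb_per_day": total_bytes / days / 1e9,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.1f}")
```

The throughput matches what logcli reports, and the per-day volume comes out a bit above 8 GB, which is in line with our ingestion of less than 10GB a day. That suggests the query is decompressing more or less everything stored for that file over the 7 days.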

My ideas:

  • Is Loki not meant for this kind of workload?
  • Is Grafana not using Loki at its full potential, limiting the query somehow?
  • Are the chunks not optimized? Our label count is very low (too low?): we only have hostname and filename as labels, and we log less than 10GB a day.

Thank you for your help, and for any links or ideas I could explore to dig further into Loki/Grafana performance and optimization!