Hi there!
I’m unable to get decent performance out of Loki 2.7.1 (binary install) and need some help troubleshooting where the problem lies. I don’t know whether the slow queries I’m seeing are caused by the Loki config, the hardware, or the query itself.
Here is an example query I run with logcli --since=168h to investigate; it shows poor stats (with the cache disabled for testing):
count by(isApp) (count_over_time({env="production", artifact="MY_SERVER_NAME", isApp=~"true|false"} [1h]))
And here are the results stats:
Ingester.TotalReached 16
Ingester.TotalChunksMatched 8
Ingester.TotalBatches 15
Ingester.TotalLinesSent 95
Ingester.TotalChunksRef 83
Ingester.TotalChunksDownloaded 83
Ingester.ChunksDownloadTime 4.579255398s
Ingester.HeadChunkBytes 77 B
Ingester.HeadChunkLines 7
Ingester.DecompressedBytes 2.9 kB
Ingester.DecompressedLines 92
Ingester.CompressedBytes 4.0 kB
Ingester.TotalDuplicates 2
Querier.TotalChunksRef 8856
Querier.TotalChunksDownloaded 8856
Querier.ChunksDownloadTime 37.078656774s
Querier.HeadChunkBytes 0 B
Querier.HeadChunkLines 0
Querier.DecompressedBytes 311 kB
Querier.DecompressedLines 9837
Querier.CompressedBytes 434 kB
Querier.TotalDuplicates 0
Summary.BytesProcessedPerSecond 86 kB
Summary.LinesProcessedPerSecond 2710
Summary.TotalBytesProcessed 314 kB
Summary.TotalLinesProcessed 9936
Summary.ExecTime 3.665225301s
Summary.QueueTime 0s
First, don’t you think these stats are poor? 3.6 seconds to process fewer than 10K lines (311 kB) is nowhere near the figures I see online, even in posts where slow queries are the complaint.
I think I’m missing something.
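To put the stats above in perspective, here is a small Python sketch that derives a few ratios from them. It only uses figures copied from the logcli output; the assumptions are that logcli's "kB" means 1000 bytes and that the printed values are rounded, so the results are approximate.

```python
# Figures copied from the logcli statistics posted above.
# Assumption: "kB" means 1000 bytes, and logcli rounds the values it prints.
exec_time_s = 3.665225301          # Summary.ExecTime
total_lines = 9936                 # Summary.TotalLinesProcessed
total_bytes = 314_000              # Summary.TotalBytesProcessed (314 kB)
querier_chunks = 8856              # Querier.TotalChunksDownloaded
querier_lines = 9837               # Querier.DecompressedLines
querier_compressed = 434_000       # Querier.CompressedBytes (434 kB)

# Overall throughput, which should match the Summary.*PerSecond lines.
lines_per_second = total_lines / exec_time_s
bytes_per_second = total_bytes / exec_time_s

# How much data each chunk fetched from the store actually holds.
lines_per_chunk = querier_lines / querier_chunks
compressed_bytes_per_chunk = querier_compressed / querier_chunks

print(f"lines/s:                {lines_per_second:.0f}")
print(f"kB/s:                   {bytes_per_second / 1000:.1f}")
print(f"lines per chunk:        {lines_per_chunk:.2f}")
print(f"compressed bytes/chunk: {compressed_bytes_per_chunk:.1f}")
```

The throughput values agree with the Summary lines. The per-chunk ratios come out to roughly 1.1 lines and about 49 compressed bytes per chunk on the querier side, which may be worth keeping in mind when reading the chunk counts.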
Here is my Loki config:
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
  http_server_read_timeout: 3m
  grpc_server_max_recv_msg_size: 15194304
  grpc_server_max_send_msg_size: 15194304
common:
  path_prefix: /tmp/loki
  storage:
    filesystem: null
    gcs:
      bucket_name: MY_BUCKET_NAME
      service_account: MY_BUCKET_SERVICE_ACCOUNT
  replication_factor: 1
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
query_range:
  align_queries_with_step: true
  max_retries: 5
  parallelise_shardable_queries: true
  cache_results: true
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 512
frontend_worker:
  frontend_address: localhost:9096
  parallelism: 4
  grpc_client_config:
    max_send_msg_size: 1.048576e+08
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: gcs
      schema: v11
      index:
        prefix: index_
        period: 24h
storage_config:
  boltdb_shipper:
    active_index_directory: /data/loki/index
    shared_store: gcs
    cache_location: /data/loki/index_cache
As you can see, my logs are stored in a GCS bucket, and I see no slowness on the write side.
Loki (and Grafana) runs on a small machine. Do you think that could be the root cause? CPU usage looks fine during the slow queries. The machine:
- Ubuntu 20.04 x86/64
- 50GB SSD
- GCP Instance type e2-standard-2 (2 vCPU, 8GB Memory)