Hi,
We’re experiencing very slow query performance with Loki 2.4.1.
We are running in distributed mode on Kubernetes with 6 queriers and 2 query frontends.
Our stats look like this:
Ingester.TotalReached 16
Ingester.TotalChunksMatched 0
Ingester.TotalBatches 0
Ingester.TotalLinesSent 0
Ingester.HeadChunkBytes 0 B
Ingester.HeadChunkLines 0
Ingester.DecompressedBytes 0 B
Ingester.DecompressedLines 0
Ingester.CompressedBytes 0 B
Ingester.TotalDuplicates 0
Store.TotalChunksRef 12227
Store.TotalChunksDownloaded 50
Store.ChunksDownloadTime 2.862452622s
Store.HeadChunkBytes 0 B
Store.HeadChunkLines 0
Store.DecompressedBytes 120 MB
Store.DecompressedLines 64611
Store.CompressedBytes 21 MB
Store.TotalDuplicates 0
Summary.BytesProcessedPerSecond 8.8 MB
Summary.LinesProcessedPerSecond 4711
Summary.TotalBytesProcessed 120 MB
Summary.TotalLinesProcessed 64611
Summary.ExecTime 13.714739257s
I’ve seen blog posts reporting throughput of 4 GB/s, and we are seeing 8.8 MB/s, so there is definitely an issue here.
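As a quick sanity check, the summary numbers can be recomputed from the stats above. This is only a rough sketch: the byte size is the rounded value Loki printed, and I am assuming the 4 GB/s from the blog posts means decimal units (4000 MB/s).

```python
# Figures copied from the query stats above; the byte size is the rounded
# value Loki printed, so the results are approximate.
total_bytes_mb = 120.0        # Summary.TotalBytesProcessed
total_lines = 64611           # Summary.TotalLinesProcessed
exec_time_s = 13.714739257    # Summary.ExecTime

mb_per_s = total_bytes_mb / exec_time_s
lines_per_s = total_lines / exec_time_s

# Assumption: the blog figure of 4 GB/s is in decimal units (4000 MB/s).
blog_mb_per_s = 4000.0
gap = blog_mb_per_s / mb_per_s

print(f"{mb_per_s:.2f} MB/s, {lines_per_s:.0f} lines/s")
print(f"roughly {gap:.0f}x below the blog figure")
```

The recomputed rates line up with Summary.BytesProcessedPerSecond and Summary.LinesProcessedPerSecond, so the stats are internally consistent and the gap to the blog numbers is several hundred times.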
CPU and memory both look fine.
Our configuration is below (URLs removed, of course):
auth_enabled: false
chunk_store_config:
  max_look_back_period: 0s
server:
  http_listen_port: 3100
  grpc_server_max_recv_msg_size: 64000000
distributor:
  ring:
    kvstore:
      store: memberlist
memberlist:
  join_members:
    - loki-distributed-memberlist
ingester:
  lifecycler:
    ring:
      kvstore:
        store: memberlist
      replication_factor: 1
  chunk_idle_period: 3m
  chunk_block_size: 262144
  chunk_encoding: snappy
  chunk_retain_period: 1m
  max_transfer_retries: 0
  wal:
    dir: /var/loki/wal
schema_config:
  configs:
    - from: 2022-01-05
      store: aws
      object_store: s3
      schema: v11
      index:
        prefix: prefix_index_
        period: 24h
storage_config:
  aws:
    s3: s3://XXX
    dynamodb:
      dynamodb_url: dynamodb://ZZZ
limits_config:
  unordered_writes: true
  ingestion_rate_mb: 2048
  ingestion_burst_size_mb: 4096
  per_stream_rate_limit: 30MB
  per_stream_rate_limit_burst: 150MB
  max_streams_per_user: 0
  max_global_streams_per_user: 0
table_manager:
  retention_deletes_enabled: false
  retention_period: 0s
  index_tables_provisioning:
    provisioned_read_throughput: 10000
query_range:
  align_queries_with_step: true
  max_retries: 5
  split_queries_by_interval: 15m
  cache_results: true
  results_cache:
    cache:
      enable_fifocache: true
      fifocache:
        max_size_items: 1024
        validity: 24h
frontend_worker:
  frontend_address: YYY:9095
frontend:
  log_queries_longer_than: 5s
  compress_responses: true
  tail_proxy_url: http://XXX:3100
I’m not sure how to approach this or where to start… Chunk retrieval does not seem to be the issue (Store.ChunksDownloadTime is about 2.9 s of the 13.7 s Summary.ExecTime), but something in the processing is extremely slow.
Any help is much appreciated.
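To put a rough number on that, here is how the timings split. This assumes Store.ChunksDownloadTime and Summary.ExecTime are directly comparable, which may not hold if the download time is summed across parallel sub-queries, so treat it as an estimate only.

```python
# Timings and chunk counts copied from the query stats above.
download_s = 2.862452622     # Store.ChunksDownloadTime
exec_s = 13.714739257        # Summary.ExecTime
chunks_ref = 12227           # Store.TotalChunksRef
chunks_downloaded = 50       # Store.TotalChunksDownloaded

# Assumption: both timings measure the same wall-clock span.
download_share = download_s / exec_s
remaining_s = exec_s - download_s
downloaded_share = chunks_downloaded / chunks_ref

print(f"download: {download_share:.1%} of exec time")
print(f"time not spent downloading: {remaining_s:.2f} s")
print(f"chunks downloaded vs referenced: {downloaded_share:.2%}")
```

Under that assumption, downloading accounts for about a fifth of the execution time, and nearly 11 s is spent elsewhere. Only 50 of the 12227 referenced chunks were actually downloaded.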
Thanks!