Loki/loki-read /httpgrpc.HTTP/Handle is experiencing Xs 99th percentile latency

Hi community, I can’t find the root cause of why this alert is triggered, or how to improve the situation. I started facing this alert after I increased writes. I scaled write replicas to 8, and I see that the load is the same as before for writes; for reads nothing changed either, except this metric. I can’t find any documentation or other issues that mention such a problem.

I also see increased metrics on the ingester read path, while I can’t see any real load on loki-read:

I think the ingester read metric reflects the querier querying the ingester, but I could be wrong.

I think some more information would be good. How much log volume are you writing? What’s the estimated size and volume in terms of lines of logs? Are you using labels heavily? What resource allocation is configured for your ingesters?

Also share your Loki configuration please.

Hi @tonyswumac, and thank you for the response :slight_smile:, I hope the following info helps:

  1. sum(rate(loki_distributor_bytes_received_total[5m])) / 1024 / 1024 = avg 1MB/s, up to 1.5MB/s during the day; max is 2MB/s and the lowest is 0.5MB/s
  2. sum(rate(loki_distributor_lines_received_total[5m])) = avg 2.05k lines/s, up to 3k during the day; max is 4.6k and the lowest is 1.2k
  3. my labels are mostly the default Promtail ones that come with k8s, plus a few additional ones like: environment, cluster, pod, pod_ip, pod_template_hash, image. Another, relatively small, part of the logs comes from journald or plain Docker and has even fewer labels. I don’t parse JSON, for example, to create labels persistently.
  4. I have 3 caches: loki-results-cache-memcached (3 replicas with 64mb each), loki-index-cache-memcached (3 replicas with 64mb each) and loki-chunks-cache-memcached (3 replicas with 2gb each); none of them reach 100% usage of max memory or CPU.
  5. I have 8 replicas of loki-read (requests: cpu=0.8, ram=700mb) and loki-write (requests: cpu=0.6, ram=980mb), and 2 replicas of loki-backend (requests: cpu=0.6, ram=160mb). I also have 1.5x limits on RAM compared to requests, but all requests are at or near actual usage, or well above it, especially for reads, so no OOMs. Also I don’t set CPU limits, and I haven’t seen any CPU throttling in k8s (I have alerts for that).
  6. Loki config is:
```yaml
auth_enabled: false
      batch_size: 250
      parallelism: 50
      host: loki-chunks-cache-memcached
      service: memcache
  compactor_address: 'loki-backend'
  replication_factor: 3
      access_key_id: ${S3_ACCESS_KEY_ID}
      bucketnames: ${S3_BUCKET_NAME_CHUNKS}
      endpoint: ${S3_ENDPOINT}
        idle_conn_timeout: 59s
        response_header_timeout: 0s
      insecure: false
      s3forcepathstyle: true
      secret_access_key: ${S3_SECRET_ACCESS_KEY}
  compaction_interval: 10m
  retention_delete_delay: 2h
  retention_delete_worker_count: 150
  retention_enabled: true
  shared_store: s3
      store: memberlist
  scheduler_address: query-scheduler-discovery.loki.svc.cluster.local.:9095
  scheduler_address: query-scheduler-discovery.loki.svc.cluster.local.:9095
  mode: ring
  cardinality_limit: 200000
  deletion_mode: filter-and-delete
  enforce_metric_name: false
  ingestion_burst_size_mb: 8
  ingestion_rate_mb: 5
  max_cache_freshness_per_query: 10m
  max_chunks_per_query: 4000000
  max_concurrent_tail_requests: 250
  max_entries_limit_per_query: 5000
  max_global_streams_per_user: 10000
  max_query_length: 0
  max_query_series: 1000
  max_streams_matchers_per_query: 5000
  max_streams_per_user: 0
  query_timeout: 1m
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  retention_period: 30d
  - period: 1d
    priority: 1
    selector: '{namespace="loki"}'
  split_queries_by_interval: 15m
  - loki-memberlist
  max_concurrent: 2
  align_queries_with_step: true
  cache_results: true
      default_validity: 12h
        host: loki-results-cache-memcached
        service: memcache
        timeout: 3s
  enable_api: true
  enable_sharding: true
    max_jitter: 15s
    mode: remote
      address: loki-read:9095
  - from: "2022-01-11"
      period: 24h
      prefix: loki_index_
    object_store: s3
    schema: v12
    store: boltdb-shipper
  graceful_shutdown_timeout: 120s
  grpc_listen_port: 9095
  grpc_server_max_recv_msg_size: 16777216
  grpc_server_max_send_msg_size: 16777216
  http_listen_port: 3100
  http_server_idle_timeout: 270s
  http_server_read_timeout: 60s
  http_server_write_timeout: 60s
    at: 250ms
    max_per_second: 20
    up_to: 3
      batch_size: 100
      expiration: 15m
      parallelism: 100
      host: loki-index-cache-memcached
      service: memcache
      timeout: 3s
```

Also, I don’t know why, but I feel like the memcached used by Loki is quite underutilized, compared to how Thanos uses it. For example, in Thanos I have a bucket memcached with 3 replicas of 4GB each, and if I start querying data intensively, it quickly fills with data and never asks S3 for that data again (S3 is quite slow compared to memcache). I don’t see Loki really utilizing the full power of the cache :frowning:, only a small part of it.
I don’t know if I’m missing something in the configuration or if it’s in the code. I tried playing with timeouts, batch_size and parallelism, but I don’t see it changing anything. And I don’t see other options that could help.
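For reference, the knobs I was playing with sit under the chunk cache section in my setup. A sketch of that fragment with its parent keys restored (the values here are just examples, not tuned recommendations):

```yaml
# Hypothetical chunk-cache client tuning sketch -- example values only.
chunk_store_config:
  chunk_cache_config:
    memcached:
      batch_size: 256        # keys requested per memcached batch
      parallelism: 10        # concurrent batches per query
    memcached_client:
      host: loki-chunks-cache-memcached
      service: memcache
      timeout: 500ms         # per-request timeout to memcached
      max_idle_conns: 16     # idle connections kept per memcached server
```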

I don’t use memcached for our Loki cluster, and I don’t actively keep an eye on the result cache, so I can’t comment on that.

I don’t see anything obviously wrong with your configuration, but I would try a couple of things:

  1. Try changing compaction_interval to something bigger, maybe 1h.

  2. Try having fewer ingesters but with more resources. In the interest of testing, I’d give each one something like 4 CPU / 8GB memory, lower the count from 8 to 4, and see what it looks like.

  3. You can try using FIFO for the result cache and see how that feels. But I don’t think this would make much difference.
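For point 3, something along these lines should do it (a sketch; in newer Loki versions the FIFO cache is called embedded_cache, and the size/TTL values here are arbitrary examples):

```yaml
# Hypothetical results-cache sketch using the in-process embedded cache
# instead of memcached -- example values only.
query_range:
  cache_results: true
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 256   # in-memory cache size per read replica
        ttl: 1h            # how long cached results stay valid
```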

Thank you :slight_smile: , here are a few more questions:

  1. Compactor is part of loki-backend, right? I don’t think it can be related somehow, but I will try.
  2. I don’t actually hard-limit resources (pods see all of the host’s memory in k8s). I changed loki-read to 4 replicas; unfortunately I don’t see any changes in the metrics.
  3. Will check that, thanks. I haven’t used FIFO cache deployments yet, need to investigate.

Regarding the compactor, I just thought a 10m interval is a bit too frequent. Whether that’s impactful is unknown.
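If you want to test that, the change is just one key (1h is an arbitrary example on my part):

```yaml
# Sketch: run compaction less often than the 10m default -- example value.
compactor:
  compaction_interval: 1h
```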

What is your query experience like? If you use something like logcli or the Loki API directly, it gives you some metrics that you can use to evaluate performance. For a volume like 1.5MB per day you should definitely have minimal latency when querying.

About the 10m interval - well, I didn’t explicitly set it myself; it’s just the default, based on Configuration | Grafana Loki documentation. I only explicitly put some interesting defaults into the config. As far as I know, the compactor just checks every 10m whether it has anything to do, and if it doesn’t, it does nothing.

About the 1.5MB: it’s a rate per second over a 5m window, as I understand it, which means ~5.4GB per hour. I mentioned the Prometheus metric I used as a base :slight_smile:. If I use an instant query in Loki now, during not-yet-active hours: sum(bytes_over_time({namespace=~".+"}[1h])) / 1000 / 1000, I get 3.5GB per hour based on it. If I try an instant query over a 24h range in Loki, it just hangs… I see only one loki-read trying to do all the work, and queries are not split between the nodes… I need to somehow instruct Loki to properly split the load between read replicas. In Thanos, for example, it’s much easier for me to understand and control: if I need to split load from S3, I just shard more thanos-store-gateway instances.

About the FIFO cache: as I see in the Loki documentation (Configuration | Grafana Loki documentation), fifo is marked as deprecated and is now named embedded_cache, which is enabled by default when no other cache is configured for queries or for the bucket. When I hadn’t yet enabled memcached in my initial Loki setup, query performance and load were much worse, and I assume it was using embedded_cache, so in my case memcache helped quite nicely, but still not ideally, at least compared to how it helped with Thanos.

When I speak about Thanos: it stores in the bucket cache all the data it reads from S3, so it doesn’t need to query it again, and it won’t until the data is dropped due to max-mem or evicted based on TTL. With Loki I don’t feel like this is the case, because:

  1. when I run the same query that was done a short time ago, I still see Loki connecting to S3 and trying to download tons of data from it
  2. my memcached for Thanos is always about 40% full when I have 9GB total memory (3 replicas with 3GB each), and Thanos instantly writes ~8MiB of data to memcache, while Loki writes only 12.5KiB
  3. my memcached for Thanos has an almost 100% hit rate, while for Loki it’s almost 0; I see it trying to read data, but it almost doesn’t write anything at all…

I also experimented with Redis before Memcached on both Thanos and Loki, and in my tests Memcached got noticeably better results: it scales horizontally better, is designed much more simply, and adds less operational overhead while still providing the caching I need, working faster than Redis because of that simplicity. And I’ve used both Redis and Memcached quite a lot in other projects, like Memcached+Mcrouter for Postfix, or Redis for Rspamd, etc.

Oh, I found why my memcache stopped working when it had worked before: it’s a bug in 2.9.4: Memcache not being used by loki in simple scalable mode · Issue #11821 · grafana/loki (github.com) - updating to 2.9.5 fixed the issue! The alert is gone, memcached is full of data, and everything started working much better. I’m just curious why such a critical issue went by so silently…

About parallelism, I found a useful article that describes in some detail how to fix query splitting: How to improve query performance in Grafana Loki. It helped a bit.
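For anyone who lands here later: the knobs that article covers map to these config options (a sketch with illustrative values; tune them for your own cluster — note my config above had max_concurrent: 2, which is quite low):

```yaml
# Sketch of the query-parallelism knobs -- example values, not recommendations.
limits_config:
  split_queries_by_interval: 30m  # each split becomes a subquery schedulable on any read replica
  max_query_parallelism: 32       # split subqueries allowed to run at once per query
querier:
  max_concurrent: 8               # subqueries one querier processes concurrently
frontend_worker:
  match_max_concurrent: true      # size worker concurrency to querier max_concurrent
```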