Loki: total worker concurrency is greater than promql max concurrency

Need help, I’m having an issue with promql max concurrency: I’m unable to find a description of how to calculate promql max concurrency.

I didn’t find any mention of this in the docs.

---
auth_enabled: true

schema_config:
  configs:
    - from: 2020-12-01
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: loki_index_
        period: 24h
      chunks:
        prefix: loki_chunk_
        period: 24h

storage_config:
  aws:
    bucketnames: loki-00,loki-01,loki-02,loki-03,loki-04,loki-05,loki-06,loki-07,loki-08,loki-09
    endpoint: s3-host:7480
    region: US
    access_key_id: <id>
    secret_access_key: <key>
    insecure: true
    sse_encryption: false
    http_config:
      idle_conn_timeout: 90s
      response_header_timeout: 0s
      insecure_skip_verify: true
    s3forcepathstyle: true

  boltdb_shipper:
    shared_store: s3
    active_index_directory: /var/lib/loki/boltdb-shipper-active
    cache_location: /var/lib/loki/boltdb-shipper-cache
    cache_ttl: 168h
    resync_interval: 5m
    query_ready_num_days: 2

  index_cache_validity: 2h
  max_chunk_batch_size: 100

  index_queries_cache_config:
    memcached:
      expiration: 1h
      batch_size: 100
      parallelism: 100
    memcached_client:
      host: loki-cache.monitoring
      service: index
      consistent_hash: true

chunk_store_config:
  max_look_back_period: 720h
  cache_lookups_older_than: 1d
  chunk_cache_config:
    memcached:
      expiration: 1h
      batch_size: 100
      parallelism: 100
    memcached_client:
      host: loki-cache.monitoring
      service: chunks
      consistent_hash: true

  write_dedupe_cache_config:
    memcached:
      expiration: 1h
      batch_size: 100
      parallelism: 100
    memcached_client:
      host: loki-cache.monitoring
      service: dedupe
      consistent_hash: true

memberlist:
  gossip_nodes: 2
  randomize_node_name: false
  abort_if_cluster_join_fails: true
  bind_port: 7946
  join_members:
    - loki-gossip-ring.monitoring:7946
  max_join_backoff: 1m
  max_join_retries: 10
  min_join_backoff: 1s
  retransmit_factor: 3
  gossip_to_dead_nodes_time: 15s
  left_ingesters_timeout: 5s
  dead_node_reclaim_time: 30s

server:
  http_listen_port: 3100
  grpc_listen_port: 9095

distributor:
  ring:
    kvstore:
      store: memberlist

ingester:
  chunk_block_size: 262144
  chunk_target_size: 1536000 
  max_chunk_age: 3h  
  max_transfer_retries: 0     
  chunk_idle_period: 1h       
  chunk_retain_period: 5m     
  chunk_encoding: snappy      
  sync_period: 1m
  sync_min_utilization: 0.7
  lifecycler:
    ring:
      kvstore:
        store: memberlist
      replication_factor: 2
      heartbeat_timeout: 1m
    heartbeat_period: 5s
    final_sleep: 0s      
    min_ready_duration: 10s
  query_store_max_look_back_period: 0 
  wal:
    enabled: true
    dir: /var/lib/loki/wal
    flush_on_shutdown: true
    replay_memory_ceiling: 2GB

querier:
  query_timeout: 15s
  tail_max_duration: 15m
  query_ingesters_within: 3h

query_range:
  align_queries_with_step: true 
  max_retries: 3
  split_queries_by_interval: 15m
  parallelise_shardable_queries: true
  cache_results: true
  results_cache:
    cache:
      memcached:
        expiration: 1h
      memcached_client:
        host: loki-cache.monitoring
        service: results
        consistent_hash: true

ingester_client:
  grpc_client_config:
    rate_limit: 30
    rate_limit_burst: 50
    max_recv_msg_size: 33554432 #1024 * 1024 * 32
  remote_timeout: 15s

frontend_worker:
  frontend_address: loki-querier-frontend.monitoring:9095
  parallelism: 10
  grpc_client_config:
    rate_limit: 30
    rate_limit_burst: 50

frontend:
  log_queries_longer_than: 5s
  compress_responses: true
  tail_proxy_url: http://loki-querier.monitoring:3100
  max_outstanding_per_tenant: 1024

table_manager:
  retention_deletes_enabled: true
  retention_period: 720h
  poll_interval: 10m
  creation_grace_period: 3h

limits_config:
  ingestion_rate_strategy: global
  reject_old_samples: true
  reject_old_samples_max_age: 24h
  ingestion_rate_mb: 30
  ingestion_burst_size_mb: 50
  max_query_length: 720h
  max_streams_per_user: 0 
  max_global_streams_per_user: 100000  # 100k
  # max_query_parallelism: 5
  max_cache_freshness_per_query: 30m 
  enforce_metric_name: false

compactor:
  working_directory: /var/lib/loki/boltdb-shipper-compactor
  shared_store: s3
  compaction_interval: 4h

tracing:
  enabled: false

Hi @timansky. I assume you mean LogQL, not PromQL.

There is a config to set the max concurrency at the querier: -querier.max-concurrent. Maybe this is what you’re looking for?

$ loki --help
..
-querier.max-concurrent int
    	The maximum number of concurrent queries. (default 20)
...
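If you prefer the YAML config file over the CLI flag, this corresponds (assuming a reasonably recent Loki version) to the max_concurrent key under the querier block; the value shown here is the default:

```yaml
querier:
  max_concurrent: 20   # equivalent of the -querier.max-concurrent flag
```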

Yes, you are right, the Loki docs are missing this config! It will be added soon: Add missing `-querier.max-concurrent` config in the doc by kavirajk · Pull Request #3875 · grafana/loki · GitHub

Thanks, that’s it.

About “promql”: that is exactly what was written in the logs, not my mistake.

PS
I saw your pull request; it would be better to include an explanation of the formula and how to calculate the value.

@timansky Yeah, you are right. That exact log line comes from one of Loki’s dependencies, the Cortex project.

Now about the formula.

In general, -querier.max-concurrent is the maximum number of top-level LogQL queries that will execute at the same time, per querier process.

There are two cases.

  1. If you are using the query-frontend.
    This should be set to at least (-querier.worker-parallelism * number of query-frontend replicas). Otherwise queries may queue in the queriers rather than in the frontend, which will affect QoS.

  2. If you are not using the query-frontend.
    Then consider setting -querier.worker-match-max-concurrent to true to force worker parallelism to match -querier.max-concurrent.
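
Putting numbers on case 1, the condition behind the warning can be sketched like this (a simplified Go illustration, not the actual Cortex code; the function name and the replica count of 3 are assumptions for the example):

```go
package main

import "fmt"

// concurrencyWarning models the check behind the
// "total worker concurrency is greater than promql max concurrency" log line:
// total worker concurrency = worker parallelism * number of frontend replicas,
// and a warning fires when that total exceeds -querier.max-concurrent.
func concurrencyWarning(parallelism, frontends, maxConcurrent int) (total int, warn bool) {
	total = parallelism * frontends
	return total, total > maxConcurrent
}

func main() {
	// e.g. -querier.worker-parallelism=10 (as in the config above) with
	// 3 frontend replicas against the default -querier.max-concurrent=20:
	// 10 * 3 = 30 > 20, so the warning is logged.
	total, warn := concurrencyWarning(10, 3, 20)
	fmt.Println(total, warn)
}
```

So with that config, -querier.max-concurrent would need to be raised to at least 30 to keep queries queueing in the frontend.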

Does this help? I will add it to the docs soon!

Thanks, I already found this in the Cortex code.