Loki distributor does not load balance

Hello,

I am using the loki-distributed Helm chart on EKS. This is the version information:

  • chart version: 0.43.0 / Loki version: 2.4.2

I understand that the distributor is supposed to load-balance log ingestion, but in my cluster it does not.

I have 9 ingester pods and “replication_factor” is “1”.
Only one pod is doing any work; the other 8 pods are idle.

If I change “replication_factor” to “2”,
two pods work and the other 7 pods stay idle.

I think this is a very big problem.
If it can't be fixed, I can't use Loki.

Could you help me, please?
This is my configuration:

auth_enabled: false
chunk_store_config:
  chunk_cache_config:
    enable_fifocache: true
    fifocache:
      max_size_bytes: 512MB
  max_look_back_period: 300s
compactor:
  shared_store: filesystem
distributor:
  ring:
    kvstore:
      store: memberlist
frontend:
  compress_responses: true
  log_queries_longer_than: 1m
  max_outstanding_per_tenant: 4096
  tail_proxy_url: http://loki-querier:3100
frontend_worker:
  frontend_address: loki-query-frontend:9095
  grpc_client_config:
    max_recv_msg_size: 1048576000
    max_send_msg_size: 1048576000
  match_max_concurrent: false
  parallelism: 8
ingester:
  chunk_block_size: 1536000
  chunk_encoding: snappy
  chunk_idle_period: 30m
  chunk_retain_period: 1m
  chunk_target_size: 1536000
  lifecycler:
    join_after: 0s
    ring:
      kvstore:
        store: memberlist
      replication_factor: 1
  max_chunk_age: 1h
  max_transfer_retries: 0
  wal:
    dir: /var/loki/wal
limits_config:
  enforce_metric_name: false
  ingestion_burst_size_mb: 100
  ingestion_rate_mb: 100
  max_cache_freshness_per_query: 10m
  max_entries_limit_per_query: 10000
  max_global_streams_per_user: 0
  max_query_length: 720h
  max_streams_per_user: 0
  per_stream_rate_limit: 30MB
  per_stream_rate_limit_burst: 50MB
  reject_old_samples: true
  reject_old_samples_max_age: 168h
memberlist:
  join_members:
  - loki-memberlist
querier:
  engine:
    timeout: 5m
  max_concurrent: 2048
  query_ingesters_within: 1h
  query_timeout: 5m
query_range:
  align_queries_with_step: true
  cache_results: false
  max_retries: 5
  parallelise_shardable_queries: true
  results_cache:
    cache:
      enable_fifocache: true
      fifocache:
        max_size_bytes: 1GB
  split_queries_by_interval: 10m
ruler:
  alertmanager_url: http://alertmanager:9093
  enable_api: true
  rule_path: /tmp/loki/scratch
  storage:
    s3:
      s3: s3://ap-northeast-2/{{ rule_bucket_name }}
    type: s3
schema_config:
  configs:
  - from: "2021-12-24"
    index:
      period: 720h
      prefix: {{ index_name }}_
    object_store: s3
    schema: v11
    store: aws
server:
  grpc_server_max_recv_msg_size: 1048576000
  grpc_server_max_send_msg_size: 1048576000
  grpc_server_min_time_between_pings: 10s
  grpc_server_ping_without_stream_allowed: true
  http_listen_port: 3100
  http_server_idle_timeout: 300s
  http_server_write_timeout: 60s
storage_config:
  aws:
    dynamodb:
      dynamodb_url: dynamodb://ap-northeast-2
    http_config:
      response_header_timeout: 5s
    s3: s3://ap-northeast-2/{{ bucket_name }}
  boltdb_shipper:
    active_index_directory: /var/loki/index
    cache_location: /var/loki/cache
    cache_ttl: 1h
    index_gateway_client:
      server_address: dns://loki-index-gateway:9095
    shared_store: s3
  index_cache_validity: 1h
  index_queries_cache_config:
    enable_fifocache: true
    fifocache:
      max_size_bytes: 512MB
table_manager:
  chunk_tables_provisioning:
    enable_inactive_throughput_on_demand_mode: true
    enable_ondemand_throughput_mode: true
    inactive_read_throughput: 0
    inactive_write_throughput: 0
    provisioned_read_throughput: 0
    provisioned_write_throughput: 0
  index_tables_provisioning:
    enable_inactive_throughput_on_demand_mode: true
    enable_ondemand_throughput_mode: true
    inactive_read_throughput: 0
    inactive_write_throughput: 0
    provisioned_read_throughput: 0
    provisioned_write_throughput: 0
  retention_deletes_enabled: false
  retention_period: 0
  throughput_updates_disabled: false


I think this is the expected behavior.

From Configuration | Grafana Labs

    # The number of ingesters to write to and read from.
    # CLI flag: -distributor.replication-factor
    [replication_factor: <int> | default = 3]
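
With a replication factor of 1, each stream is written to exactly one ingester, which matches what you are seeing. If the goal is to have more ingester pods receive writes, one thing to try (just a sketch against your configuration above, using the default from the quoted docs, not something I have tested on your cluster) is raising replication_factor in the ingester lifecycler ring:

    ingester:
      lifecycler:
        ring:
          kvstore:
            store: memberlist
          # Write each stream to 3 ingesters instead of 1 (the Loki default).
          replication_factor: 3

Keep in mind that a higher replication factor also means each log line is stored on that many ingesters, so this spreads load at the cost of duplicated writes.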
