POST /loki/api/v1/push (500) Response: empty ring and context canceled

Hello,

I have been running Loki in a test EKS environment for a few weeks. So far I have created tens of Loki pods in monolithic mode, and each Loki pod receives logs from multiple EC2 instances.

However, promtail keeps returning the error messages below:

error sending batch, will retry" status=500 error="server returned HTTP status 500 Internal Server Error (500): empty ring"
or
error sending batch, will retry" status=-1 error="Post \"loki:port/loki/api/v1/push\": context deadline exceeded"

When I checked the logs from Loki, I found similar error messages:

level=warn ts=2022-11-07T08:59:39.648738164Z caller=logging.go:86 traceID=414f3905fdec9c5b orgID=fake msg="POST /loki/api/v1/push (500) 5.863871ms Response: \"empty ring\\n\" ws: false; Content-Length: 267464; Content-Type: application/x-protobuf; User-Agent: promtail/2.6.1; X-Amzn-Trace-Id: Root=id; X-Forwarded-For: IP; X-Forwarded-Port: PORT; X-Forwarded-Proto: http; "

level=warn ts=2022-11-07T09:23:23.193157476Z caller=logging.go:86 traceID=0a15f11b53b377ce orgID=fake msg="POST /loki/api/v1/push (500) 9.998004113s Response: \"context canceled\\n\" ws: false; Content-Length: 235297; Content-Type: application/x-protobuf; User-Agent: promtail/2.6.1; X-Amzn-Trace-Id: Root=id; X-Forwarded-For: IP; X-Forwarded-Port: PORT; X-Forwarded-Proto: http; "

After this issue started, Grafana intermittently failed to query Loki: the Loki pod is reachable for a short period, busily collects logs, and then fails again.

I am not sure why this happens in only some of my Loki pods, since their configs are nearly the same.

Does anyone know how to solve this problem? What does “empty ring” mean?
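
For reference, a check I plan to run on one of the affected pods is to look at the readiness and ingester ring status directly (assuming the standard /ready and /ring endpoints on the HTTP port; the address is a placeholder):

curl http://ip_where_Loki_run:3100/ready
curl http://ip_where_Loki_run:3100/ring

My assumption is that if the ring page lists no ingester instances, that would match the “empty ring” response in the push errors.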

Thank you!

My config:
promtail.yaml

server:
  http_listen_port: 9080
clients:
  - url: http://ip_where_Loki_run:3100/loki/api/v1/push
positions:
  filename: /usr/local/promtail/positions.yaml
scrape_configs:
  - job_name: server_log
    static_configs:
      - targets:
          - localhost
        labels:
          job: server_log
          hostname: ab
          __path__: /var/log/server.log
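
Since promtail also reports “context deadline exceeded”, I am experimenting with a longer client timeout and an explicit backoff in the clients section (a sketch; these are standard promtail client options as far as I know, and the values are guesses):

clients:
  - url: http://ip_where_Loki_run:3100/loki/api/v1/push
    timeout: 30s          # assumption: the default 10s would line up with the ~10s "context canceled" above
    backoff_config:
      min_period: 1s
      max_period: 30s
      max_retries: 10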

Loki.yaml


    auth_enabled: false 
    
    server: 
      http_listen_port: 3100 
      grpc_listen_port: 9096
      grpc_server_max_recv_msg_size: 104857600
      grpc_server_max_send_msg_size: 104857600
      http_server_read_timeout: 300s
      http_server_write_timeout: 300s
      http_server_idle_timeout: 300s
    
    ingester: 
      wal: 
        enabled: true 
        dir: /loki/wal 
      lifecycler: 
        ring: 
          kvstore: 
            store: inmemory
          replication_factor: 1 
        final_sleep: 0s 
      chunk_idle_period: 3m       
      chunk_retain_period: 30s
      chunk_encoding: lz4     
      max_transfer_retries: 0     
      chunk_target_size: 1048576  
      max_chunk_age: 1h           
    
    schema_config: 
      configs: 
        - from: 2022-10-05
          store: boltdb-shipper 
          object_store: aws 
          schema: v12 
          index: 
            prefix: index_ 
            period: 24h 
    
    storage_config: 
      boltdb_shipper: 
        active_index_directory: /loki/index
        cache_location: /loki/index_cache
        shared_store: s3 

      aws:
        bucketnames:  bucketnames
        endpoint: s3.us-west-2.amazonaws.com
        region: us-west-2
        access_key_id: access_key_id
        secret_access_key: secret_access_key
        sse_encryption: true

    compactor: 
      working_directory: /loki/compactor 
      shared_store: s3 
      compaction_interval: 5m
      retention_enabled: true
    
    limits_config: 
      reject_old_samples: true 
      reject_old_samples_max_age: 720h
      retention_period: 720h
      per_stream_rate_limit: 15MB
      per_stream_rate_limit_burst: 30MB
      ingestion_rate_mb: 15
      ingestion_burst_size_mb: 30

    chunk_store_config: 
      max_look_back_period: 0s 

    querier:
      query_ingesters_within: 0
      engine:
        max_look_back_period: 3m
    
    query_scheduler:
      max_outstanding_requests_per_tenant: 2048

    query_range:
      parallelise_shardable_queries: false
      split_queries_by_interval: 0
    
    frontend:
      max_outstanding_per_tenant: 10240

    ingester_client:
      remote_timeout: 30s
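
One idea I want to try is gating traffic with a Kubernetes readiness probe on Loki’s /ready endpoint, so promtail and the load balancer stop sending pushes to a pod whose ingester has not yet joined the ring (a sketch for the pod spec; the port matches http_listen_port above, the timings are guesses):

readinessProbe:
  httpGet:
    path: /ready
    port: 3100
  initialDelaySeconds: 15
  periodSeconds: 10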

How did you fix this issue?
