Grafana Loki RPC errors

Hi

I am trying to deploy the Loki chart, version 0.80.0, on a ROSA cluster and connect Promtail to it, but I am getting the errors below. Please help.

Promtail pod:
level=error ts=2025-03-12T07:42:34.746062055Z caller=client.go:430 component=client host=loki-ap.platform.test.saas.ibm.com msg="final error sending batch" status=500 tenant= error="server returned HTTP status 500 Internal Server Error (500): empty ring"
level=warn ts=2025-03-12T07:42:35.98991058Z caller=client.go:419 component=client host=loki-ap.platform.test.saas.ibm.com msg="error sending batch, will retry" status=500 tenant= error="server returned HTTP status 500 Internal Server Error (500): empty ring"

Querier:
level=warn ts=2025-03-12T06:12:53.066538782Z caller=logging.go:123 traceID=152e73a928c989f5 orgID=fake msg="GET /loki/api/v1/index/stats?end=1741759972000000000&query=%7Bnamespace%3D%22consumption-metrics-backend%22%7D&start=1741759042000000000 (500) 10.194638ms Response: "rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 172.30.113.224:9095: connect: connection refused\"\n" ws: false; X-Query-Queue-Time: 15.257856ms; X-Scope-Orgid: fake; uber-trace-id: 152e73a928c989f5:52c42beccdf11381:79cc459dbb400e3e:0; "
level=debug ts=2025-03-12T06:12:53.137404562Z caller=async_store.go:135 org_id=fake traceID=50db1ca85f068349 msg="queried statistics" matchers="{namespace="consumption-metrics-backend"}" source=store

Scheduler:
level=debug ts=2025-03-12T06:12:47.879029705Z caller=grpc_logging.go:76 method=/schedulerpb.SchedulerForQuerier/QuerierLoop duration=132.82386ms err="context canceled" msg=gRPC
level=debug ts=2025-03-12T06:12:47.879053255Z caller=grpc_logging.go:76 method=/schedulerpb.SchedulerForQuerier/QuerierLoop duration=132.637767ms err="context canceled" msg=gRPC

Ingester:
level=debug ts=2025-03-12T07:50:22.320215369Z caller=grpc_logging.go:76 method=/logproto.Querier/QuerySample duration=70.971µs err="rpc error: code = Canceled desc = context canceled" msg=gRPC

Configmap:
common:
  compactor_address: http://loki-loki-distributed-compactor:3100
auth_enabled: false
chunk_store_config:
  max_look_back_period: 744h
  chunk_cache_config:
    embedded_cache:
      enabled: true
compactor:
  compaction_interval: 30m
  retention_delete_delay: 1h
  retention_delete_worker_count: 100
  retention_enabled: true
  shared_store: aws
  working_directory: /var/loki/retention
distributor:
  ring:
    kvstore:
      store: memberlist
querier:
  max_concurrent: 4096
frontend:
  compress_responses: true
  max_outstanding_per_tenant: 500000
  scheduler_address: loki-loki-distributed-query-scheduler:9096
  log_queries_longer_than: 10s
  tail_proxy_url: http://loki-loki-distributed-querier:3100
frontend_worker:
  scheduler_address: loki-loki-distributed-query-scheduler:9096
  parallelism: 327680
  match_max_concurrent: true
ingester:
  autoforget_unhealthy: true
  chunk_block_size: 262144
  max_chunk_age: 168h
  chunk_encoding: snappy
  chunk_idle_period: 30m
  chunk_retain_period: 1m
  lifecycler:
    ring:
      kvstore:
        store: memberlist
      replication_factor: 3
  max_transfer_retries: 0
  wal:
    dir: /var/loki/wal
limits_config:
  enforce_metric_name: false
  ingestion_burst_size_mb: 10000
  ingestion_rate_mb: 3000
  ingestion_rate_strategy: local
  max_cache_freshness_per_query: 10m
  reject_old_samples: true
  max_global_streams_per_user: 100000
  per_stream_rate_limit: 6000M
  per_stream_rate_limit_burst: 20000M
  max_query_parallelism: 10240
  max_query_series: 2096
  reject_old_samples_max_age: 168h
  retention_period: 8784h
  split_queries_by_interval: 4h
  query_timeout: 30m
  tsdb_max_query_parallelism: 10240
memberlist:
  join_members:
    - loki-loki-distributed-memberlist
query_range:
  align_queries_with_step: true
  cache_results: true
  max_retries: 50
  results_cache:
    cache:
      enable_fifocache: true
      fifocache:
        max_size_items: 2048
        validity: 1h
      embedded_cache:
        enabled: true
        ttl: 1h
ruler:
  alertmanager_url: https://alertmanager.xx
  external_url: https://alertmanager.xx
  ring:
    kvstore:
      store: memberlist
  rule_path: /tmp/loki/scratch
  storage:
    local:
      directory: /etc/loki/rules
    type: local
schema_config:
  configs:
    - from: "2023-10-15"
      index:
        period: 24h
        prefix: loki_index_
      object_store: aws
      schema: v11
      store: boltdb-shipper
    - from: "2024-01-24"
      index:
        period: 24h
        prefix: loki_index_
      object_store: aws
      schema: v12
      store: tsdb
server:
  grpc_listen_address: 0.0.0.0
  grpc_listen_port: 9096
  http_listen_port: 3100
  http_server_read_timeout: 600s
  http_server_write_timeout: 600s
  grpc_server_max_recv_msg_size: 4294967295
  grpc_server_max_send_msg_size: 4294967295
  grpc_server_max_concurrent_streams: 0
  log_level: debug
  grpc_server_keepalive_time: 30s
  grpc_server_keepalive_timeout: 300s
  grpc_server_max_connection_idle: 1h
  grpc_server_max_connection_age: 12h
  grpc_server_max_connection_age_grace: 30m
index_gateway:
  mode: simple
storage_config:
  aws:
    access_key_id: ******
    bucketnames: lokibucket-ap-test
    s3: s3://us-east-1
    secret_access_key: *******
    backoff_config:
      max_retries: 15
    http_config:
      response_header_timeout: 30s
  boltdb_shipper:
    active_index_directory: /var/loki/index
    cache_location: /var/loki/cache
    cache_ttl: 1m
    shared_store: s3
    index_gateway_client:
      server_address: dns:///loki-loki-distributed-index-gateway:9095
  tsdb_shipper:
    active_index_directory: /var/loki/tsdb-index
    cache_location: /var/loki/tsdb-cache
    cache_ttl: 1m
    shared_store: s3
    index_gateway_client:
      server_address: dns:///loki-loki-distributed-index-gateway:9095
  filesystem:
    directory: /var/loki/chunks
table_manager:
  retention_deletes_enabled: true
  retention_period: 744h

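Your configuration is a bit hard to read, but I would recommend that you:

  1. Focus on the ingester first. The "empty ring" error that Promtail reports usually means the distributor cannot find any ingester registered in the ring.
  2. Set the ingester count to 1 so you can narrow your troubleshooting, turn on debug logging, and check whether anything obvious shows up in the logs.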
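
As a rough sketch of step 2, the temporary overrides might look like the fragment below. It assumes the loki-distributed Helm chart; the `ingester.replicas` and `loki.structuredConfig` value names are my assumption about that chart, so check them against the values file of your chart version. With a single ingester the replication factor also has to drop to 1, because a replication factor of 3 needs a quorum of healthy ingesters before writes succeed.

```yaml
# Temporary troubleshooting overrides (hypothetical values file, e.g. debug-values.yaml).
# Key names under "ingester" and "loki" are assumed from the loki-distributed chart.
ingester:
  replicas: 1                      # run a single ingester while debugging

loki:
  structuredConfig:
    server:
      log_level: debug             # already set in the posted config; kept for clarity
    ingester:
      lifecycler:
        ring:
          kvstore:
            store: memberlist
          replication_factor: 1    # must not exceed the number of running ingesters
```

Once the single ingester joins the ring and Promtail stops reporting "empty ring", the replica count and replication factor can be raised again one step at a time.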