Loki scheduler restart chokes frontend / scheduler processing

Hi,

I’m facing a weird issue: after my query-scheduler node gets restarted, query requests start to fail in the query-frontend ([query-frontend] failed mapping AST).

What I noticed is that the restarted scheduler doesn’t receive queries anymore, and the loki_query_scheduler_enqueue_count metric is no longer emitted.

But from what I can see in the logs and metrics, the scheduler and frontend are still able to connect:

Initially it is a bit bumpy (connected/disconnected):


2025-05-14 12:46:12.333 [query-scheduler] scheduler is JOINING in the ring
2025-05-14 12:46:12.333 [query-scheduler] CAS attempt failed
2025-05-14 12:46:12.553 [query-scheduler] frontend connected
2025-05-14 12:46:12.553 [query-scheduler] frontend disconnected
2025-05-14 12:46:12.554 [query-frontend] error sending requests to scheduler
2025-05-14 12:46:12.565 [query-scheduler] frontend connected
2025-05-14 12:46:12.565 [query-scheduler] frontend disconnected
2025-05-14 12:46:12.567 [query-frontend] error sending requests to scheduler
2025-05-14 12:46:12.651 [query-scheduler] frontend connected
2025-05-14 12:46:12.651 [query-scheduler] frontend disconnected
2025-05-14 12:46:12.652 [query-frontend] error sending requests to scheduler
2025-05-14 12:46:12.807 [query-scheduler] frontend connected
2025-05-14 12:46:12.807 [query-scheduler] frontend disconnected
2025-05-14 12:46:12.808 [query-frontend] error sending requests to scheduler
2025-05-14 12:46:12.906 [query-scheduler] frontend connected
2025-05-14 12:46:12.906 [query-scheduler] frontend disconnected
2025-05-14 12:46:12.907 [query-frontend] error sending requests to scheduler
2025-05-14 12:46:12.956 [query-scheduler] Failed to join <redacted>:23637: dial tcp <redacted>:23637: connect: connection refused
2025-05-14 12:46:12.966 [query-scheduler] Initiating push/pull sync with: <redacted>:30474
2025-05-14 12:46:12.969 [query-frontend] Stream connection from=<redacted>:44954
2025-05-14 12:46:12.989 ....
2025-05-14 12:46:13.183 [query-scheduler] joining memberlist cluster succeeded
2025-05-14 12:46:13.334 [query-scheduler] waiting until scheduler is ACTIVE in the ring
2025-05-14 12:46:13.334 [query-scheduler] scheduler is ACTIVE in the ring
2025-05-14 12:46:13.334 [query-scheduler] module waiting for initialization
2025-05-14 12:46:13.334 [query-scheduler] starting
2025-05-14 12:46:13.334 [query-scheduler] Loki started
2025-05-14 12:46:14.352 [query-frontend] GET /loki/api/v1/tail? .... (404)

But then it also looks “stable”:


2025-05-14 12:46:14.817 [query-scheduler] frontend connected
2025-05-14 12:46:15.034 [query-scheduler] frontend connected
2025-05-14 12:46:15.038 [query-scheduler] frontend connected
2025-05-14 12:46:15.267 [query-scheduler] frontend connected
2025-05-14 12:46:15.576 [query-scheduler] frontend connected
2025-05-14 12:46:15.790 [query-scheduler] querier connected
2025-05-14 12:46:15.884 [query-scheduler] querier connected
2025-05-14 12:46:15.926 [query-scheduler] querier connected
2025-05-14 12:46:15.931 [query-scheduler] querier connected
2025-05-14 12:46:16.267 [query-scheduler] querier connected
2025-05-14 12:46:16.334 [query-scheduler] this scheduler is in the ReplicationSet, will now accept requests.
2025-05-14 12:46:16.464 [query-scheduler] querier connected

But incoming query requests to the frontend are not forwarded.

After I restart the query-frontend, it starts to work again.

Is there a setting I missed? Is some special graceful-shutdown protocol between frontend and scheduler required?

Loki: 3.5.0 on Nomad

Thanks in advance!

My understanding is that query-frontend and query-scheduler don’t form ring membership; instead, the query-frontend connects to the scheduler using the scheduler-address configuration.

What does that look like in your configuration? If you manually telnet from the query-frontend to the query-scheduler after the restart, does it work or not?
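If telnet isn’t handy on the Nomad client, a small Python check does the same TCP-level test (the host and port below are placeholders, not values from this thread):

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from the query-frontend host against the scheduler's gRPC port, e.g.:
# can_connect("query-scheduler.example.internal", 9095)
```

Note this only proves TCP reachability; a TLS- or gRPC-level failure would still pass this test.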

I’m using ring-based membership, with scheduler_address="" for frontend/worker.
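Concretely, the relevant fragments of my setup look roughly like this (a sketch; key names per the Loki configuration reference, double-check against your version):

```yaml
query_scheduler:
  use_scheduler_ring: true   # scheduler registers itself in the scheduler ring

frontend:
  scheduler_address: ""      # empty -> frontend discovers schedulers via the ring
frontend_worker:
  scheduler_address: ""      # empty -> querier workers discover via the ring too
```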

The scheduler ring endpoint /scheduler/ring on the frontend indicates that discovery still works: during the restart, /scheduler/ring on the frontend turns empty, and after a short period the scheduler is in the list again. I think this also matches the logs.
I noticed that not every restart triggers the behaviour, but most do. Currently I assume some kind of race condition.
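To catch the race in the act, one could poll the frontend’s /scheduler/ring page around a restart and log exactly when the scheduler drops out and comes back. A minimal sketch (the URL and instance ID are placeholders; the ring page serves HTML, so a plain substring check is enough here):

```python
import time
import urllib.request

def ring_contains(url: str, instance_id: str, timeout: float = 2.0) -> bool:
    """Fetch the ring status page and check whether an instance ID appears in it."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return instance_id in resp.read().decode("utf-8", "replace")

def watch_ring(url: str, instance_id: str, interval: float = 1.0) -> None:
    """Print a line every time the scheduler enters or leaves the ring view."""
    last = None
    while True:
        try:
            present = ring_contains(url, instance_id)
        except OSError:
            present = False  # unreachable frontend counts as "not in ring"
        if present != last:
            print(f"{time.strftime('%H:%M:%S')} scheduler in ring: {present}")
            last = present
        time.sleep(interval)

# watch_ring("http://<frontend-host>:3100/scheduler/ring", "<scheduler-instance-id>")
```

Correlating those transitions with the frontend’s “error sending requests to scheduler” log lines should show whether the frontend ever re-resolves the scheduler after it rejoins.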

The scheduler config looks like this (taken from the config endpoint):

query_scheduler:
  max_outstanding_requests_per_tenant: 32000
  max_queue_hierarchy_levels: 3
  querier_forget_delay: 0s
  grpc_client_config:
    max_recv_msg_size: 104857600
    max_send_msg_size: 104857600
    grpc_compression: ""
    rate_limit: 0
    rate_limit_burst: 0
    backoff_on_ratelimits: false
    backoff_config:
      min_period: 100ms
      max_period: 10s
      max_retries: 10
    initial_stream_window_size: 63KiB1023B
    initial_connection_window_size: 63KiB1023B
    tls_enabled: true
    tls_cert_path: /secrets/loki/cert.pem
    tls_key_path: /secrets/loki/key.pem
    tls_ca_path: /secrets/loki/ca.pem
    tls_server_name: <redacted>
    tls_insecure_skip_verify: false
    tls_cipher_suites: ""
    tls_min_version: ""
    connect_timeout: 5s
    connect_backoff_base_delay: 1s
    connect_backoff_max_delay: 5s
    cluster_validation:
      label: ""
  use_scheduler_ring: true
  scheduler_ring:
    kvstore:
      store: memberlist
      prefix: collectors/
      consul:
        host: localhost:8500
        acl_token: ""
        http_client_timeout: 20s
        consistent_reads: false
        watch_rate_limit: 1
        watch_burst_size: 1
        cas_retry_delay: 1s
      etcd:
        endpoints: []
        dial_timeout: 10s
        max_retries: 10
        tls_enabled: false
        tls_cert_path: ""
        tls_key_path: ""
        tls_ca_path: ""
        tls_server_name: ""
        tls_insecure_skip_verify: false
        tls_cipher_suites: ""
        tls_min_version: ""
        username: ""
        password: ""
      multi:
        primary: ""
        secondary: ""
        mirror_enabled: false
        mirror_timeout: 2s
    heartbeat_period: 15s
    heartbeat_timeout: 1m0s
    tokens_file_path: ""
    zone_awareness_enabled: false
    num_tokens: 1
    replication_factor: 2
    instance_id: <redacted-scheduler-id>
    instance_interface_names:
    - eth0
    - lo
    instance_port: <redacted-grpc-port>
    instance_addr: <redacted-ip-address>
    instance_availability_zone: ""
    instance_enable_ipv6: false