Cortex Ring is Unstable

I’m trying to figure out what is wrong with my Loki service before moving on to setting up Tempo and Cortex. It seems that Loki is not able to smoothly handle one of its shards going away and being replaced. In my testing I have tried both Consul and memberlist as the ring storage. I have tried supplying memberlist with a complete list of members, and with a DNS entry that resolves to a random subset; in all cases the ring winds up unstable and inconsistent.
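
For reference, the two memberlist join styles I tried look roughly like this (the explicit host names are placeholders, not my real nodes):

# Variant 1: a complete, explicit list of members
memberlist:
  join_members:
    - loki-node-1.example.internal:7946
    - loki-node-2.example.internal:7946
    - loki-node-3.example.internal:7946

# Variant 2: a single DNS name (a Consul service record) that resolves to a subset of members
memberlist:
  join_members:
    - loki-memberlist.service.consul:7946

The DNS variant is the one in the full config at the end of this post.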

As Loki is advertised as being cloud native, I have to assume that this is a problem in my configuration, since any cloud-native software must fundamentally account for the individual components of a cloud system being individually unreliable. So far that has not been my experience, and for now I have resorted to a twice-daily check that the output of /ring shows no members that shouldn’t be there. I hope that the expected deployment of Loki doesn’t involve a human garbage collector.

As I write this, my local test cluster where I was evaluating memberlist is split-brained. I assumed that the Consul ring storage was a second-class option, since it is not documented as well as memberlist, but I have tried it in my testing cluster as well. It works fine on initial ring setup, but quickly suffers the same problems as above, with the addition that it reliably split-brains into a 2-node ring and a 1-node ring when I attempt to drain an underlying machine. I haven’t gotten to the point of simulating an unexpected machine fault (I figure that doesn’t matter if steady-state operation doesn’t actually work).

Is this normal and just the way Loki works? Is there some setting I’m missing to get Loki to recover state, or to not split-brain at the drop of a hat? I have many other services running in this cluster that make use of distributed state and hash rings, and only Loki has these problems.

Here’s the complete configuration from my local testing cluster running on Nomad:

job "loki" {
  datacenters = ["minicluster"]
  type = "service"

  group "aio" {
    count = 3

    spread {
      attribute = "${node.unique.id}"
    }

    network {
      mode = "host"
      port "http" { static = 3100 }
      port "grpc" { static = 9095 }
      port "memberlist" { static = 7946 }
    }

    service {
      name = "loki"
      port = "http"

      check {
        type = "http"
        path = "/ready"
        port = "http"
        address_mode = "host"
        interval = "5s"
        timeout = "2s"
      }
    }

    service {
      name = "loki-memberlist"
      port = "memberlist"
    }

    task "aio" {
      driver = "docker"

      config {
        image = "grafana/loki:2.1.0"
        network_mode = "host"

        args = [
          "-config.file=/local/loki.yml",
          "-server.http-listen-address=${NOMAD_IP_http}",
          "-server.http-listen-port=${NOMAD_PORT_http}",
          "-server.grpc-listen-address=${NOMAD_IP_grpc}",
          "-server.grpc-listen-port=${NOMAD_PORT_grpc}",
          "-log.level=debug",
        ]
      }

      template {
        data = file("./aio.yml")
        destination = "local/loki.yml"
        perms = "0644"
        splay = "30s"
      }

      restart {
        attempts = 100
      }
    }
  }
}
---
auth_enabled: false

server:
  http_listen_port: 3100
  http_listen_address: 0.0.0.0
  grpc_listen_port: 9095
  grpc_listen_address: 0.0.0.0

ingester:
  lifecycler:
    final_sleep: "0s"
    ring:
      kvstore:
        store: memberlist
  chunk_idle_period: "1m"
  chunk_retain_period: "30s"

memberlist:
  abort_if_cluster_join_fails: false
  dead_node_reclaim_time: 30s

  bind_port: {{env "NOMAD_PORT_memberlist"}}

  join_members:
  - loki-memberlist.service.consul:{{env "NOMAD_PORT_memberlist"}}
  
  max_join_backoff: 1m
  max_join_retries: 10
  min_join_backoff: 1s

storage_config:
  boltdb_shipper:
    active_index_directory: "/loki/index"
    cache_location: "/loki/index_cache"
    resync_interval: "5s"
    shared_store: s3

  aws:
    bucketnames: loki
    endpoint: minio.service.consul:9000
    access_key_id: minioadmin
    secret_access_key: minioadmin
    s3forcepathstyle: true
    insecure: true

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: "168h"

compactor:
  working_directory: /loki/compactor
  shared_store: s3

schema_config:
  configs:
    - from: "2021-01-26"
      store: "boltdb-shipper"
      object_store: s3
      schema: "v11"
      index:
        prefix: "index_"
        period: "24h"

This may not be the most helpful reply, but maybe this info will help. I’ve tried using memberlist in our k8s cluster, but it did not work well when Istio is also in use. I was already using Consul for the ring but was hoping memberlist would work so I wouldn’t need to run Consul.

That being said, Consul as the ring store works quite well. The only manual intervention I’ve had to do is deleting the ring key when scaling down substantially.

The following is the config we have for using consul for the ingester ring.

    ring:
      kvstore:
        consul:
          consistent_reads: true
          host: ${var.consul}:8500
        prefix: "loki/collectors"
        store: consul
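
For context on where that block sits: in a single-binary config like the one above it goes under ingester.lifecycler, so swapping the memberlist kvstore for Consul would look roughly like this (a sketch; the Consul address is a placeholder for whatever you template in):

ingester:
  lifecycler:
    ring:
      kvstore:
        store: consul
        prefix: "loki/collectors"
        consul:
          # placeholder address; we fill in the real Consul host via templating
          host: consul.service.consul:8500
          consistent_reads: true

The ring key itself lives under that prefix in Consul, and that is the key we clear when scaling down substantially.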

We also use Consul for the rings needed by Cortex and Tempo.
