Unable to join ring - unknown message type G

I am trying to deploy the LGTM stack to my RKE2 cluster (v1.28.5), which uses Cilium as the CNI with kube-proxy replacement. I am deploying with Argo and the official Helm chart. I override global.dnsService to use rke2-coredns-rke2-coredns; the only other change is specifying the Ceph bucket used to store logs. I keep getting these errors on the memberlist pods and am unsure why:

ts=2024-07-13T23:25:33.129173693Z caller=memberlist_logger.go:74 level=warn msg="Failed to resolve lgtm-distributed-loki-memberlist: lookup lgtm-distributed-loki-memberlist on 10.43.0.10:53: no such host"
level=warn ts=2024-07-13T23:25:33.129224196Z caller=memberlist_client.go:595 msg="joining memberlist cluster: failed to reach any nodes" retries=4 err="1 error occurred:\n\t* Failed to resolve lgtm-distributed-loki-memberlist: lookup lgtm-
level=error ts=2024-07-13T23:26:00.098884343Z caller=tcp_transport.go:319 component="memberlist TCPTransport" msg="unknown message type" msgType=G remote=10.42.1.128:49114
level=info ts=2024-07-13T23:26:02.491209106Z caller=memberlist_client.go:592 msg="joining memberlist cluster succeeded" reached_nodes=1 elapsed_time=53.214729221s

Here is my config:

        loki:
          global:
            dnsService: "rke2-coredns-rke2-coredns"
          gateway:
            extraEnvFrom:
              - configMapRef:
                  name: loki-bucket
              - secretRef:
                  name: loki-bucket
          loki:
            podAnnotations:
              prometheus.io/scrape: "true"
              prometheus.io/port: "3100"
            # https://grafana.com/docs/loki/latest/configure/#aws_storage_config
            storageConfig:
              tsdb_shipper:
                active_index_directory: /loki/index
                cache_location: /loki/index_cache
                cache_ttl: 24h 
              aws:
                s3: "http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc"
                bucketnames: "loki-3926889a-30e8-43fc-af3a-439620560568"
                s3forcepathstyle: true
            schemaConfig:
              configs:
                - from: 2020-09-07
                  store: boltdb-shipper
                  object_store: aws
                  schema: v11
                  index:
                    prefix: loki_index_
                    period: 24h

The issue was that there were not enough members in the ring. I scaled the StatefulSet to at least 2 replicas (and made sure the PVC was not full), and that resolved the issue.
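
For anyone managing this through Argo, a manual scale may be reverted on the next sync, so the replica count probably belongs in the Helm values instead. Here is a minimal sketch, not a verified fix. It assumes the StatefulSet in question is the Loki ingester (the post does not say which one was scaled) and that the Loki subchart exposes ingester.replicas and ingester.persistence.size, so check the values.yaml of your chart version. The 20Gi size is only a placeholder.

        loki:
          ingester:
            # Assumption: the ingester is the StatefulSet that was short of ring members.
            replicas: 2
            persistence:
              enabled: true
              # Placeholder size; leave enough headroom that the PVC does not fill up.
              size: 20Gi

With two or more replicas, each pod has at least one peer to join over memberlist instead of only finding itself.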
