Loki-distributed helm install: field index_gateway_client not found in type storage.Config

I’m using the loki-distributed Helm chart to deploy Loki in distributed mode.

I actually have my own chart of charts that deploys Loki along with some other dependencies. This snippet from my Helm chart shows the chart repo and chart I’m using:

dependencies:
  - name: loki-distributed
    version: 0.42.0
    repository: https://grafana.github.io/helm-charts
    alias: loki

My values.yaml looks like this:

loki:
  ingester:
    replicas: 1
    persistence:
      # -- Enable creating PVCs which is required when using boltdb-shipper
      enabled: true
      # -- Size of persistent disk
      size: 10Gi
      # -- Storage class to be used.
      # If defined, storageClassName: <storageClass>.
      # If set to "-", storageClassName: "", which disables dynamic provisioning.
      # If empty or set to null, no storageClassName spec is
      # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
      storageClass: null
  
  distributor:
    replicas: 1
  
  querier:
    replicas: 1

  ruler:
    enabled: false
    replicas: 1

  indexGateway:
    enabled: true
    replicas: 1
    persistence:
      # -- Enable creating PVCs which is required when using boltdb-shipper
      enabled: true
      # -- Size of persistent disk
      size: 10Gi
      # -- Storage class to be used.
      # If defined, storageClassName: <storageClass>.
      # If set to "-", storageClassName: "", which disables dynamic provisioning.
      # If empty or set to null, no storageClassName spec is
      # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
      storageClass: null

  queryFrontend:
    replicas: 1

  gateway:
    replicas: 1

  compactor:
    enabled: true
    persistence:
      # -- Enable creating PVCs for the compactor
      enabled: false
      # -- Size of persistent disk
      size: 10Gi
      # -- Storage class to be used.
      # If defined, storageClassName: <storageClass>.
      # If set to "-", storageClassName: "", which disables dynamic provisioning.
      # If empty or set to null, no storageClassName spec is
      # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
      storageClass: null
    serviceAccount:
      create: true

My intention is to use boltdb-shipper with local storage, as I don’t want to run my storage on Azure, AWS, Google, etc. From reading the Single Store Loki (boltdb-shipper index type) guide, it seems that using the Index Gateway is recommended to cut down on persistent disk usage. Hence, I enabled the indexGateway:

  indexGateway:
    enabled: true

Unfortunately, when I deploy, my compactor, distributor, querier, and query-frontend pods all fail upon startup with this exact same error:

failed parsing config: /etc/loki/config/config.yaml: yaml: unmarshal errors:
  line 51: field index_gateway_client not found in type storage.Config

These deployments all seem to be reading /etc/loki/config/config.yaml from the same ConfigMap: obs-loki in my case (from kubectl get deployment obs-loki-query-frontend -o yaml):

      volumes:
      - configMap:
          defaultMode: 420
          name: obs-loki
        name: config

So I took a peek at that ConfigMap and it seems fine to me… I can see the index_gateway_client field there under storage_config so I am a bit perplexed:

apiVersion: v1
data:
  config.yaml: |
    auth_enabled: false

    server:
      http_listen_port: 3100

    distributor:
      ring:
        kvstore:
          store: memberlist

    memberlist:
      join_members:
        - obs-loki-memberlist

    ingester:
      lifecycler:
        ring:
          kvstore:
            store: memberlist
          replication_factor: 1
      chunk_idle_period: 30m
      chunk_block_size: 262144
      chunk_encoding: snappy
      chunk_retain_period: 1m
      max_transfer_retries: 0
      wal:
        dir: /var/loki/wal

    limits_config:
      enforce_metric_name: false
      reject_old_samples: true
      reject_old_samples_max_age: 168h
      max_cache_freshness_per_query: 10m
    schema_config:
      configs:
      - from: "2020-09-07"
        index:
          period: 24h
          prefix: loki_index_
        object_store: filesystem
        schema: v11
        store: boltdb-shipper
    storage_config:
      boltdb_shipper:
        active_index_directory: /var/loki/index
        cache_location: /var/loki/cache
        cache_ttl: 168h
        shared_store: filesystem
      filesystem:
        directory: /var/loki/chunks
      index_gateway_client:
        server_address: dns:///obs-loki-index-gateway:9095

    chunk_store_config:
      max_look_back_period: 0s

    table_manager:
      retention_deletes_enabled: false
      retention_period: 0s

    query_range:
      align_queries_with_step: true
      max_retries: 5
      split_queries_by_interval: 15m
      cache_results: true
      results_cache:
        cache:
          enable_fifocache: true
          fifocache:
            max_size_items: 1024
            validity: 24h

    frontend_worker:
      frontend_address: obs-loki-query-frontend:9095

    frontend:
      log_queries_longer_than: 5s
      compress_responses: true
      tail_proxy_url: http://obs-loki-querier:3100

    compactor:
      shared_store: filesystem

    ruler:
      storage:
        type: local
        local:
          directory: /etc/loki/rules
      ring:
        kvstore:
          store: memberlist
      rule_path: /tmp/loki/scratch
      alertmanager_url: https://alertmanager.xx
      external_url: https://alertmanager.xx
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: obs
    meta.helm.sh/release-namespace: obs
  creationTimestamp: "2022-01-27T17:26:39Z"
  labels:
    app.kubernetes.io/instance: obs
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: loki
    app.kubernetes.io/version: 2.4.2
    helm.sh/chart: loki-0.42.0
  name: obs-loki
  namespace: obs
  resourceVersion: "1393458"
  uid: 919c4e28-bd5d-4f18-af0b-17653d58c505

Is there something wrong with the Helm chart or do I just have some value set incorrectly?

I was toying around with this and I think I figured out the root of the problem.

The obs-loki config map has the storage_config section of /etc/loki/config/config.yaml looking like this:

    storage_config:
      boltdb_shipper:
        active_index_directory: /var/loki/index
        cache_location: /var/loki/cache
        cache_ttl: 168h
        shared_store: filesystem
      filesystem:
        directory: /var/loki/chunks
      index_gateway_client:
        server_address: dns:///obs-loki-index-gateway:9095

According to the Configuring Grafana Loki guide, the index_gateway_client section should be under boltdb_shipper like this:

    storage_config:
      boltdb_shipper:
        active_index_directory: /var/loki/index
        cache_location: /var/loki/cache
        cache_ttl: 168h
        shared_store: filesystem
        index_gateway_client:
          server_address: dns:///obs-loki-index-gateway:9095
      filesystem:
        directory: /var/loki/chunks

I manually edited my ConfigMap in my cluster and this started working.

So there seems to be a bug in the loki-distributed Helm chart. In values.yaml / loki / config we have this:

    {{- if .Values.loki.storageConfig}}
    storage_config:
    {{- toYaml .Values.loki.storageConfig | nindent 2}}
    {{- if .Values.indexGateway.enabled}}
      index_gateway_client:
        server_address: dns:///{{ include "loki.indexGatewayFullname" . }}:9095
    {{- end}}
    {{- end}}

But then if we look at the loki.storageConfig section, it is set to this:

  storageConfig:
    boltdb_shipper:
      shared_store: filesystem
      active_index_directory: /var/loki/index
      cache_location: /var/loki/cache
      cache_ttl: 168h
    filesystem:
      directory: /var/loki/chunks

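Notice that the Helm template above appends index_gateway_client after the entire rendered storageConfig, at the same indentation as its top-level keys. It therefore lands after filesystem as a sibling of boltdb_shipper instead of being nested inside it, resulting in the incorrectly structured config: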

    storage_config:
      boltdb_shipper:
        active_index_directory: /var/loki/index
        cache_location: /var/loki/cache
        cache_ttl: 168h
        shared_store: filesystem
      filesystem:
        directory: /var/loki/chunks
      index_gateway_client:
        server_address: dns:///obs-loki-index-gateway:9095
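
One possible way to fix the template itself is to merge the client settings into the boltdb_shipper map before rendering, using the standard dict, set, and printf template functions. This is only a sketch: it assumes loki.storageConfig always contains a boltdb_shipper map, the variable name $indexGatewayClient is made up for illustration, and I have not verified it against the chart:

```yaml
    {{- if .Values.loki.storageConfig}}
    storage_config:
    {{- if .Values.indexGateway.enabled}}
    {{- /* Hypothetical variable: build the client block as a map */}}
    {{- $indexGatewayClient := dict "server_address" (printf "dns:///%s:9095" (include "loki.indexGatewayFullname" .)) }}
    {{- /* Assumes boltdb_shipper exists; set nests the client inside it */}}
    {{- $_ := set .Values.loki.storageConfig.boltdb_shipper "index_gateway_client" $indexGatewayClient }}
    {{- end}}
    {{- toYaml .Values.loki.storageConfig | nindent 2}}
    {{- end}}
```

Because the key is added to the map before toYaml runs, it is rendered as a child of boltdb_shipper no matter which other keys (filesystem, aws, etc.) storageConfig contains.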

FYI: I created an issue for this on GitHub: "[loki-distributed] storage_config index_gateway_client setting put in incorrect spot" (Issue #1003, grafana/helm-charts).

In my case I have a chart of charts deploying other applications too, so to fix this for my immediate use case, I hard-coded the storage_config in my chart’s values.yaml to avoid the issue with the templating. Not ideal, but it works for now:

loki:
  loki:
    config: |
      auth_enabled: false

      server:
        http_listen_port: 3100

      distributor:
        ring:
          kvstore:
            store: memberlist

      memberlist:
        join_members:
          - {{ include "loki.fullname" . }}-memberlist

      ingester:
        lifecycler:
          ring:
            kvstore:
              store: memberlist
            replication_factor: 1
        chunk_idle_period: 30m
        chunk_block_size: 262144
        chunk_encoding: snappy
        chunk_retain_period: 1m
        max_transfer_retries: 0
        wal:
          dir: /var/loki/wal

      limits_config:
        enforce_metric_name: false
        reject_old_samples: true
        reject_old_samples_max_age: 168h
        max_cache_freshness_per_query: 10m

      {{- if .Values.loki.schemaConfig}}
      schema_config:
      {{- toYaml .Values.loki.schemaConfig | nindent 2}}
      {{- end}}
      storage_config:
        boltdb_shipper:
          active_index_directory: /var/loki/index
          cache_location: /var/loki/cache
          cache_ttl: 168h
          shared_store: filesystem
          index_gateway_client:
            server_address: dns:///obs-loki-index-gateway:9095
        filesystem:
          directory: /var/loki/chunks

      chunk_store_config:
        max_look_back_period: 0s

      table_manager:
        retention_deletes_enabled: false
        retention_period: 0s

      query_range:
        align_queries_with_step: true
        max_retries: 5
        split_queries_by_interval: 15m
        cache_results: true
        results_cache:
          cache:
            enable_fifocache: true
            fifocache:
              max_size_items: 1024
              validity: 24h

      frontend_worker:
        frontend_address: {{ include "loki.queryFrontendFullname" . }}:9095

      frontend:
        log_queries_longer_than: 5s
        compress_responses: true
        tail_proxy_url: http://{{ include "loki.querierFullname" . }}:3100

      compactor:
        shared_store: filesystem

      ruler:
        storage:
          type: local
          local:
            directory: /etc/loki/rules
        ring:
          kvstore:
            store: memberlist
        rule_path: /tmp/loki/scratch
        alertmanager_url: https://alertmanager.xx
        external_url: https://alertmanager.xx
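
A small variation on the storage_config part of this workaround: reuse the loki.indexGatewayFullname helper from the chart's original template so the address is not tied to the obs release name. This assumes the chart renders loki.config through tpl, which the other include calls in the same block suggest, but I have not confirmed it:

```yaml
storage_config:
  boltdb_shipper:
    active_index_directory: /var/loki/index
    cache_location: /var/loki/cache
    cache_ttl: 168h
    shared_store: filesystem
    # Nested under boltdb_shipper, which is where Loki 2.4.2 accepted it
    index_gateway_client:
      # Helper name taken from the chart's own template
      server_address: dns:///{{ include "loki.indexGatewayFullname" . }}:9095
  filesystem:
    directory: /var/loki/chunks
```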

I had to use the old config style to get it working:

    storage_config:
      # Written out inline instead of rendering .Values.loki.storageConfig
      boltdb_shipper:
        shared_store: s3
        active_index_directory: /var/loki/index
        cache_location: /var/loki/cache
        cache_ttl: 168h
        {{- if .Values.indexGateway.enabled}}
        index_gateway_client:
          server_address: dns:///{{ include "loki.indexGatewayFullname" . }}:9095
        {{- end}}
      aws:
        bucketnames: loki-chunks-bkt
        endpoint: bucketurl
        region: us-east-1
        access_key_id: <key here>
        secret_access_key: <key here>
        insecure: true
        sse_encryption: false
        http_config:
          idle_conn_timeout: 90s
          response_header_timeout: 0s
          insecure_skip_verify: true
        s3forcepathstyle: true

This worked for me.