Retention for different tenants not working in SimpleScalable mode

Hello everyone,
We’ve installed Loki in SimpleScalable mode connected to an on-prem MinIO instance, and we’re now trying to implement data retention with multiple tenants enabled.
First, we created the ConfigMap, which looks like this:

extraObjects:
 - apiVersion: v1
   kind: ConfigMap
   metadata:
     name: loki-tenant-overrides-rules
     labels:
       app.kubernetes.io/name: loki
   data:
     loki-tenant-overrides-rules.yaml: |
       overrides:
         "tenant1":
           retention_period: 60d
           retention_stream:
             - selector: '{namespace="prod"}'
               priority: 2
               period: 30d
             - selector: '{container="loki"}'
               priority: 1
               period: 15d
   
         "tenant2":
           retention_period: 24h
           retention_stream:
             - selector: '{namespace="prod"}'
               priority: 1
               period: 30d
         
         "tenant3":
           retention_period: 72h
           retention_stream:
             - selector: '{k8s.cluster.name="K8S-PROD"}'
               priority: 1
               period: 24h
         
         "tenant4":
           retention_period: 60d
           retention_stream:
             - selector: '{namespace="prod"}'
               priority: 1
               period: 30d

Then, we’ve enabled retention like this:

loki:
  ...
 
  compactor:
    working_directory: /var/loki/compactor
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
    delete_request_store: s3

  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420

  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override

  limits_config:
    retention_period: 180d
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml

The extraVolumes and extraVolumeMounts are also added to the backend, read and write blocks, like this:

backend:
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override

But retention is currently not working at all, and in the loki-backend pod logs the only references to retention are these:

(screenshot of compactor log lines)

Using Grafana with the Loki datasource we can still see older data.

Any tips or information about this issue?
Thanks in advance!

Additional info: I tried changing the global retention to 3d, and now I can see these new logs:

level=info ts=2025-01-29T14:41:04.68898398Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161664675398685
level=info ts=2025-01-29T14:41:04.841865717Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161664830459215
level=info ts=2025-01-29T14:41:04.996596267Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161664982151030
level=info ts=2025-01-29T14:41:05.151796956Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665137797613
level=info ts=2025-01-29T14:41:05.29193463Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665276436901
level=info ts=2025-01-29T14:41:05.443950069Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665431913435
level=info ts=2025-01-29T14:41:05.598116587Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665585073313
level=info ts=2025-01-29T14:41:05.745340614Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665727416898
level=info ts=2025-01-29T14:41:05.889544376Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665876698696
level=info ts=2025-01-29T14:41:06.035831718Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161666024056538
level=info ts=2025-01-29T14:41:06.166878941Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161666153997601
level=info ts=2025-01-29T14:41:06.284638672Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161666272582873
level=info ts=2025-01-29T14:41:06.385079794Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161666374405219
level=info ts=2025-01-29T14:56:06.422398997Z caller=expiration.go:78 msg="overall smallest retention period 1737903366.422, default smallest retention period 1737903366.422"
level=info ts=2025-01-29T14:56:06.524920866Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738162566520269991
level=info ts=2025-01-29T14:56:06.561257291Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738162566556003795
level=info ts=2025-01-29T15:11:06.422457051Z caller=expiration.go:78 msg="overall smallest retention period 1737904266.422, default smallest retention period 1737904266.422"
level=info ts=2025-01-29T15:11:06.532736938Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738163466526734583
level=info ts=2025-01-29T15:11:06.571724736Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738163466565493428

From what I gathered, this log indicates that Loki’s compactor component is processing retention periods. The “overall smallest retention period” and “default smallest retention period” are both set to 1737903366.422, which read as a duration in seconds would be roughly 55 years, so something seems wrong.

I’m no expert on this, but I think it’s more significant that 1737903366 is the epoch timestamp for Sun Jan 26 14:56:06 GMT 2025, which is exactly 3 days (your new global retention) before that log line’s own timestamp, so it reads as a cutoff time rather than a duration.

Antony.


Thanks for the info, Antony. Checking today we can see that the global retention is working, which confirms that the per-tenant retention sadly is not.

Can you confirm the override configuration is actually mounted in your backend container?


Hi,
If I check the backend pods I can see the following:

 volumeMounts:
        - mountPath: /etc/loki/config
          name: config
        - mountPath: /etc/loki/runtime-config
          name: runtime-config
        - mountPath: /tmp
          name: tmp
        - mountPath: /var/loki
          name: data
        - mountPath: /rules
          name: sc-rules-volume
        - mountPath: /etc/loki/config/override
          name: loki-tenant-overrides-rules
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-00000
          readOnly: true
....

  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-loki-backend-0
    - emptyDir: {}
      name: tmp
    - configMap:
        defaultMode: 420
        items:
          - key: config.yaml
            path: config.yaml
        name: loki
      name: config
    - configMap:
        defaultMode: 420
        name: loki-runtime
      name: runtime-config
    - emptyDir: {}
      name: sc-rules-volume
    - configMap:
        defaultMode: 420
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        name: loki-tenant-overrides-rules
      name: loki-tenant-overrides-rules

Thanks!

No, I meant actually check the container.


Sorry, my bad.
Yes, if I go into the container I can see the file:
(screenshot of the mounted file)

And the content is correct:

/etc/loki/config/override $ cat loki-tenant-overrides-rules.yaml 
overrides:
  "tenant1:
    retention_period: 48h

  "tenant2":
    retention_period: 24h
  
  "tenant3":
    retention_period: 72h
    retention_stream:
      - selector: '{k8s.cluster.name="K8S-PROD"}'
        priority: 1
        period: 24h
  
  "tenant4":
    retention_period: 48h

The values are different from yesterday because I’m trying things in the meantime 🙂

Interesting. I don’t see anything obviously wrong. Are you running Loki in simple scalable mode or microservices mode? Can you also share your entire Loki configuration, please?


Hi, I’m running Loki in simple scalable mode; I’m handling the migration from an ELK stack to a Grafana + MinIO / Loki stack with OpenTelemetry.
Do you think the issue could be with simple scalable mode? When we started we were using microservices mode, but then we were told to use simple scalable, so we switched to that.

Here’s the entire yaml configuration:

global:
  dnsService: "rke2-coredns-rke2-coredns"

extraObjects:
 - apiVersion: v1
   kind: ConfigMap
   metadata:
     name: loki-tenant-overrides-rules
     labels:
       app.kubernetes.io/name: loki
   data:
     loki-tenant-overrides-rules.yaml: |
       overrides:
         "tenant1":
           retention_period: 48h
   
         "tenant2":
           retention_period: 24h
         
         "tenant3":
           retention_period: 72h
           retention_stream:
             - selector: '{k8s.cluster.name="K8S-PROD"}'
               priority: 1
               period: 24h
         
         "tenant4":
           retention_period: 48h


memcached:
  containerSecurityContext:
    seccompProfile:
      type: RuntimeDefault

memcachedExporter:
  containerSecurityContext:
    seccompProfile:
      type: RuntimeDefault

## disable canary
test:
  enabled: false
lokiCanary:
  enabled: false
ruler:
  enabled: false

loki:
  persistence:
    enabled: true
    storageClassName: "standard"
    accessModes:
      - ReadWriteOnce
    size: 20Gi
    annotations: {}

  compactor:
    working_directory: /var/loki/compactor
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
    delete_request_store: s3

  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420

  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override

  distributor:
    otlp_config:
      default_resource_attributes_as_index_labels:

        - "application.environment"
        - "application.name"
        - "application.component"
        - "server.hostname"
        - "log.file.name"

        - "k8s.cluster.name"
        - "k8s.container.name"
        - "k8s.cronjob.name"
        - "k8s.job.name"
        - "k8s.namespace.name"
        - "k8s.node.name"
        - "k8s.pod.name"

  auth_enabled: true
  revisionHistoryLimit: 1
  analytics: #disable messages to stats.grafana.com
    reporting_enabled: false

  limits_config:
    retention_period: 3d
    discover_log_levels: false
    ingestion_rate_mb: 32
    ingestion_burst_size_mb: 64
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml

  schemaConfig:
    configs:
      - from: 2024-10-01
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  ingester:
    chunk_encoding: snappy
  tracing:
    enabled: false
  querier:
    max_concurrent: 16
  storage:
    s3:
      endpoint: http://minio.loki:80
      accessKeyId: ${StorageaccessKeyId}
      secretAccessKey: ${StoragesecretAccessKey}
      s3ForcePathStyle: true
      insecure: true
      http_config:
        insecure_skip_verify: true
    bucketNames:
        chunks: "chunks"
        ruler: "ruler"
        admin: "admin"
  containerSecurityContext:
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
    seccompProfile:
      type: RuntimeDefault
      
gateway:
  containerSecurityContext:
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
    seccompProfile:
      type: RuntimeDefault
  ingress:
    enabled: false
  basicAuth:
    enabled: true
    username: ${basicAuthUsername}
    password: ${basicAuthPassword}

deploymentMode: SimpleScalable

ingester:
  replicas: 0
querier:
  replicas: 0
queryFrontend:
  replicas: 0
queryScheduler:
  replicas: 0
distributor:
  replicas: 0
compactor:
  replicas: 0
indexGateway:
  replicas: 0


bloomPlanner:
  replicas: 0
bloomBuilder:
  replicas: 0
bloomGateway:
  replicas: 0

minio:
  enabled: false

sidecar:
  securityContext: 
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
    seccompProfile:
      type: RuntimeDefault


backend:
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override
  limits_config:
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml
  replicas: 3
  autoscaling:
    # -- Enable autoscaling for the backend.
    enabled: false
    # -- Minimum autoscaling replicas for the backend.
    minReplicas: 1
    # -- Maximum autoscaling replicas for the backend.
    maxReplicas: 3
  extraEnv:
    - name: StorageaccessKeyId
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StorageaccessKeyId
    - name: StoragesecretAccessKey
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StoragesecretAccessKey
  extraArgs:
    - "-config.expand-env=true"
  persistence:
    volumeClaimsEnabled: true
    size: 10Gi
    storageClass: "nfs-client"
  containerSecurityContext:
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
    seccompProfile:
      type: RuntimeDefault
    
read:
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override
  limits_config:
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml
  replicas: 3
  autoscaling:
    enabled: false
    # -- Minimum autoscaling replicas for the read
    minReplicas: 1
    # -- Maximum autoscaling replicas for the read
    maxReplicas: 3
  persistence:
    volumeClaimsEnabled: true
    size: 10Gi
    storageClass: "nfs-client"
    seccompProfile:
      type: RuntimeDefault
  extraEnv:
    - name: StorageaccessKeyId
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StorageaccessKeyId
    - name: StoragesecretAccessKey
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StoragesecretAccessKey
  extraArgs:
    - "-config.expand-env=true"
    - "-querier.multi-tenant-queries-enabled"
write:
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override
  limits_config:
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml
  replicas: 3
  autoscaling:
    # -- Enable autoscaling for the write.
    enabled: false
    # -- Minimum autoscaling replicas for the write.
    minReplicas: 1
    # -- Maximum autoscaling replicas for the write.
    maxReplicas: 3
  persistence:
    volumeClaimsEnabled: true
    size: 10Gi
    storageClass: "nfs-client"
    seccompProfile:
      type: RuntimeDefault
  extraEnv:
    - name: StorageaccessKeyId
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StorageaccessKeyId
    - name: StoragesecretAccessKey
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StoragesecretAccessKey
  extraArgs:
    - "-config.expand-env=true"

singleBinary:
  replicas: 0

We use simple scalable mode, and I can assure you that’s not your problem.

Try putting your override configuration in the main Loki configuration (pick just one tenant for test purposes) and see if that works. Also try enabling debug logging for the backend target and see what you get.
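
For a quick test, one way to do that is via the chart’s loki.runtimeConfig value, a minimal sketch only: this is rendered into the loki-runtime ConfigMap that is already mounted at /etc/loki/runtime-config in your pods, and the tenant name and period here are just placeholders:

loki:
  runtimeConfig:
    overrides:
      tenant1:
        retention_period: 48h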


Also, I just noticed you have 3 replicas for the backend; try lowering that to 1 and see if that works. If it does, try adding common.compactor_address to your configuration (this should be the internal address of your backend target), then increase the replica count back to 3 and see if it still works.
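
If it helps, here is a rough sketch of what that could look like via the chart’s loki.structuredConfig (if your chart version supports it); the service name and namespace below are assumptions, so point it at your own backend service, since in simple scalable mode the compactor runs in the backend target:

loki:
  structuredConfig:
    common:
      # assumed service/namespace; replace with your release's backend service
      compactor_address: http://loki-backend.loki.svc.cluster.local:3100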


Hi 🙂
We’ve been able to make the retention work by switching to runtimeConfig like this:

loki:

...

  runtimeConfig:
    overrides:
      tenant1:
        retention_period: 48h
      tenant2:
        retention_period: 24h
      tenant3:
        retention_period: 72h
        retention_stream:
          - selector: '{k8s_cluster_name="K8S-PROD"}'
            priority: 1
            period: 24h
      tenant4:
        retention_period: 48h

  limits_config:
    retention_period: 3d
    ....

