Hello everyone,
We've installed Loki in SimpleScalable mode connected to an on-prem MinIO instance, and we're now trying to implement data retention with multiple tenants enabled.
First, we created the ConfigMap, which looks like this:
extraObjects:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: loki-tenant-overrides-rules
      labels:
        app.kubernetes.io/name: loki
    data:
      loki-tenant-overrides-rules.yaml: |
        overrides:
          "tenant1":
            retention_period: 60d
            retention_stream:
              - selector: '{namespace="prod"}'
                priority: 2
                period: 30d
              - selector: '{container="loki"}'
                priority: 1
                period: 15d
          "tenant2":
            retention_period: 24h
            retention_stream:
              - selector: '{namespace="prod"}'
                priority: 1
                period: 30d
          "tenant3":
            retention_period: 72h
            retention_stream:
              - selector: '{k8s.cluster.name="K8S-PROD"}'
                priority: 1
                period: 24h
          "tenant4":
            retention_period: 60d
            retention_stream:
              - selector: '{namespace="prod"}'
                priority: 1
                period: 30d
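A quick note on how these retention_stream rules interact, since they are easy to misread: per the Loki retention documentation, when a stream matches several selectors the rule with the highest priority wins, and retention_period is only the fallback for streams that match no selector. For tenant1 above, that would mean roughly:

# tenant1, assuming that reading of the rules is correct:
#   {namespace="prod", container="loki"}  -> 30d  (both selectors match, priority 2 wins)
#   {container="loki"} only               -> 15d
#   a stream matching neither selector    -> 60d  (falls back to retention_period)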
Then, we’ve enabled retention like this:
loki:
  ...
  compactor:
    working_directory: /var/loki/compactor
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
    delete_request_store: s3
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override
  limits_config:
    retention_period: 180d
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml
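One detail that is easy to miss: Loki re-reads the file referenced by per_tenant_override_config on a timer rather than only at startup. As far as I know the interval is controlled by per_tenant_override_period in limits_config and defaults to 10s, so it normally does not need changing, but a sketch of making it explicit would be:

loki:
  limits_config:
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml
    # Reload interval for the override file (10s is believed to be the default).
    per_tenant_override_period: 10s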
The extraVolumes and extraVolumeMounts are added to the backend, read, and write blocks as well, like this:
backend:
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override
But currently retention is not working at all; the loki-backend pod logs barely mention retention, and using Grafana with the Loki data source we can still see older data.
Any tips or information about this issue?
Thanks in advance!
Additional info: I tried changing the global retention to 3d, and now I can see these new logs:
level=info ts=2025-01-29T14:41:04.68898398Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161664675398685
level=info ts=2025-01-29T14:41:04.841865717Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161664830459215
level=info ts=2025-01-29T14:41:04.996596267Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161664982151030
level=info ts=2025-01-29T14:41:05.151796956Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665137797613
level=info ts=2025-01-29T14:41:05.29193463Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665276436901
level=info ts=2025-01-29T14:41:05.443950069Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665431913435
level=info ts=2025-01-29T14:41:05.598116587Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665585073313
level=info ts=2025-01-29T14:41:05.745340614Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665727416898
level=info ts=2025-01-29T14:41:05.889544376Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161665876698696
level=info ts=2025-01-29T14:41:06.035831718Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161666024056538
level=info ts=2025-01-29T14:41:06.166878941Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161666153997601
level=info ts=2025-01-29T14:41:06.284638672Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161666272582873
level=info ts=2025-01-29T14:41:06.385079794Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738161666374405219
level=info ts=2025-01-29T14:56:06.422398997Z caller=expiration.go:78 msg="overall smallest retention period 1737903366.422, default smallest retention period 1737903366.422"
level=info ts=2025-01-29T14:56:06.524920866Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738162566520269991
level=info ts=2025-01-29T14:56:06.561257291Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738162566556003795
level=info ts=2025-01-29T15:11:06.422457051Z caller=expiration.go:78 msg="overall smallest retention period 1737904266.422, default smallest retention period 1737904266.422"
level=info ts=2025-01-29T15:11:06.532736938Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738163466526734583
level=info ts=2025-01-29T15:11:06.571724736Z caller=marker.go:77 msg="mark file created" file=/var/loki/compactor/retention/s3_2024-10-01/markers/1738163466565493428
From what I gathered, this log indicates that Loki's compactor component is processing retention periods. The "overall smallest retention period" and "default smallest retention period" are both set to 1737903366.422 seconds (approximately 55 years), so something seems wrong.
pooh
January 29, 2025, 3:55pm
I'm no expert on this, but I think it's more significant that 1737903366 is the epoch timestamp for Sun Jan 26 14:56:06 GMT 2025, which is exactly 3 days (your configured global retention) before that log line's own timestamp, so it looks like a cutoff time rather than a duration.
Antony.
Thanks for the info, Antony. Checking today we see that the global retention is working, which confirms that the per-tenant one sadly is not.
Can you confirm the override configuration is actually mounted in your backend container?
Hi,
If I check the backend pods I can see the following:
volumeMounts:
- mountPath: /etc/loki/config
  name: config
- mountPath: /etc/loki/runtime-config
  name: runtime-config
- mountPath: /tmp
  name: tmp
- mountPath: /var/loki
  name: data
- mountPath: /rules
  name: sc-rules-volume
- mountPath: /etc/loki/config/override
  name: loki-tenant-overrides-rules
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
  name: kube-api-access-00000
  readOnly: true
....
volumes:
- name: data
  persistentVolumeClaim:
    claimName: data-loki-backend-0
- emptyDir: {}
  name: tmp
- configMap:
    defaultMode: 420
    items:
    - key: config.yaml
      path: config.yaml
    name: loki
  name: config
- configMap:
    defaultMode: 420
    name: loki-runtime
  name: runtime-config
- emptyDir: {}
  name: sc-rules-volume
- configMap:
    defaultMode: 420
    items:
    - key: loki-tenant-overrides-rules.yaml
      path: loki-tenant-overrides-rules.yaml
    name: loki-tenant-overrides-rules
  name: loki-tenant-overrides-rules
Thanks!
No, I meant actually check the container.
Sorry, my bad.
Yes, if I go into the container I can see the file, and its content is correct:
/etc/loki/config/override $ cat loki-tenant-overrides-rules.yaml
overrides:
  "tenant1":
    retention_period: 48h
  "tenant2":
    retention_period: 24h
  "tenant3":
    retention_period: 72h
    retention_stream:
      - selector: '{k8s.cluster.name="K8S-PROD"}'
        priority: 1
        period: 24h
  "tenant4":
    retention_period: 48h
The values are different from yesterday because I've been trying things in the meantime.
Interesting. I don't see anything obviously wrong. Are you running Loki in simple scalable mode, or microservices mode? Can you also share your entire Loki configuration, please?
porabot
January 31, 2025, 8:40am
Hi, I'm running Loki in SimpleScalable mode; I'm handling the migration from an ELK stack to a Grafana + MinIO/Loki stack with OpenTelemetry.
Do you think the issue could be with SimpleScalable? When we started we were using microservices mode, but then we were told to use SimpleScalable, so we switched to that.
Here's the entire YAML configuration:
global:
  dnsService: "rke2-coredns-rke2-coredns"
extraObjects:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: loki-tenant-overrides-rules
      labels:
        app.kubernetes.io/name: loki
    data:
      loki-tenant-overrides-rules.yaml: |
        overrides:
          "tenant1":
            retention_period: 48h
          "tenant2":
            retention_period: 24h
          "tenant3":
            retention_period: 72h
            retention_stream:
              - selector: '{k8s.cluster.name="K8S-PROD"}'
                priority: 1
                period: 24h
          "tenant4":
            retention_period: 48h
memcached:
  containerSecurityContext:
    seccompProfile:
      type: RuntimeDefault
memcachedExporter:
  containerSecurityContext:
    seccompProfile:
      type: RuntimeDefault
## disable canary
test:
  enabled: false
lokiCanary:
  enabled: false
ruler:
  enabled: false
loki:
  persistence:
    enabled: true
    storageClassName: "standard"
    accessModes:
      - ReadWriteOnce
    size: 20Gi
    annotations: {}
  compactor:
    working_directory: /var/loki/compactor
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
    delete_request_store: s3
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override
  distributor:
    otlp_config:
      default_resource_attributes_as_index_labels:
        - "application.environment"
        - "application.name"
        - "application.component"
        - "server.hostname"
        - "log.file.name"
        - "k8s.cluster.name"
        - "k8s.container.name"
        - "k8s.cronjob.name"
        - "k8s.job.name"
        - "k8s.namespace.name"
        - "k8s.node.name"
        - "k8s.pod.name"
  auth_enabled: true
  revisionHistoryLimit: 1
  analytics: #disable messages to stats.grafana.com
    reporting_enabled: false
  limits_config:
    retention_period: 3d
    discover_log_levels: false
    ingestion_rate_mb: 32
    ingestion_burst_size_mb: 64
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml
  schemaConfig:
    configs:
      - from: 2024-10-01
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  ingester:
    chunk_encoding: snappy
  tracing:
    enabled: false
  querier:
    max_concurrent: 16
  storage:
    s3:
      endpoint: http://minio.loki:80
      accessKeyId: ${StorageaccessKeyId}
      secretAccessKey: ${StoragesecretAccessKey}
      s3ForcePathStyle: true
      insecure: true
      http_config:
        insecure_skip_verify: true
    bucketNames:
      chunks: "chunks"
      ruler: "ruler"
      admin: "admin"
  containerSecurityContext:
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
    seccompProfile:
      type: RuntimeDefault
gateway:
  containerSecurityContext:
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
    seccompProfile:
      type: RuntimeDefault
  ingress:
    enabled: false
  basicAuth:
    enabled: true
    username: ${basicAuthUsername}
    password: ${basicAuthPassword}
deploymentMode: SimpleScalable
ingester:
  replicas: 0
querier:
  replicas: 0
queryFrontend:
  replicas: 0
queryScheduler:
  replicas: 0
distributor:
  replicas: 0
compactor:
  replicas: 0
indexGateway:
  replicas: 0
bloomPlanner:
  replicas: 0
bloomBuilder:
  replicas: 0
bloomGateway:
  replicas: 0
minio:
  enabled: false
sidecar:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
    seccompProfile:
      type: RuntimeDefault
backend:
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override
  limits_config:
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml
  replicas: 3
  autoscaling:
    # -- Enable autoscaling for the backend.
    enabled: false
    # -- Minimum autoscaling replicas for the backend.
    minReplicas: 1
    # -- Maximum autoscaling replicas for the backend.
    maxReplicas: 3
  extraEnv:
    - name: StorageaccessKeyId
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StorageaccessKeyId
    - name: StoragesecretAccessKey
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StoragesecretAccessKey
  extraArgs:
    - "-config.expand-env=true"
  persistence:
    volumeClaimsEnabled: true
    size: 10Gi
    storageClass: "nfs-client"
  containerSecurityContext:
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
    seccompProfile:
      type: RuntimeDefault
read:
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override
  limits_config:
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml
  replicas: 3
  autoscaling:
    enabled: false
    # -- Minimum autoscaling replicas for the read
    minReplicas: 1
    # -- Maximum autoscaling replicas for the read
    maxReplicas: 3
  persistence:
    volumeClaimsEnabled: true
    size: 10Gi
    storageClass: "nfs-client"
  seccompProfile:
    type: RuntimeDefault
  extraEnv:
    - name: StorageaccessKeyId
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StorageaccessKeyId
    - name: StoragesecretAccessKey
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StoragesecretAccessKey
  extraArgs:
    - "-config.expand-env=true"
    - "-querier.multi-tenant-queries-enabled"
write:
  extraVolumes:
    - name: loki-tenant-overrides-rules
      configMap:
        name: loki-tenant-overrides-rules
        items:
          - key: loki-tenant-overrides-rules.yaml
            path: loki-tenant-overrides-rules.yaml
        defaultMode: 420
  extraVolumeMounts:
    - name: loki-tenant-overrides-rules
      mountPath: /etc/loki/config/override
  limits_config:
    per_tenant_override_config: /etc/loki/config/override/loki-tenant-overrides-rules.yaml
  replicas: 3
  autoscaling:
    # -- Enable autoscaling for the write.
    enabled: false
    # -- Minimum autoscaling replicas for the write.
    minReplicas: 1
    # -- Maximum autoscaling replicas for the write.
    maxReplicas: 3
  persistence:
    volumeClaimsEnabled: true
    size: 10Gi
    storageClass: "nfs-client"
  seccompProfile:
    type: RuntimeDefault
  extraEnv:
    - name: StorageaccessKeyId
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StorageaccessKeyId
    - name: StoragesecretAccessKey
      valueFrom:
        secretKeyRef:
          name: grafana-loki-secret
          key: StoragesecretAccessKey
  extraArgs:
    - "-config.expand-env=true"
singleBinary:
  replicas: 0
We use simple scalable mode, and I can assure you that's not your problem.
Try putting your override configuration in the main Loki configuration (pick just one tenant for testing purposes) and see if that works. Also try enabling debug logging for the backend target and see what you get.
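A minimal sketch of what enabling debug logging could look like in the Helm values; this assumes the chart version in use passes loki.server straight through to the rendered Loki config:

loki:
  server:
    # Assumed passthrough to Loki's server block; revert to info once done troubleshooting.
    log_level: debug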
Also, I just noticed you have 3 replicas for the backend; try lowering that to 1 and see if that works. If it does, then try adding common.compactor_address to your configuration (this should be the internal address of your backend target), then increase the replica count back to 3 and see if that works.
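A minimal sketch of how that could be set via the chart's structuredConfig passthrough; the service name and namespace below are assumptions and must be adapted to the actual release and namespace:

loki:
  structuredConfig:
    common:
      # Hypothetical address: point this at the backend Service of your release.
      compactor_address: http://loki-backend.loki.svc.cluster.local:3100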
porabot
February 5, 2025, 1:27pm
Hi,
We've been able to make retention work by switching to runtimeConfig, like this:
loki:
  ...
  runtimeConfig:
    overrides:
      tenant1:
        retention_period: 48h
      tenant2:
        retention_period: 24h
      tenant3:
        retention_period: 72h
        retention_stream:
          - selector: '{k8s_cluster_name="K8S-PROD"}'
            priority: 1
            period: 24h
      tenant4:
        retention_period: 48h
  limits_config:
    retention_period: 3d
  ....
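For context on why this works (a reading of the chart rather than something stated explicitly in the thread): loki.runtimeConfig is rendered into the loki-runtime ConfigMap that every pod already mounts at /etc/loki/runtime-config, as visible in the backend pod spec shared earlier, so no extra volumes, mounts, or per-target limits_config settings are needed. Assuming that reading is correct, the rendered file would look roughly like this:

# Assumed rendering of loki.runtimeConfig into the loki-runtime ConfigMap
# (file name assumed to be runtime-config.yaml)
overrides:
  tenant1:
    retention_period: 48h
  tenant2:
    retention_period: 24h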