Behavior when replacing Loki PVs or modifying replicas in simple scalable mode

We’re running Loki in simple scalable mode with single-store TSDB, and we’re looking to migrate to a new EKS cluster under the same AWS account:

  1. What is the behavior if the Loki PVs (loki-read, loki-write, and loki-backend) are deleted but the underlying S3 bucket stays the same? Will we lose any data?
  2. If we change the number of replicas for loki-read, loki-write, and loki-backend, do we need to take any additional steps to prevent data loss?

Our Loki Helm chart config looks as follows:

```yaml
loki:
  image:
    # -- The Docker registry
    registry: docker.io
    # -- Docker image repository
    repository: grafana/loki
    tag: 3.1.1
  auth_enabled: false
  storage:
    type: s3
    bucketNames:
      chunks: ${s3_bucket_name}
      ruler: ${s3_bucket_name}
      admin: ${s3_bucket_name}
    s3:
      region: ${region}
  pattern_ingester:
    enabled: true
  compactor:
    retention_enabled: true
    delete_request_store: "s3"
  schemaConfig:
    configs:
      - from: 2022-01-11
        index:
          period: 24h
          prefix: loki_ops_index_
        object_store: s3
        schema: v13
        store: tsdb
```

1. I don’t believe you need PVs for loki-read.
2. loki-write stores its write-ahead log (WAL) on the PVs, so you’ll want to gracefully terminate your loki-write containers during the migration by invoking /ingester/shutdown (see the Loki HTTP API documentation). A sketch of automating this is below the list.
3. loki-backend stores delete marker files on its PVs. Losing these is not a huge deal; the compactor will just miss deleting some chunks.
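
For reference, here’s a minimal sketch of wiring that shutdown call into the write pods as a preStop hook. This assumes your chart version exposes a `write.lifecycle` value and that wget is available in the loki image; verify both against your setup before relying on it:

```yaml
# Sketch only: check that your chart version supports `write.lifecycle`
# and that the loki image ships wget.
write:
  lifecycle:
    preStop:
      exec:
        command:
          - /bin/sh
          - -c
          # POST to the shutdown endpoint; flush=true flushes the WAL
          # to object storage before the container exits.
          - "wget -qO- --post-data='' 'http://localhost:3100/ingester/shutdown?flush=true'"
```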

My understanding is you can pretty freely scale loki-read and loki-backend up and down (if you scale loki-backend, make sure your rulers form a ring, or you’ll get duplicate alerts; see the sketch below). Scaling down loki-write is trickier, because you want to make sure whatever WAL data is on the PVs gets flushed first. I haven’t had to do this (we simply never scale loki-write down), but check the community Helm chart code; I suspect this may already be taken care of.
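
As a sketch of the ruler-ring piece, assuming your chart exposes `loki.rulerConfig` and you’re using memberlist (the chart’s default kvstore); verify the field names against the Loki ruler config reference for your version:

```yaml
# Sketch: have rulers join a ring and shard rule groups between them,
# so a scaled-up backend doesn't fire the same alert from every replica.
# Assumes memberlist, the simple scalable chart's default kvstore.
loki:
  rulerConfig:
    enable_sharding: true
    ring:
      kvstore:
        store: memberlist
```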

Got it, thanks! What happens if we don’t run the shutdown? My guess is we’d just lose the logs that were in flight, but the older data would be fine? (We want to automate this for many clusters, so skipping it would be easiest.)

If you don’t run the shutdown, you shouldn’t lose anything as long as the PVs are still around (the WAL is replayed when the pod comes back). If you intend to automate scaling of the writers, I would recommend running at least 2 replicas for the ingesters so each stream is replicated.
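
A sketch of what that could look like in the values, assuming the field names of the grafana/loki simple scalable chart (verify against your chart version):

```yaml
# Sketch: replicate each incoming stream across multiple ingesters so
# that one write pod's WAL is never the only copy of unflushed data.
loki:
  commonConfig:
    replication_factor: 3   # chart default; each stream is written to 3 ingesters
write:
  replicas: 3               # keep replicas >= replication_factor
```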