Delete requests for logs in Loki can take weeks to complete

Hi community. I am looking for a reliable way to delete logs that may contain PII and have made their way into Loki. I can make a POST request to the api/v1/delete endpoint, and I see that a delete request stays pending for some time and then moves to the processed state. However, it can take a long time (weeks, even) for the specified logs to be fully deleted. We have 3 Loki backend pods, and from checking the logs in each, I can see that one of them is actually running the compactor.
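For reference, a minimal sketch of how such a delete request can be built. It assumes the documented /loki/api/v1/delete path with query, start, and end parameters and the X-Scope-OrgID tenant header; the gateway address, the selector, and the tenant value are hypothetical placeholders, not values from our setup.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_delete_url(base_url: str, query: str, start: datetime, end: datetime) -> str:
    """Build the URL for a Loki log-deletion request."""
    params = {
        "query": query,
        "start": int(start.timestamp()),  # Unix timestamp in seconds
        "end": int(end.timestamp()),
    }
    return f"{base_url.rstrip('/')}/loki/api/v1/delete?{urlencode(params)}"


def submit_delete(url: str, tenant: str) -> int:
    """POST the delete request and return the HTTP status code."""
    req = Request(url, method="POST", headers={"X-Scope-OrgID": tenant})
    with urlopen(req) as resp:
        return resp.status


url = build_delete_url(
    "http://loki-gateway",   # hypothetical gateway address
    '{app="web"} |= "ssn"',  # hypothetical selector and line filter
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 2, 1, tzinfo=timezone.utc),
)
print(url)
```

Calling submit_delete(url, "my-tenant") would send the request; it is left uncalled here so the sketch does not touch a live cluster.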

The system receives a large volume of logs.

From what I can understand from reading the docs, log deletion requests are handled when the compactor runs over a chunk. Does this mean that if, say, I put in a request to delete logs from 1-2 months ago that match a given query, those logs will not be deleted because the compactor has already run over those chunks? If not, why does it take weeks before the logs are actually deleted?

Is there some operation I can perform to “force” the compactor to run and delete the matching log lines? We need a way to reliably scrub any sensitive logs that make it into Loki, including old logs that we only discover later.

We run Loki in SimpleScalable deployment mode.

Also, I noticed some very weird behavior: if I curl the Loki backend gateway for delete requests, it does not show all of my delete requests. If I port-forward each backend pod and then curl the delete requests, I get different results from each pod. Do the delete requests not use a shared source? We set delete_request_store: s3, so it should use that, right?
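To make the discrepancy concrete, here is a rough sketch of the comparison I am doing by hand. It assumes that GET /loki/api/v1/delete returns a JSON list whose entries carry a request_id field; the pod names and sample IDs below are made up for illustration.

```python
import json
from urllib.request import Request, urlopen


def fetch_delete_requests(base_url: str, tenant: str) -> list[dict]:
    """List the delete requests as seen through one endpoint (e.g. a port-forward)."""
    req = Request(
        f"{base_url.rstrip('/')}/loki/api/v1/delete",
        headers={"X-Scope-OrgID": tenant},
    )
    with urlopen(req) as resp:
        return json.load(resp)


def missing_per_pod(results: dict[str, list[dict]]) -> dict[str, set[str]]:
    """For each pod, return the request IDs that other pods report but it does not."""
    ids = {pod: {r["request_id"] for r in reqs} for pod, reqs in results.items()}
    all_ids = set().union(*ids.values())
    return {pod: all_ids - seen for pod, seen in ids.items()}


# Made-up sample responses standing in for three port-forwarded backend pods.
sample = {
    "backend-0": [{"request_id": "a1"}, {"request_id": "b2"}],
    "backend-1": [{"request_id": "a1"}],
    "backend-2": [],
}
missing = missing_per_pod(sample)
print(missing)
```

With a shared store I would expect every set in the result to be empty; in practice, filling the dictionary from fetch_delete_requests against each port-forwarded pod gives non-empty sets.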

We have retention_period: 0s set.

Compactor config in helm chart:

    compactor:
      retention_enabled: true
      working_directory: /tmp/loki/compactor
      delete_request_store: s3
      delete_request_cancel_period: 2m
      retention_delete_delay: 1m
      # Add coordination settings for multi-pod deployments
      compaction_interval: 10m

I don’t really use the delete function, so I could be wrong. My understanding is that the delete request is processed once the delete_request_cancel_period has passed.

I would also double check and make sure deletion_mode is configured as filter-and-delete.

delete_request_cancel_period is set to 2 minutes, so that’s not the issue here. I forgot to attach it, but we do have this:

    runtimeConfig:
      deletion_mode: filter-and-delete

And logs do get deleted; it just sometimes takes weeks or months.