Best Practices and Built-In Support for Loki's S3 Smart Tiering and Log Retrieval

Hello Grafana Community,

I am currently implementing smart tiering for Loki's S3 bucket in my environment and plan to transition objects to the Glacier storage class after 60 days. I am seeking advice on best practices for retrieving logs after they have been archived.

I’ve encountered an issue when pulling the index object from the archive, receiving an error stating:

```
err="failed to get s3 object: InvalidObjectState: The operation is not valid for the object's storage class\n\tstatus code: 403
```

This seems to occur because the object's storage class is still Glacier, so a normal GET is rejected until the object has been restored. As a workaround, I re-upload the index back into the bucket, which resets its storage class. However, this workaround only covers the index object.
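For context, the alternative I have been looking at instead of re-uploading is S3's RestoreObject API, which makes an archived object temporarily readable in place. A rough, untested boto3-style sketch (the client is injected, and the bucket/key names are placeholders):

```python
# Sketch: ask S3 to temporarily restore an archived object instead of
# re-uploading it. `s3` can be a boto3 S3 client (boto3.client("s3")).
def request_restore(s3, bucket, key, days=7, tier="Standard"):
    """Issue a RestoreObject request so the archived copy becomes readable."""
    s3.restore_object(
        Bucket=bucket,
        Key=key,
        # Days: how long the restored copy stays available.
        # Tier: "Expedited" | "Standard" | "Bulk" (speed vs. cost trade-off).
        RestoreRequest={"Days": days, "GlacierJobParameters": {"Tier": tier}},
    )
```

With a real client this would be called as `request_restore(boto3.client("s3"), "<bucket>", "<index object key>")`. Note the restore is asynchronous: the GET only starts succeeding once the restore has completed.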

When it comes to the chunk objects under the fake/ prefix, I am unsure how to restore them all efficiently. There are a substantial number of files, and pulling and re-uploading each one does not seem practical.
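For the bulk case, the rough approach I have sketched (untested, boto3-style) is to paginate over the prefix and issue a restore request for anything still in an archive storage class; the client is injected and the tier/days values are just examples:

```python
def restore_prefix(s3, bucket, prefix, days=7, tier="Bulk"):
    """Request a restore for every archived object under `prefix`.

    Returns the number of restore requests issued. `s3` can be a
    boto3 S3 client; bucket and prefix values are illustrative.
    """
    restored = 0
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            # Only objects still in an archive tier need a restore request.
            if obj.get("StorageClass") not in ("GLACIER", "DEEP_ARCHIVE"):
                continue
            s3.restore_object(
                Bucket=bucket,
                Key=obj["Key"],
                RestoreRequest={
                    "Days": days,
                    "GlacierJobParameters": {"Tier": tier},
                },
            )
            restored += 1
        if not page.get("IsTruncated"):
            return restored
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

The "Bulk" tier is the cheapest but slowest restore option, which seemed like the right default when touching a large number of chunk files; a production version would also need to handle RestoreAlreadyInProgress errors on retries.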

Therefore, I am reaching out to ask if there are any best practices, tips, or even built-in features within Loki to support smart tiering and efficient log retrieval from the archive. Any insights or experiences shared will be greatly appreciated.

Looking forward to your responses.

I have not tried this, but do you have another retention policy that deletes chunk files after a certain period of time in Glacier? If so, it might be acceptable not to put index files into Glacier at all, since they are much smaller, and instead delete them later together with the chunk files (assuming they do get removed at some point).
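For example (untested), an S3 lifecycle configuration along these lines would transition only the chunk prefix to Glacier while expiring both prefixes later. The fake/ and index/ prefixes and the 365-day expiration here are assumptions based on a default single-tenant layout, not something I have verified against your setup:

```json
{
  "Rules": [
    {
      "ID": "archive-chunks",
      "Filter": { "Prefix": "fake/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 60, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    },
    {
      "ID": "expire-index",
      "Filter": { "Prefix": "index/" },
      "Status": "Enabled",
      "Expiration": { "Days": 365 }
    }
  ]
}
```

That way the index stays directly readable for its whole lifetime, and only the large chunk objects ever need a Glacier restore.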