I’d like to make use of AWS S3 Intelligent Tiering. I intend to keep 2 years of traces online, but would like to have only 1 month in standard storage and the rest in a slower access tier. This was easy to set up in Loki, so I’m looking to achieve the same in Tempo.
The biggest hurdle I see is that Tempo stores a file, index.json.gz, in the same directory (prefix) as the trace storage. This file seems to be regularly updated, and as far as I can see S3 doesn’t provide an easy way to exclude a single file from a lifecycle rule.
Is there a way to get this to work? Some options I’ve considered but can’t find a way to implement:
Exclude index.json.gz from the lifecycle rule in S3. No way to do this in S3.
Have Tempo tag all the trace storage files. I could then set up a rule that filters on the tags, but I don’t see that option in Tempo at the moment.
Move the trace storage down a level in the directory hierarchy. I could then set up a lifecycle rule based on a prefix like single-tenant/traces, which would ignore the index.json.gz file. This is the approach I took with Loki, which has a different file layout, and it works well.
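For illustration, here is a rough sketch of what the lifecycle rule for the tag-based and prefix-based options could look like if Tempo supported either. The rule IDs, the tag key/value `tempo-object=block`, and the 30-day transition to `INTELLIGENT_TIERING` are assumptions chosen for the example; Tempo does not currently produce that tag or that layout. The rule builder is plain Python, and only `apply_rules` needs boto3 and AWS credentials.

```python
from typing import Optional, Tuple


def build_lifecycle_rule(
    rule_id: str,
    days: int = 30,
    storage_class: str = "INTELLIGENT_TIERING",
    prefix: Optional[str] = None,
    tag: Optional[Tuple[str, str]] = None,
) -> dict:
    """Build one S3 lifecycle rule filtered by either a key prefix or an object tag."""
    if (prefix is None) == (tag is None):
        raise ValueError("pass exactly one of prefix or tag")
    if prefix is not None:
        rule_filter = {"Prefix": prefix}
    else:
        key, value = tag
        rule_filter = {"Tag": {"Key": key, "Value": value}}
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": rule_filter,
        # Move matching objects to the cheaper class once they are `days` old.
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


def apply_rules(bucket: str, rules: list) -> None:
    """Replace the bucket's lifecycle configuration (needs boto3 and credentials)."""
    import boto3  # imported lazily so the rule builder works without boto3

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": rules},
    )


if __name__ == "__main__":
    # Prefix option: hypothetical layout with trace blocks one level down.
    print(build_lifecycle_rule("tempo-by-prefix", prefix="single-tenant/traces/"))
    # Tag option: hypothetical tag that Tempo would have to set on block objects.
    print(build_lifecycle_rule("tempo-by-tag", tag=("tempo-object", "block")))
```

Either rule would leave index.json.gz alone, since it would neither carry the tag nor live under the deeper prefix.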
I did some quick digging and also could not find a way to do this. What does this lifecycle rule trigger off of? Creation date? If so, you could run a process that just deletes index.json.gz once a day or so. It will get recreated, and a component will fall back to polling in the event that it does not exist. Kind of a hack, but it should work.
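A minimal sketch of that daily cleanup, assuming the index lives at `<tenant>/index.json.gz` with no extra bucket prefix. The bucket name `my-tempo-bucket` is a placeholder, and the script would be run from cron or a similar scheduler:

```python
def tenant_index_key(tenant: str, index_name: str = "index.json.gz") -> str:
    """Return the assumed object key of a tenant's index file."""
    return f"{tenant.strip('/')}/{index_name}"


def delete_tenant_index(bucket: str, tenant: str) -> str:
    """Delete the tenant index so that it gets recreated as a fresh object."""
    import boto3  # imported lazily; needs AWS credentials at call time

    key = tenant_index_key(tenant)
    boto3.client("s3").delete_object(Bucket=bucket, Key=key)
    return key


if __name__ == "__main__":
    # Placeholder bucket name; "single-tenant" is the tenant from this thread.
    print("deleted", delete_tenant_index("my-tempo-bucket", "single-tenant"))
```

If Tempo is configured with a storage prefix, the key would need that prefix prepended.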
We recently merged this capability for GCS. If you would like to take a crack at it for S3 I would be happy to review/guide you through the change.
This would be a fairly large change and I would prefer pursuing one of the above.
Lifecycle rules can only trigger off the file creation date, unfortunately not the update date; otherwise this would be easily solved. As you say, that solution is a bit hacky, but feasible.
OK, I’ll have a think about that. It does seem a more elegant solution. I’m not a Go programmer, but I can give it a try.