Can Tempo deployments in different clusters share the same bucket?
Let’s imagine that we have Kubernetes clusters A and B. Both clusters have Tempo deployments configured to use the same bucket, and the same service is deployed in both clusters. So the question is: is it possible to distinguish traces from the service in cluster A from traces from the service in cluster B, in the Grafana UI or in a query?
In Prometheus/Thanos this is possible with external labels, which can carry the cluster name. In that case, metrics from the same service in different clusters have different values for the external label. Is there a way to do the same for Tempo?
I feel like I’m seeing two questions here:
- Can two Tempo deployments share the same bucket?
Technically yes, but it will require some clever config. There may also be some rough edges that would require minor code changes to get this 100%. Two things would have to be configured for this to work properly:
- Compactors from both clusters would need to share a ring so that they shard jobs correctly. In fact it may make more sense to just scale compactors in one cluster to 0 and only have one cluster responsible for compaction.
- Queriers would have to see the ingesters from both clusters to retrieve recent traces correctly.
The one thing I’m concerned about is what happens if a trace exists in ingesters in both clusters. I think the queriers would see (replication_factor / 2) + 1 ingesters return success and would return early, possibly before the ingesters holding the rest of the trace have responded.
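To make that concern concrete, here is a rough Python sketch of the quorum arithmetic. This is a simplified model and not Tempo's actual querier code: `quorum`, `query_until_quorum`, the span IDs, and the arrival order are all made up for illustration, and I'm assuming the division in the formula is integer division.

```python
def quorum(replication_factor: int) -> int:
    """Successful ingester responses needed: (rf / 2) + 1, with integer division."""
    return replication_factor // 2 + 1


def query_until_quorum(responses, replication_factor):
    """Merge span IDs from ingester responses in arrival order, stopping at quorum."""
    needed = quorum(replication_factor)
    spans = set()
    for answered, response in enumerate(responses, start=1):
        spans |= response
        if answered >= needed:
            # Quorum reached: any later responses are never read.
            break
    return spans


# Hypothetical arrival order: two cluster A ingesters answer before cluster B's.
cluster_a = {"span-1", "span-2"}
cluster_b = {"span-3"}
partial = query_until_quorum([cluster_a, cluster_a, cluster_b], replication_factor=3)
print(sorted(partial))  # ['span-1', 'span-2']
```

With a replication factor of 3 the quorum is 2, so in this model the querier stops after the two cluster A responses and `span-3` from cluster B is missing from the result.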
- Can a trace contain labels that distinguish what cluster it came from?
Tempo proper does not have the capability to add tags to spans, but both the OTel Collector and the Grafana Agent do. So you could use either of these tools to add a cluster tag to spans in your trace pipeline.
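As a toy illustration of the idea, the Python sketch below models spans as plain dicts. It does not use the OTel Collector or Grafana Agent APIs: `add_cluster_tag`, `spans_from_cluster`, the `cluster` attribute key, and the sample span are all assumptions for the example. In a real pipeline the tagging would happen in the collector or agent config instead.

```python
def add_cluster_tag(spans, cluster):
    """Return copies of the spans with a `cluster` attribute set, overwriting any existing value."""
    return [
        {**span, "attributes": {**span.get("attributes", {}), "cluster": cluster}}
        for span in spans
    ]


def spans_from_cluster(spans, cluster):
    """Keep only the spans whose `cluster` attribute matches."""
    return [s for s in spans if s["attributes"].get("cluster") == cluster]


# The same hypothetical service emits an identical span in both clusters.
span = {"name": "GET /orders", "attributes": {"service.name": "checkout"}}
from_a = add_cluster_tag([span], "cluster-a")
from_b = add_cluster_tag([span], "cluster-b")

# Both pipelines write to the same backend; the tag is what tells them apart.
stored = from_a + from_b
only_a = spans_from_cluster(stored, "cluster-a")
print(only_a[0]["attributes"])
```

Once every span carries the tag, the two deployments' traces stay distinguishable even though the service name and the bucket are identical.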
Thanks for the response @joeelliott