Need to combine two separate traces under the same trace_id

I’m currently working on a project using OpenTelemetry to instrument FastAPI services, and we’re using Grafana Tempo as the tracing backend. I have a scenario where I need to combine two separate traces under the same trace_id or represent them as a single trace in Grafana.

I’ve gone through several resources, including:

How to use the OpenTelemetry Collector to transform and manipulate traces before sending them to Tempo? I’m still unsure about the best way to merge two separate traces under the same trace_id using a transformation pipeline.

Specifically, I would appreciate insights on:
• Whether it is feasible to configure a transformation pipeline in Grafana Tempo or the OpenTelemetry Collector to handle merging traces at the collector level.
• Any existing mechanisms or best practices for joining traces or managing context propagation across multiple requests that might help solve this problem.

I’m no tracing expert, but as far as I know, you don’t need to do anything. What do you mean by separate traces? That they are produced by different applications? Spans are connected by a shared trace_id, so when you query in Tempo, they will be displayed together as one trace. Also, each span (aside from the root span) should have a parent_id, which creates the trace structure.

By separate traces, I mean two distinct spans produced by different HTTP requests (in the same service) that naturally generate their own trace IDs. What I’m trying to achieve is to manually link these spans by making them share the same trace_id, or to have them appear under the same trace structure in Grafana.

Would you know if there’s a way to force this through some configuration in the OpenTelemetry Collector or if there’s another way?

I don’t think that’s possible (doesn’t mean it isn’t though). Since by definition:

Distributed tracing lets you observe requests as they propagate through complex, distributed systems

I take it that one request = one trace. What is your use case for that? Linking two different HTTP requests seems a bit wild to me tbh :smile: especially considering you could have situations like only one of the requests was executed, etc.

Finally answering your question, if you are sure that those two requests are always together or something like that, you could create a proxy that would originate the trace (both of the traces from subservices would be children of the proxy’s span).

In distributed tracing, a single trace can consist of multiple requests, with each request being represented by its own span. For example, when a request goes through multiple services, all the spans from those services share the same trace_id but have different span_ids, which together form a complete trace.
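To make the trace_id / span_id / parent_id relationship concrete, here is a minimal sketch in plain Python. It is not how Tempo or OpenTelemetry store spans internally; the `Span` class, the helper functions, and the sample IDs are all hypothetical and only illustrate how spans sharing a trace_id form one trace and how parent_id gives it a tree structure.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    # Hypothetical, simplified span record (real spans carry much more data).
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span of a trace
    name: str


def group_by_trace(spans):
    """Group spans by trace_id, roughly the way a backend assembles a trace."""
    traces = defaultdict(list)
    for span in spans:
        traces[span.trace_id].append(span)
    return dict(traces)


def children_of(spans, parent_id):
    """Return the names of spans whose parent_id matches, in input order."""
    return [s.name for s in spans if s.parent_id == parent_id]


# Made-up sample data: three spans in trace t1, one unrelated span in trace t2.
spans = [
    Span("t1", "a", None, "GET /checkout"),
    Span("t1", "b", "a", "call payment service"),
    Span("t1", "c", "b", "POST /pay"),
    Span("t2", "d", None, "GET /status"),
]
traces = group_by_trace(spans)
```

In this sketch, `GET /status` ends up in its own trace simply because it carries a different trace_id, which is the situation described in this thread.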

In my case, both HTTP requests are part of the same logical transaction—they always happen together as part of a workflow (one triggers an action, and the other checks its status). That’s why it would make sense to have them under the same trace for observability purposes.

Correct. So configure your instrumentation with proper context propagation, so that your spans share the same trace_id (because it is a single transaction). How do you want to reconcile traces outside of instrumentation?

Apologies for the confusion. To clarify, these are two separate traces, but I’d like to explicitly show them (or the spans within them) under the same trace_id.

Currently, context propagation is handled automatically through OpenTelemetry instrumentation in FastAPI and httpx, but since these are distinct traces, I’m looking for a way to merge them under one trace_id. Is this possible via a custom processor in the OpenTelemetry Collector, or perhaps directly in Grafana?

How do you want to do that?

You have billions of traces. How would anyone (e.g. the collector) know which two traces should be stitched together? Will you maintain a mapping table where you write these IDs explicitly? What if you have millions of transactions? Will you maintain a config with 2 million trace IDs?
Questions like these should tell you that you are doing something wrong in your code.

Your services should propagate context properly, meaning they use the same trace ID for the same transaction.

See some doc. E. g.

Since they are two parts of the same transaction, should they be two different HTTP requests? That aside, how do your users (or you) execute the HTTP requests? If it’s anything other than curl, I think it could pass the context: Kong, nginx, or a frontend can generate the trace_id and pass it in a header (I think that’s how the context is propagated).
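For reference, the header in question is usually the W3C Trace Context `traceparent` header, formatted as `version-traceid-parentid-flags` in lowercase hex. Below is a rough stdlib-only sketch of what "passing the context" means at the header level. The function names are made up for illustration, and this is not the OpenTelemetry implementation; in a real FastAPI/httpx setup the instrumentation (or `inject` / `extract` from `opentelemetry.propagate`) handles this for you.

```python
import re
import secrets

# W3C Trace Context header: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent():
    """Start a new trace: random 16-byte trace id, 8-byte span id, sampled flag."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def parse_traceparent(header):
    """Return (trace_id, span_id, flags), or None if the header is malformed."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    _version, trace_id, span_id, flags = match.groups()
    # All-zero ids are invalid per the W3C spec.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return trace_id, span_id, flags


def continue_trace(incoming_headers):
    """Build outgoing headers that stay in the caller's trace, or start a new one."""
    parsed = parse_traceparent(incoming_headers.get("traceparent", ""))
    if parsed is None:
        return {"traceparent": new_traceparent()}
    trace_id, _parent_span_id, flags = parsed
    # Same trace id and flags, fresh span id for this hop.
    return {"traceparent": f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"}
```

If whatever issues both requests (gateway, frontend, client) sends the same trace id in this header, the instrumented service continues that trace instead of starting a new one.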

I was working on a similar problem of joining two traces with distinct trace_ids.

My use case was like this:

  1. A user makes HTTP request on /create_job that creates a job. Suppose, job_id=1, trace_id=1, span_id=1
  2. The job is then sent to an agent running on client side
  3. After finishing the job, agent makes HTTP request on same microservice but a different route /submit_status to submit a status message. Suppose, job_id=1, trace_id=2, span_id=2

As a workaround, I was able to build a custom dashboard with a search bar that takes job_id as input and returns a table of 2 rows, trace_id=1 and trace_id=2. Upon clicking a trace_id, I render the spans of that trace in a Traces visualization in a side panel. It works great.

Ideally, I would prefer to see one trace_id with 2 spans. This trace should represent job creation and status.

Any help on how to do this?
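One possible direction, following the context propagation advice earlier in this thread: save the trace context when the job is created, ship it to the agent with the job, and have the agent send it back as a `traceparent` header on /submit_status. The sketch below is plain Python that only simulates the flow. It assumes you control the agent and can make it echo the header; `JOBS`, `create_job`, `agent_run`, and `submit_status` are hypothetical names, not part of any library. With OpenTelemetry for Python, the save and restore steps would roughly correspond to `inject` and `extract` from `opentelemetry.propagate`.

```python
import secrets

JOBS = {}  # hypothetical in-memory job store: job_id -> job record


def create_job(job_id):
    """Handle /create_job: start a trace and save its context alongside the job."""
    trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    JOBS[job_id] = {
        "traceparent": f"00-{trace_id}-{span_id}-01",
        "status": "pending",
    }
    # The job payload sent to the agent carries the saved trace context.
    return {"job_id": job_id, "traceparent": JOBS[job_id]["traceparent"]}


def agent_run(job_payload):
    """Agent side: do the work, then echo the trace context back as a header."""
    headers = {"traceparent": job_payload["traceparent"]}
    body = {"job_id": job_payload["job_id"], "status": "done"}
    return headers, body


def submit_status(headers, body):
    """Handle /submit_status: reuse the incoming trace id, create a child span id."""
    _version, trace_id, parent_span_id, _flags = headers["traceparent"].split("-")
    JOBS[body["job_id"]]["status"] = body["status"]
    return {
        "trace_id": trace_id,
        "span_id": secrets.token_hex(8),
        "parent_id": parent_span_id,
    }


payload = create_job(1)
status_span = submit_status(*agent_run(payload))
```

Here the /submit_status span shares the trace_id of the /create_job span and points at it as its parent, so both would belong to one trace rather than two.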