Need to combine two separate traces under the same trace_id

adityawdubey · October 17, 2024, 9:15pm

I’m currently working on a project using OpenTelemetry to instrument FastAPI services, and we’re using Grafana Tempo as the tracing backend. I have a scenario where I need to combine two separate traces under the same trace_id or represent them as a single trace in Grafana.

I’ve gone through several resources, including:

• How to use the OpenTelemetry Collector to transform and manipulate traces before sending them to Tempo

I’m still unsure about the best way to merge two separate traces under the same trace_id using a transformation pipeline.

Specifically, I would appreciate insights on:
• Whether it is feasible to configure a transformation pipeline in Grafana Tempo or the OpenTelemetry Collector to handle merging traces at the collector level.
• Any existing mechanisms or best practices for joining traces or managing context propagation across multiple requests that might help solve this problem.

dawiddebowski · October 18, 2024, 4:58am

I’m no tracing expert, but as far as I know you don’t need to do anything. What do you mean by separate traces? That they are produced by different applications? Spans are connected by sharing the same trace_id, so when you query in Tempo, they will be displayed together. Also, each span (aside from the root span) should have a parent_id, which creates the trace structure.

adityawdubey · October 18, 2024, 7:06am

By separate traces, I mean two distinct spans produced by different HTTP requests (in the same service) that naturally generate their own trace IDs. What I’m trying to achieve is to manually link these spans by making them share the same trace_id, or to have them appear under the same trace structure in Grafana.

Would you know if there’s a way to force this through some configuration in the OpenTelemetry Collector or if there’s another way?

dawiddebowski · October 18, 2024, 1:04pm

I don’t think that’s possible (doesn’t mean it isn’t though). Since by definition:

Distributed tracing lets you observe requests as they propagate through complex, distributed systems

I take it that one request = one trace. What is your use case for that? Linking two different HTTP requests seems a bit wild to me tbh, especially considering you could have situations where only one of the requests was executed, etc.

Finally, to answer your question: if you are sure that those two requests always happen together, you could create a proxy that would originate the trace (the spans from both subservice requests would then be children of the proxy’s span).

adityawdubey · October 18, 2024, 2:51pm

In distributed tracing, a single trace can consist of multiple requests, with each request being represented by its own span. For example, when a request goes through multiple services, all the spans from those services share the same trace_id but have different span_ids, which together form a complete trace.

In my case, both HTTP requests are part of the same logical transaction—they always happen together as part of a workflow (one triggers an action, and the other checks its status). That’s why it would make sense to have them under the same trace for observability purposes.

jangaraj · October 18, 2024, 3:15pm

Correct. So configure your instrumentation with proper context propagation, so that your spans share the same trace_id (because it is a single transaction). How do you want to reconcile traces outside of instrumentation?

adityawdubey · October 18, 2024, 3:42pm

Apologies for the confusion. To clarify, these are two separate traces, but I’d like to explicitly show them (or the spans within them) under the same trace_id.

Currently, context propagation is handled automatically through OpenTelemetry instrumentation in FastAPI and httpx, but since these are distinct traces, I’m looking for a way to merge them under one trace_id. Is this possible via a custom processor in the OpenTelemetry Collector, or perhaps directly in Grafana?

jangaraj · October 18, 2024, 4:03pm

How do you want to do that?

You have billions of traces. How would anyone (e.g. the collector) know which two traces should be stitched together? Will you maintain a mapping table where you write these IDs explicitly? What if you have millions of transactions? Will you maintain a config with 2 million trace IDs?
These questions should tell you that you are doing something wrong in your code.

Your services should work with context propagation properly = they will use the same trace id for the same transaction.

See some doc. E. g.

dawiddebowski · October 18, 2024, 4:24pm

Since they are two parts of the same transaction, should they be two different HTTP requests? That aside, how do you / your users execute the HTTP requests? If it’s anything other than curl, I think it could pass context (Kong / nginx / a frontend can generate the trace_id and pass it in a header; I think that’s how the context is propagated).
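For reference, the header used by OpenTelemetry’s default propagator is the W3C Trace Context `traceparent` header, formatted as `version-traceid-parentid-flags`. Below is a minimal standard-library sketch of how a caller could generate it and how the next hop keeps the same trace_id. The function names are made up for illustration; in a real service the OpenTelemetry SDK does this for you.

```python
import re
import secrets

# W3C Trace Context layout: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent() -> str:
    """Start a new trace: random 16-byte trace id, 8-byte span id, sampled flag."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def parse_traceparent(header: str) -> tuple[str, str, str]:
    """Return (trace_id, parent_span_id, flags); raise ValueError if malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    _version, trace_id, parent_id, flags = match.groups()
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError("all-zero trace id or span id is invalid")
    return trace_id, parent_id, flags


def child_traceparent(incoming: str) -> str:
    """Header for the next hop: same trace id, fresh span id, same flags."""
    trace_id, _parent_id, flags = parse_traceparent(incoming)
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


if __name__ == "__main__":
    root = new_traceparent()
    print("caller sends:  ", root)
    print("next hop sends:", child_traceparent(root))
```

Any client that sends this header on both requests (a gateway, a frontend, or a script) would make both server spans land in the same trace.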

vedavyasy · October 18, 2024, 8:34pm

I was working on a similar problem of joining two traces with distinct trace_ids.

My use case was like this:

A user makes an HTTP request to /create_job that creates a job. Suppose job_id=1, trace_id=1, span_id=1.
The job is then sent to an agent running on the client side.
After finishing the job, the agent makes an HTTP request to the same microservice, but on a different route, /submit_status, to submit a status message. Suppose job_id=1, trace_id=2, span_id=2.

As a workaround, I was able to build a custom dashboard with a search bar that takes job_id as input and returns a table with two rows, trace_id=1 and trace_id=2. Upon clicking a trace_id, I render the spans of that trace in a Traces visualization in a side panel. It works great.

Ideally, I would prefer to see one trace_id with 2 spans. This trace should represent job creation and status.

Any help on how to do this?
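One direction that might work here, following the context-propagation advice earlier in the thread: ship the trace context of the /create_job request to the agent together with the job, and have the agent send it back as a `traceparent` header on /submit_status. The sketch below is standard-library only and purely illustrative; the function names and the job payload shape are assumptions, not anything from the services described above. It also assumes the agent code can be changed to echo the header.

```python
import secrets


def create_job(job_id: int, current_traceparent: str) -> dict:
    """/create_job side: include the active trace context in the job payload."""
    return {"job_id": job_id, "traceparent": current_traceparent}


def agent_status_headers(job: dict) -> dict:
    """Agent side: echo the stored context back as a header on /submit_status."""
    return {"traceparent": job["traceparent"]}


def trace_id_for_request(headers: dict) -> str:
    """/submit_status side: reuse the caller's trace id, else start a new trace."""
    header = headers.get("traceparent")
    if header:
        parts = header.split("-")
        # Expected layout: version-traceid-parentid-flags
        if len(parts) == 4 and len(parts[1]) == 32:
            return parts[1]
    return secrets.token_hex(16)


if __name__ == "__main__":
    # Example header value in W3C Trace Context format.
    tp = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    job = create_job(1, tp)
    print(trace_id_for_request(agent_status_headers(job)))
```

With OpenTelemetry’s FastAPI instrumentation, the incoming `traceparent` header is normally picked up without a hand-written equivalent of `trace_id_for_request`, so the agent-side change may be all that is needed. The status span would then show up in the same trace as the job creation span, possibly long after the original request finished.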
