Service hangs when tracing is enabled


I have a very odd issue in one of my API services: when tracing is enabled, the service hangs and stops responding. I'm not sure how to debug it, as I have other services using the same tracing code and they work fine.
The services run in Kubernetes with the Istio service mesh, and Tempo (distributed mode) is deployed in the cluster. The tracing is based on Zipkin/OpenCensus.

endpoint, err := zipkin.NewEndpoint(serviceName, "host:port") // tried different ports here with no difference
reporter := zipkinhttp.NewReporter(tempoURL)
exporter := oczipkin.NewExporter(reporter, endpoint)

I have checked the code and couldn't find any problem, as it is the same as in the other API services that work fine. The containers are built from the same base image and there don't seem to be any restrictions on outbound communication. The issue shows up as soon as tracing is enabled, i.e. trace.ApplyConfig(trace.Config{DefaultSampler: trace.AlwaysSample()}), and the code tries to send traces to the tempo-distributor. There doesn't seem to be any communication between the app and the tempo-distributor: looking at its logs, there is no sign of it receiving any traces.
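For context, the full tracing setup looks roughly like the sketch below. The service name, host:port, and Tempo URL are placeholders for our actual values, and the package paths are the standard openzipkin/opencensus ones:

```go
package main

import (
	"log"

	oczipkin "contrib.go.opencensus.io/exporter/zipkin"
	openzipkin "github.com/openzipkin/zipkin-go"
	zipkinhttp "github.com/openzipkin/zipkin-go/reporter/http"
	"go.opencensus.io/trace"
)

func main() {
	// Local endpoint identifying this service in traces (host:port is a placeholder).
	endpoint, err := openzipkin.NewEndpoint("my-api", "10.0.0.1:8080")
	if err != nil {
		log.Fatalf("failed to create zipkin endpoint: %v", err)
	}

	// Reporter that ships spans to the Tempo distributor's Zipkin receiver
	// (URL is a placeholder for our in-cluster service address).
	reporter := zipkinhttp.NewReporter("http://tempo-distributor:9411/api/v2/spans")
	defer reporter.Close() // reporter must stay open for the lifetime of the process

	exporter := oczipkin.NewExporter(reporter, endpoint)
	trace.RegisterExporter(exporter)
	trace.ApplyConfig(trace.Config{DefaultSampler: trace.AlwaysSample()})

	// ... run the service ...
}
```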

Has anyone experienced similar issues? Any pointers or ideas on how to debug it? Workarounds?


This issue has been resolved.
The problem was on our end. During a refactor of our codebase, the tracing setup was moved out of main into a separate function, which caused the reporter to be closed when that function returned. The Zipkin reporter uses a channel to hand spans to a goroutine, which then sends them to the backend, Tempo in our case. Because the reporter had already been closed, that goroutine was gone and there was no receiver on the span channel, so the send blocked forever. That is what caused the hang.
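The mechanism can be shown with a minimal sketch (not the actual reporter code, just the same channel pattern): a background goroutine drains an unbuffered span channel until a quit signal, and once that goroutine has exited, any further send on the channel blocks.

```go
package main

import (
	"fmt"
	"time"
)

// trySend simulates the reporter: an unbuffered span channel drained by a
// background goroutine. If closeFirst is true, the goroutine is stopped
// before the send, mimicking a reporter that was closed too early.
// It returns true if the span was delivered, false if the send blocked.
func trySend(closeFirst bool) bool {
	spanC := make(chan string)  // unbuffered span channel
	quit := make(chan struct{}) // closed by "reporter.Close()"

	go func() {
		for {
			select {
			case <-spanC: // span would be batched and shipped to the backend here
			case <-quit: // Close() stops this loop
				return
			}
		}
	}()

	if closeFirst {
		close(quit)
		time.Sleep(10 * time.Millisecond) // give the loop time to exit
	}

	// With no receiver left, this send would block forever;
	// the timeout is only here to demonstrate the hang.
	select {
	case spanC <- "span":
		return true
	case <-time.After(100 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("closed reporter, span delivered:", trySend(true))  // false: the hang
	fmt.Println("live reporter, span delivered:", trySend(false))   // true
}
```

In the real service there is no timeout around the channel send, so the request goroutine simply blocks forever, which is exactly the hang we saw.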