Missing traces in Tempo

Hi, I deployed Tempo with S3 and also set up monitoring for it. When I query traces and click on one, sometimes I get its spans and sometimes I don't. The error that shows up is:


I followed the troubleshooting page and found that my tempo_receiver_refused_spans is not 0.

My tempo-distributor logs look like this:

{"caller":"rate_limited_logger.go:27","err":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp xx.xx.xx.xx:9095: connect: connection refused\"","level":"error","msg":"pusher failed to consume trace data","ts":"2023-05-16T19:15:16.237465783Z"}

Any suggestions on how to fix it?

Can you follow the "Unable to find traces | Grafana Tempo documentation" and "Distributor refusing spans | Grafana Tempo documentation" guides?

If the issue is still around after following these guides, can you share detailed logs, your Tempo config, and any other relevant config?

Hi @surajsidh, thanks for your response. I had followed the docs, but my error logs don't seem to match any of the cases described there. I am sure my distributor failed to push batches because the connection was refused, not because it hit a rate limit. I have no logs containing TRACE_TOO_LARGE, LIVE_TRACES_EXCEEDED, or RATE_LIMITED.

I do have some logs like this:
{"caller":"rate_limited_logger.go:27","err":"rpc error: code = Unknown desc = Ingester is shutting down","level":"error","msg":"pusher failed to consume trace data","ts":"2023-05-17T13:47:06.786529262Z"}
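To see which failure dominates, the distributor logs can be tallied by cause. The following is only a rough sketch: the substrings it matches are the ones quoted in this thread, while the label names and the helper functions classify and summarize are hypothetical and not part of Tempo.

```python
import json
from collections import Counter

# Substrings copied from the log lines and refusal reasons quoted in this thread.
# The labels on the right are made up for this sketch.
CAUSES = {
    "connection refused": "ingester_unreachable",
    "Ingester is shutting down": "ingester_shutting_down",
    "TRACE_TOO_LARGE": "trace_too_large",
    "LIVE_TRACES_EXCEEDED": "live_traces_exceeded",
    "RATE_LIMITED": "rate_limited",
}


def classify(line):
    """Return a cause label for one JSON log line, or None if it is not a push failure."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None  # skip lines that are not JSON
    if not isinstance(entry, dict):
        return None
    if entry.get("msg") != "pusher failed to consume trace data":
        return None
    err = str(entry.get("err", ""))
    for needle, label in CAUSES.items():
        if needle in err:
            return label
    return "other"


def summarize(lines):
    """Count push failures per cause across an iterable of log lines."""
    return Counter(label for label in map(classify, lines) if label)
```

For example, summarize(open("distributor.log")) on a saved copy of the distributor output (the file name is an assumption) gives a per-cause count, which makes it easy to confirm that the failures come from unreachable or stopping ingesters rather than from limits.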

By the way, my distributor is not using a PVC, but my ingester is. Is that okay?

If the distributor receives spans and tries to forward them to an ingester that is shutting down at that moment, will the spans still be sent to an ingester that is alive?

I set ingester.config.replication_factor to 1. Is that okay?

We recommend a replication factor of 3; you can lose data with a replication factor of 1.

{"caller":"rate_limited_logger.go:27","err":"rpc error: code = Unknown desc = Ingester is shutting down","level":"error","msg":"pusher failed to consume trace data","ts":"2023-05-17T13:47:06.786529262Z"}
This means that your Ingester is shutting down, and it’s possible that your distributor is hitting an ingester that has gone away.
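To illustrate why the replication factor matters when an ingester goes away, here is a small arithmetic sketch. It assumes the usual majority-quorum rule for ring-based writes (a write needs floor(RF / 2) + 1 successful replicas); that rule and the function names are assumptions for illustration, so please verify the behavior against your Tempo version.

```python
def min_success(replication_factor):
    """Replicas that must accept a write, assuming a majority quorum."""
    return replication_factor // 2 + 1


def write_succeeds(replication_factor, healthy_replicas):
    """True if enough of the chosen replicas are reachable to satisfy the quorum."""
    return healthy_replicas >= min_success(replication_factor)


# With a replication factor of 1, the single target ingester must be up.
print(write_succeeds(1, 0))  # False
# With a replication factor of 3, one of the three replicas may be down.
print(write_succeeds(3, 2))  # True
print(write_succeeds(3, 1))  # False
```

Under that assumption, a replication factor of 1 leaves no room for an ingester that is restarting, while a replication factor of 3 tolerates one unavailable replica per write.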

Check out the examples to see different ways of deploying Tempo, and see the "Set up | Grafana Tempo documentation" and "Manage | Grafana Tempo documentation" pages for our deployment recommendations.