Hi, I deployed Tempo with S3 and also set up monitoring for it. But when I query traces and click on one, sometimes I get its spans and sometimes I don't. The error that shows up is:
Hi @surajsidh, thanks for your response. I followed the docs, but my error logs don't seem to match any of the cases described there. I am sure my distributor failed to push batches, but the cause is connection refused, not a rate limit. I have no logs containing “TRACE_TOO_LARGE”, “LIVE_TRACES_EXCEEDED”, or “RATE_LIMITED”.
I do have some logs like this: {"caller":"rate_limited_logger.go:27","err":"rpc error: code = Unknown desc = Ingester is shutting down","level":"error","msg":"pusher failed to consume trace data","ts":"2023-05-17T13:47:06.786529262Z"}
By the way, my distributor is not using a PVC, but my ingester is. Is that okay?
If the distributor receives spans and tries to forward them to an ingester that is shutting down at that moment, will the spans still be sent to an ingester that is alive?
We recommend a replication factor of 3; with a replication factor of 1 you can lose data.
{"caller":"rate_limited_logger.go:27","err":"rpc error: code = Unknown desc = Ingester is shutting down","level":"error","msg":"pusher failed to consume trace data","ts":"2023-05-17T13:47:06.786529262Z"}
This means that your Ingester is shutting down, and it’s possible that your distributor is hitting an ingester that has gone away.
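To make the replication factor advice concrete, here is a rough sketch of why one ingester going away matters more at a replication factor of 1 than at 3. It assumes a simple majority-quorum rule (a write needs acknowledgements from more than half of the replicas), which is how Cortex-style rings are commonly described; Tempo's exact behavior may differ by version. The function names `quorum` and `push_succeeds` are hypothetical and not part of Tempo.

```python
def quorum(replication_factor: int) -> int:
    """Minimum number of replica acknowledgements under a majority rule (assumption)."""
    return replication_factor // 2 + 1


def push_succeeds(replication_factor: int, healthy_replicas: int) -> bool:
    """Return True if enough replicas are alive to reach quorum for a write."""
    return healthy_replicas >= quorum(replication_factor)


if __name__ == "__main__":
    # Simulate exactly one ingester shutting down for each replication factor.
    for rf in (1, 3):
        healthy = rf - 1
        outcome = "accepted" if push_succeeds(rf, healthy) else "rejected"
        print(f"replication_factor={rf}, healthy={healthy}: write {outcome}")
```

Under that assumption, a replication factor of 3 tolerates one ingester shutting down, while a replication factor of 1 tolerates none, which would match spans going missing intermittently.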