Hey Team,
I'm new to Tempo and got stuck during installation.
We need help debugging why the “tempo_distributor_ingester_append_failures_total” counter keeps increasing.
We looked through the Tempo ingester and distributor logs, but the only messages we found were like the following:
level=warn ts=2023-07-19T20:15:30.129704983Z caller=tcp_transport.go:252 component="memberlist TCPTransport" msg="failed to read message type" err=EOF remote=10.111.132.31:56234
Roughly 20% of the total appends sent are failing.
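For reference, here is a minimal sketch of how a percentage like this can be derived from two samples of each counter. The name of the total-appends counter (tempo_distributor_ingester_appends_total) is an assumption, and the sample values are illustrative only, not readings from our cluster. Counter resets are not handled.

```python
def failure_ratio(appends_start, appends_end, failures_start, failures_end):
    """Return the fraction of appends that failed between two counter samples.

    appends_*  : samples of the total-appends counter (name assumed:
                 tempo_distributor_ingester_appends_total)
    failures_* : samples of tempo_distributor_ingester_append_failures_total
    """
    sent = appends_end - appends_start
    failed = failures_end - failures_start
    if sent <= 0:
        # No appends in the window (or a counter reset): report no failures.
        return 0.0
    return failed / sent


# Illustrative values only, chosen to show a 20% failure ratio.
ratio = failure_ratio(1000, 6000, 200, 1200)
print(f"{ratio:.0%}")
```

Comparing deltas over a time window rather than the raw counter values keeps the ratio meaningful, since both counters only ever increase.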
I could not find, via Google or in the Grafana docs, what this metric means. How can we dig deeper to figure out what is going wrong?
We are currently using the tempo-distributed Helm chart (version 2.1.1) to deploy Tempo to our EKS cluster.
Logs for distributor and metrics:
We did find something about overrides and tried that too:
Overrides as per /status/config page:
Overrides configured in Helm: