I’m running Tempo 2.3.1 and Grafana 10.3.1. We’ve got a relatively small environment, so I’m wondering what the minimum requirements are for sizing the various Tempo components.
Our setup is:
2 Grafana, 2 Prometheus, 3 Loki, and Tempo in microservices mode (1 compactor, 1 distributor, 3 ingesters, 1 memcached, 1 metrics-generator, 1 querier, 1 query-frontend).
I’d also appreciate general guidance on whether what we’re seeing makes sense:
- We have roughly 500-1000 spans received (tempo_distributor_spans_received_total)
- Around 150 traces received.
- The OTel Distributor Spans/second shows roughly 600-800 spans/s (tempo_receiver_accepted_spans).
- We’re seeing OOMs quite a bit on our ingesters and on the compactor.
- The Go heap for the compactor and ingesters runs out of memory reasonably frequently (e.g. a sudden increase which then keels over).
Any advice on right-sizing? Also, are there any other metrics we can look at to identify what is causing the sudden memory increase and the follow-on OOM, rather than just taking a punt in the dark?