Latency monitors - getting started

Greetings.

Goal: advice and direction on getting started with application latency monitors.

Environment: Grafana 10.4.1 running in OpenShift (4.1x).

Current initiative: my company is beginning to migrate production apps from VM-based containers into OpenShift (Kubernetes).

About me: monitoring guy. Experienced SolarWinds admin. New to Prometheus / PromQL / Grafana.

Experience: I’ve made some dashboards with time series displays for Pod info / restarts, storage information & usage, and some network-centric I/O visualizations using container_network_receive_bytes_total and container_network_transmit_bytes_total.

What I’d like to do: build application latency visualizations, which I think require the use of histograms. Right now histograms are a black box to me - totally new.

I don’t mind doing some reading and self-study.

Any advice from the community would be appreciated - thanks!

I would use Beyla (grafana/beyla on GitHub), which provides eBPF-based auto-instrumentation of web applications and network metrics.
It generates a trace for each request. Store the traces in Tempo and you can inspect latency and error rate per pod or per service.
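
On the histogram question: a classic Prometheus histogram is just a set of cumulative counters, one per `le` ("less than or equal") bucket, and PromQL's `histogram_quantile()` estimates a percentile by linear interpolation inside the bucket where the target rank falls. A typical panel query looks something like `histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))`, where the metric name is only an assumption about what your apps expose. Below is a rough Python sketch of that interpolation, not Prometheus's actual code; the function name and the bucket values are made up for illustration, and it assumes positive latencies.

```python
import math

def estimate_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Hypothetical helper mirroring the idea behind histogram_quantile()
    for classic histograms; `buckets` must include the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)          # order by upper bound, +Inf last
    total = buckets[-1][1]             # +Inf bucket holds the total count
    if total == 0:
        return math.nan
    rank = q * total                   # how many observations lie at or below the quantile
    # First bucket whose cumulative count reaches the rank.
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[idx]
    if math.isinf(upper):
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    # Lower edge is the previous bucket's bound (or 0 for the first bucket).
    lower, prev = buckets[idx - 1] if idx > 0 else (0.0, 0)
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - prev) / (count - prev)

# Made-up data: 100 requests, bounds in seconds.
example_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p75 = estimate_quantile(0.75, example_buckets)   # roughly 0.35 s
```

The takeaway is that the result is only as precise as the bucket boundaries: the 75th request here lands somewhere between 0.1 s and 0.5 s, and the estimate just assumes an even spread within that bucket.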