Apologies for posting in the Grafana parent topic, but it wasn’t clear that any of the others were applicable.
I have an application manually instrumented with OpenTelemetry. I’ve been using SigNoz to collect and display traces during development, but we’ve found that SigNoz is just too operationally scary to run in production. I’ve been familiar with Grafana for a few years displaying Prometheus metrics, but I’m trying to determine if the OSS version of Grafana is capable of displaying traces from OTLP. The documentation is ambiguous about it. We’d like to self-host Grafana OSS in production, but it’s not going to be suitable for us if it can’t display traces.
No, because OTLP is only a protocol and not a storage. The question to ask is whether Grafana can display traces from your selected trace storage, and the answer depends on which trace storage you select. I would say that you will get the best Grafana experience if you choose another Grafana Labs product as your trace storage: Tempo.
Tempo on-prem might also be too operationally scary for you to run in production. In that case you can choose a SaaS solution, e.g. Tempo in Grafana Cloud, AWS X-Ray, Lightstep, Dynatrace, … Of course, Jaeger and Zipkin are also supported.
Feel free to choose your favorite Grafana supported trace storage.
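To make the protocol vs. storage split concrete: an application that is already instrumented with the OpenTelemetry SDK usually only needs its OTLP exporter pointed at whatever accepts OTLP in front of the chosen storage (Tempo itself, or a collector). Here is a minimal sketch that builds the standard SDK environment variables. The helper name otlp_env and the hostname tempo.internal are made up for illustration, and the ports are the OTLP defaults, so check them against your own receiver config.

```python
# Default OTLP ports: 4317 for gRPC, 4318 for HTTP.
OTLP_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def otlp_env(host: str, protocol: str = "grpc") -> dict:
    """Build the standard OpenTelemetry SDK env vars for an OTLP target.

    `host` is whatever receives OTLP for your trace storage; this helper
    is hypothetical and not part of any Grafana or OpenTelemetry API.
    """
    if protocol not in OTLP_PORTS:
        raise ValueError(f"unsupported OTLP protocol: {protocol}")
    return {
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://{host}:{OTLP_PORTS[protocol]}",
        "OTEL_EXPORTER_OTLP_PROTOCOL": protocol,
    }


if __name__ == "__main__":
    # "tempo.internal" is an assumed hostname, not a real default.
    for name, value in otlp_env("tempo.internal").items():
        print(f"{name}={value}")
```

The application code stays the same whichever storage you pick. Only the endpoint changes, and Grafana then reads from the storage through the matching data source.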
No, because OTLP is only a protocol and not a storage
I’m aware, I worded it a bit poorly. I didn’t know if Grafana had something equivalent to the otel collector in SigNoz.
Thanks for the trace storage suggestions! I’m glad to know that Grafana can at least display traces if I can get them into it somehow.
I would highly recommend doing a POC with your selected trace storage first, using your most complex use case. OTLP is a nice standard for trace ingestion. Unfortunately, there is no standard for querying. Each trace storage offers different querying features (e.g. metric generation, filtering by attribute, …), which may not be the right fit for your needs.
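As a rough illustration of the querying point, here is a sketch of how the same question ("find traces for one service") turns into different HTTP requests against two backends. It assumes Tempo's search endpoint with a TraceQL query and Zipkin's v2 API. The function name, hostnames and ports are illustrative assumptions, so verify the endpoints against the version you actually run.

```python
from urllib.parse import urlencode


def search_by_service_url(backend: str, base_url: str, service: str) -> str:
    """Build a 'traces for this service' search URL for one backend.

    Hypothetical helper: the paths below are assumptions based on the
    Tempo and Zipkin HTTP APIs, not something Grafana provides.
    """
    if backend == "tempo":
        # Tempo search takes a TraceQL expression in the `q` parameter.
        path = "/api/search"
        query = {"q": f'{{ resource.service.name = "{service}" }}'}
    elif backend == "zipkin":
        # Zipkin v2 filters with plain query parameters instead.
        path = "/api/v2/traces"
        query = {"serviceName": service}
    else:
        raise ValueError(f"no query mapping for backend: {backend}")
    return f"{base_url.rstrip('/')}{path}?{urlencode(query)}"


if __name__ == "__main__":
    # Hostnames and ports are assumed for the example.
    print(search_by_service_url("tempo", "http://tempo:3200", "checkout"))
    print(search_by_service_url("zipkin", "http://zipkin:9411", "checkout"))
```

Ingestion would be identical OTLP in both cases, but the query side shares nothing, which is why the POC should exercise the queries you actually need.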
We’re almost certainly going with Grafana Tempo. I’m currently in the process of setting up S3 storage to stress test it.
If you try running Mimir, Tempo, Loki and the Grafana UI along with the agent, you will find it much more complex to run than SigNoz in any production workload.