Hi,
I really like the ability to combine metrics and especially traces + logs in Grafana, so I’m reading up on things and trying out stuff. There is one thing I’m not quite clear about (maybe because I haven’t really used it yet):
Loki is the solution for storing logs, okay. But Jaeger also accepts logs, or rather OpenTracing / OpenTelemetry does, which is a source of confusion. I get the gist, so this is not a technical, but rather an organisational question.
Span “logs” (OpenTelemetry seems to mostly call them “events” now, which is a bit clearer) are strictly structured (key/value). So no free-form text per se, though you could put it under a “message” key.
Logs in the context of Loki are also a bit structured (labels), but there are explicit warnings not to overdo it (meaning: use few dynamic labels, if any, and low cardinality). Loki also stores the full text of the log message.
Now let’s say I want to make an app observable (metrics, traces, logs). Metrics are readily reasoned about (best to share some labels with Loki logs). Traces and Spans can be easily added, too. What about logging statements?
What kind of log message should be part of a span and what should go into Loki? Or both? Do we even need to put logs into spans if we can view spans from Jaeger and logs from Loki side-by-side in Grafana?
The shift in wording from span “logs” to span “events” might be a hint to not put full log messages into spans, but I’d like to hear some thoughts / best practices / current implementations.
Thanks!
Links (excuse the plaintext, I can only put two in as a new user):
Any luck on your end? I am trying it out on an experimental cluster myself. From what I can tell, I think I need to create another service for capturing OpenTelemetry logs, even though Jaeger does accept them.
There is also the OpenTelemetry Collector, which shows promise. I am wondering: if I have that, do we even need Jaeger?
But in case I want Jaeger to do the tracing and still have Loki handle the logs, I was wondering if
--reporter.grpc.host-port
could point to the collector, or even Loki (I highly doubt the Loki one).
I think the main problem here is understanding the concept of different signals: traces vs. logs. They are different.
Yes, trace spans also have a logs field, but I would say a better name would be “messages”, for better understanding.
Standard trace storages (e.g. Tempo) don’t have features for working with these logs. Let’s say you write your server access log into trace span logs: there is no easy way to extract data from there and create graphs.
But if you write logs into a log storage (e.g. Loki), then you can parse, extract, calculate, and create metrics from those logs.
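For example, once access logs are in Loki, a LogQL query can turn them into a graphable metric. A sketch, where the `{app="shop"}` selector and the JSON `status` field are assumptions about your log format:

```logql
# Requests per second by HTTP status, parsed from JSON access logs
# ({app="shop"} and the "status" field are hypothetical)
sum by (status) (
  rate({app="shop"} | json | __error__="" [5m])
)
```

Nothing comparable exists for data buried inside span logs/events in a trace store.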
It’s more that OpenTelemetry has this notion of sending logs as part of its automatic instrumentation. So rather than having something like Promtail, which captures logs as strings that require parsing, I am hoping (but am not sure) that OpenTelemetry would send the logs in a structured format.
So far in the opentelemetry-collector I have a configuration like
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [loki]
    traces:
      receivers: [otlp]
      processors: [filter/grafana_noise, filter/drop_actuator, batch]
      exporters: [otlp, spanmetrics]
    metrics:
      receivers: [otlp, spanmetrics]
      processors: [batch]
      exporters: [prometheusremotewrite]
But I don’t think the logs are working correctly.
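One thing worth checking: a `service.pipelines` section only takes effect if every receiver/exporter it names is also defined at the top level. A minimal sketch of those sections (all hostnames and ports are assumptions for your cluster):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}

exporters:
  loki:
    # must point at Loki's push API, not just the host
    endpoint: http://loki:3100/loki/api/v1/push
  prometheusremotewrite:
    endpoint: http://prometheus:9090/api/v1/write
```

A missing or wrong `endpoint` on the loki exporter is a common reason logs silently never arrive.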
One kind of visualization I would like is to see the logs and, presuming the trace ID is associated with the log entry, simply click on the log entry to bring up the trace.
Conversely, given a trace ID, I want to get all the logs recorded for that trace, along with the trace visualization.
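That click-through in both directions is configured on the Grafana data sources rather than in the collector pipeline. A provisioning sketch, where the UIDs, URLs, and the `trace_id=` log format are assumptions:

```yaml
apiVersion: 1
datasources:
  - name: Tempo
    type: tempo
    uid: tempo                    # hypothetical UID
    url: http://tempo:3200
    jsonData:
      tracesToLogsV2:             # span -> logs jump
        datasourceUid: loki
        filterByTraceID: true
  - name: Loki
    type: loki
    uid: loki
    url: http://loki:3100
    jsonData:
      derivedFields:              # log line -> trace jump
        - name: TraceID
          matcherRegex: "trace_id=(\\w+)"
          datasourceUid: tempo
          url: "$${__value.raw}"  # $$ escapes $ in provisioning files
```

With both halves configured, Grafana shows a link on log lines that match the regex, and a “logs for this span” button on traces.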
I would say that the implementation of the log signal in OpenTelemetry is early alpha. I wouldn’t use OTLP for logs for now.
But the OTEL Collector also gives you the opportunity to process non-OTLP logs; e.g. Promtail can send to the Loki receiver.
I have a demo where I process all signals (metrics, logs, traces) from Grafana:
Grafana logs are generated in syslog format, so a syslog receiver (actually a tcplog receiver, because Grafana doesn’t follow any syslog standard) processes them and sends them to Loki via the Loki exporter. Correlation is also configured, so the user can switch between traces/logs in the Grafana UI (Loki/Tempo).
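A sketch of that receiving side in collector config (the listen port and Loki URL are assumptions):

```yaml
receivers:
  tcplog:
    # Grafana's "syslog" network output is pointed at this port;
    # tcplog is used because the format isn't standards-compliant syslog
    listen_address: 0.0.0.0:1514

exporters:
  loki:
    endpoint: http://loki:3100/loki/api/v1/push

service:
  pipelines:
    logs:
      receivers: [tcplog]
      exporters: [loki]
```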
Just tried “Send logs to Loki with Loki receiver | OpenTelemetry documentation” for my experiments. I am using Docker to provide the logs, and unfortunately no luck in receiving them.
Sending from Promtail to Loki works, but sending from the otel-collector does not.
In your project you’re not accepting Loki as a receiver in your OTEL Collector. I also see you linked the trace from the raw log message; I guess there’s no facility to take it from structured_metadata yet.
Correct. I don’t see a reason to introduce another component into the stack (e.g. Promtail and then the Loki receiver) when I can emit logs in a different format.
Nothing is stopping you from enabling the receiver you need. This is just a demo for one particular case (the Grafana app), which of course doesn’t fit all use cases.
I was actually wondering about that in the documentation: why not just send directly to the collector, then from the collector to Loki?
You need to consider the protocol/log format used. You can only connect components that understand each other (that have implemented support for the protocol/log format in question). Grafana is not able to generate logs in Loki format, so you can’t send Grafana logs to the Loki OTEL receiver. But you can put Promtail in the middle. The whole tooling is like Lego: you can stick a brick in the middle to connect some pieces together. But you can end up with a complex setup with many components, so it’s a good idea to keep the minimal possible components in the design.
Yup. I’m starting to see where Promtail fits in, primarily because I don’t ingest logs from syslog but from Docker. Theoretically, I could have a unifying entry point via the OpenTelemetry Collector, but the experiment failed: even though Promtail appears to be ingesting the data, the otel-collector does not pass it down to Loki.
In theory you don’t need any new component. Just use the Loki Docker logging driver and send logs from the Docker daemon directly to Loki. Of course, you need to be aware of the consequences: for example, how the Docker daemon behaves when Loki is down, …
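For reference, that direct path needs the Loki driver plugin installed on each host; the Loki address below is an assumption:

```shell
# Install the driver plugin once per host
docker plugin install grafana/loki-docker-driver:latest \
  --alias loki --grant-all-permissions

# Run a container whose stdout/stderr goes straight to Loki
# (my-app:latest is a hypothetical image)
docker run --log-driver=loki \
  --log-opt loki-url="http://loki:3100/loki/api/v1/push" \
  --log-opt loki-retries=5 \
  my-app:latest
```

The same `log-driver`/`log-opts` can instead go into `/etc/docker/daemon.json` to apply to all containers on the host.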
There’s also a chicken-and-egg problem, since Loki is running in the same swarm.
That’s IMHO not a bug, but a config issue.