I really like the ability to combine metrics and especially traces + logs in Grafana, so I’m reading up on things and trying out stuff. There is one thing I’m not quite clear about (maybe because I haven’t really used it yet):
Loki is the solution for storing logs, okay. But Jaeger also accepts logs, or rather OpenTracing / OpenTelemetry spans can carry them. This is a source of confusion. I get the gist, so this is not a technical question but rather an organisational one.
Span “logs” (OpenTelemetry seems to call them mostly “events” now, which is a bit clearer) are strictly structured (key/value). So there is no full text per se, though you could put it under a “message” key.
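To make that concrete, here is a rough sketch of how I picture a span event: a name, a timestamp, and a bag of key/value attributes. `SpanEvent` is a hypothetical stand-in I made up for illustration, not the real OpenTracing or OpenTelemetry class, and the attribute names are invented too.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SpanEvent:
    """Hypothetical stand-in for a span event (not a real SDK type)."""
    name: str
    attributes: dict = field(default_factory=dict)
    timestamp_ns: int = field(default_factory=time.time_ns)


# Free text is possible, but only as the value of a key such as "message".
event = SpanEvent(
    name="cache.miss",
    attributes={
        "cache.key": "user:42",
        "message": "falling back to database",
    },
)
print(event.name, event.attributes)
```

Everything is addressed by key here; the "message" entry is just one more attribute rather than the body of the record.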
Logs in the context of Loki are also a bit structured (labels), but there are explicit warnings not to overdo it (meaning: use few dynamic labels, if any, and low cardinality). Loki also stores the full text of the log message.
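For comparison, this is roughly what a Loki entry looks like as far as I understand the JSON push format (`/loki/api/v1/push`): a small, static label set per stream, plus a timestamp and the full log line. The helper `loki_push_payload`, the label names, and the log line are my own assumptions for illustration; nothing is actually sent anywhere.

```python
import json
import time


def loki_push_payload(labels, line, ts_ns=None):
    """Build a request body in the shape of Loki's JSON push format.

    Hypothetical helper: it only builds the dict, it does not send it.
    """
    ts_ns = time.time_ns() if ts_ns is None else ts_ns
    # Timestamps are nanoseconds since the epoch, encoded as strings.
    return {"streams": [{"stream": labels, "values": [[str(ts_ns), line]]}]}


payload = loki_push_payload(
    # Few, low-cardinality labels that identify the stream.
    labels={"app": "checkout", "env": "prod"},
    # High-cardinality details (user id etc.) stay in the line itself.
    line='level=warn user=42 msg="cache miss, falling back to database"',
    ts_ns=1_600_000_000_000_000_000,
)
print(json.dumps(payload, indent=2))
```

The user id lives in the log text, not in a label, which is how I read the warnings about dynamic labels.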
Now let’s say I want to make an app observable (metrics, traces, logs). Metrics are straightforward to reason about (ideally they share some labels with the Loki logs). Traces and spans can be easily added, too. What about logging statements?
What kind of log message should be part of a span and what should go into Loki? Or both? Do we even need to put logs into spans if we can view spans from Jaeger and logs from Loki side-by-side in Grafana?
The shift in wording from span “logs” to span “events” might be a hint to not put full log messages into spans, but I’d like to hear some thoughts / best practices / current implementations.
Links (excuse the plaintext, I can only put two in as a new user):
- OpenTracing: https://opentracing.io/docs/overview/spans/#logs
- OpenTelemetry: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/overview.md#span
- OpenTracing semantic conventions:
- Loki labels:
- Loki best practices: