So, when to use structured metadata and when to use labels?

What is the difference between structured metadata and labels? If a level label is intuitive and its cardinality is low, should it be a label or structured metadata? I am confused about this:

From early on, we have set a label dynamically using Promtail pipelines for level. This seemed intuitive for us as we often wanted to only show logs for level="error"; however, we are re-evaluating this now, as writing a query {app="loki"} |= "level=error" is proving to be just as fast for many of our applications as {app="loki",level="error"}.

This may seem surprising, but if applications have medium to low volume, that label causes one application’s logs to be split into up to five streams, which means 5x chunks being stored. And loading chunks has an overhead associated with it. Imagine now if that query were {app="loki",level!="debug"}. That would have to load way more chunks than {app="loki"} != "level=debug".
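To make the quoted reasoning concrete, here is a toy model in Python of how a level label multiplies streams. It is only a sketch: it assumes that a stream is identified by its unique set of label values and that each stream gets its own chunks, and it ignores chunk size and time limits. The helper name group_into_streams and the sample data are made up for illustration.

```python
from collections import defaultdict

# Hypothetical set of levels; the quoted docs only say "up to five streams".
LEVELS = ["debug", "info", "warn", "error", "critical"]


def group_into_streams(entries, label_keys):
    """Group log entries into streams keyed by the chosen label values."""
    streams = defaultdict(list)
    for fields, line in entries:
        stream_id = tuple(sorted((key, fields[key]) for key in label_keys))
        streams[stream_id].append(line)
    return streams


# Twenty sample lines from one low-volume application.
entries = [
    ({"app": "loki", "level": level}, f"level={level} msg=example-{i}")
    for i, level in enumerate(LEVELS * 4)
]

# level kept in the log line only: one stream, so one set of chunks.
by_app = group_into_streams(entries, ["app"])

# level promoted to a label: one stream (and set of chunks) per level.
by_app_level = group_into_streams(entries, ["app", "level"])

# A selector like {app="loki",level!="debug"} has to open every non-debug stream.
non_debug = [sid for sid in by_app_level if dict(sid)["level"] != "debug"]

print(len(by_app), len(by_app_level), len(non_debug))  # prints: 1 5 4
```

In this model the same twenty lines live in one stream without the label and in five with it, and the level!="debug" selector touches four streams where the line filter version touches one.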

Can anyone write an example? I am really confused by the documentation. Is it a good idea to move all labels to structured metadata?

I don’t think there is any concrete rule, but everything you’ve said made sense, so I’d say use your judgement.

Personally, I try to use labels for information that can be used to separate log streams. For example:

aws_account_id
region
instance_id
app
service

And then use structured metadata when it makes sense for information such as geoip data.
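Since an example was asked for, here is a rough Python sketch of what that split could look like when pushing a single entry. It assumes the JSON body accepted by Loki's HTTP push endpoint (/loki/api/v1/push), where each value is [timestamp in nanoseconds, line, optional structured metadata object]; please check this against the API docs for your Loki version. The function name, the region value, and the geoip_country field are invented for illustration, and nothing is actually sent.

```python
import json
import time


def build_push_payload(labels, line, structured_metadata, ts_ns=None):
    """Build a push body with stream labels plus per-entry structured metadata.

    Assumption: the third element of each value is the structured metadata
    object, with string keys and string values.
    """
    if ts_ns is None:
        ts_ns = time.time_ns()
    return {
        "streams": [
            {
                # Labels: low-cardinality values that identify the stream.
                "stream": labels,
                # Structured metadata travels with the entry, not the stream.
                "values": [[str(ts_ns), line, structured_metadata]],
            }
        ]
    }


payload = build_push_payload(
    labels={"app": "loki", "region": "eu-west-1"},
    line='level=error msg="upstream timeout"',
    structured_metadata={"level": "error", "geoip_country": "DE"},
    ts_ns=1700000000000000000,  # fixed timestamp so the output is repeatable
)

print(json.dumps(payload, indent=2))
```

As far as I understand it, an entry stored this way still belongs to the {app="loki"} stream, and the metadata can be filtered after the stream selector, e.g. {app="loki"} | level="error", without creating one stream per level.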

I’ve also just spent about an hour reading various docs on this, yet I still feel like I have an incomplete picture of the situation. It sounds as if using structured metadata is preferred over labels, which is fine, though the article about it spends all its time on high-cardinality issues and config/query setup, without explaining how it actually works. Mind you, level is NOT a high-cardinality field in any typical logging library.

I’m happy to structure my Alloy config to use it instead of labels, however I now end up with detected_level presented in Grafana instead of level, as I had before.

Again, I’m happy to adapt to the very real constraints of scaling logs performantly, but I think this topic could be much better documented, and the end-user field name should have some configurability, perhaps as a Grafana-side concern?

If you are looking for best practices on labels, read this: Label best practices | Grafana Loki documentation