Documentation clarification: why is the "host" label considered a static one?

genestasebastien · March 25, 2024, 9:07pm

Hi,

I’m deploying Loki according to best practices described in the documentation and I’d like to clarify a point that I didn’t quite understand.

The documentation clearly explains the negative impact of high cardinality and why it is important to use static labels (like application name or environment) rather than dynamic labels (like pod names).

The part that I’d like to clarify is why host is not considered a dynamic label, especially in cloud environments where hosts are provisioned and deprovisioned dynamically (when using Karpenter in AWS, for example).

I don’t understand why it is OK to have a host label with 1000 possible values while a log level label with 5 possible values is a problem.

Even though I understand that host values change less often than level values, why isn’t using metadata the recommendation for host information?

Thanks for the help!

Seb.

tonyswumac · March 26, 2024, 3:40am

As the documentation mentions, it’s not an exact science. You can absolutely use hostname as a label, provided the potential values are “bounded”.

If you have 1000 hosts, it doesn’t really matter whether the hostnames are random: you can still only have 1000 potential values at any given time. The thing to consider is that the set of potential values grows over time, but the cardinality doesn’t really become an issue until you query a longer time frame (for example, if you refresh all 1000 hosts every day, querying a week would cover 7 times as many potential hostname values as querying a day).
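To make that arithmetic concrete, here is a rough Python sketch of the estimate. It is only a back-of-the-envelope model, not anything Loki computes: the function name and parameters are made up for illustration, and it assumes the whole fleet is replaced at once every `host_lifetime_days`.

```python
import math


def hostname_values_in_window(fleet_size: int,
                              host_lifetime_days: float,
                              window_days: float) -> int:
    """Rough upper bound on distinct hostname label values in a query window.

    Hypothetical model: the whole fleet is replaced every
    `host_lifetime_days`, and every replacement gets a new hostname.
    """
    # Number of fleet "generations" the query window overlaps (at least one).
    generations = max(1, math.ceil(window_days / host_lifetime_days))
    return fleet_size * generations


# 1000 hosts, all refreshed daily: compare a 1-day query with a 7-day query.
one_day = hostname_values_in_window(1000, host_lifetime_days=1, window_days=1)
one_week = hostname_values_in_window(1000, host_lifetime_days=1, window_days=7)
print(one_day, one_week, one_week // one_day)  # 1000 7000 7
```

If the hosts lived for 30 days instead, the same 7-day query would still only touch about 1000 values, which is why how often hosts are replaced matters more than whether the names look random.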

The label documentation is what I’d consider a guideline. It’s supposed to convey the idea that having fewer potential values for a label is good, but whether a label is worth having is a balance you should weigh for yourself and your environment.

genestasebastien · March 26, 2024, 8:52pm

Thanks for the clarification!