Best-practices for enrichment of timeseries data with meta information

Hi,

I am not sure I have fully understood how time series databases like Mimir should be used, or what kind of behaviour I can expect from them, so I will describe a small scenario to illustrate my dilemma.
Let’s say I have a sensor that measures the temperature and writes the measured value into a time series DB like this:
temperature(id=“sensorId”):tempValue
When I open the Grafana Explore mode, I can find the sensor by selecting the label id and providing its identifier as the label value. However, looking up sensor temperature values by identifier does not make for a good user experience, as sensor identifiers are not meant for humans.
Normally, a sensor is embedded in a business context and is therefore related to further information such as sensor name (not unique), sensor type, sensor vendor, business contract, location etc.
Users are more likely to use such attributes when they are looking for the temperature of a sensor.
This, however, as I see it, would require me to encode all these searchable attributes in the time series:
temperature(id=“sensorId” name=“sensorName” location=“city” type=“S9600” vendor=“sensorVendor”):value
The problem with this approach is that if one of the attributes changes, a new attribute is added, or an existing one becomes obsolete, a new time series is created and there is a “break” in the graph.
What would be the best practice for providing meta information about a sensor to the user in Explore mode while keeping the structure of the time series as simple as possible?

Thank you!

I would say a good practice (only you know what is best practice for you) is an info metric.

So you can have “real” metrics:

temperature(id=“sensorId”)
pressure(id=“sensorId”)

and then an “info” metric:

info(id=“sensorId” name=“sensorName” location=“city” type=“S9600” vendor=“sensorVendor”):

You can join these real and info metrics at the PromQL level:
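A minimal sketch of such a join, assuming the info metric is written with a constant value of 1 and that there is exactly one info series per id at any point in time (neither assumption is stated in this thread):

```promql
# Attach the descriptive labels from info to every temperature series
# that shares the same id. Multiplying by 1 leaves the value unchanged.
temperature * on (id) group_left (name, location, type, vendor) info

# Same join, but only for sensors whose info series matches a location.
temperature * on (id) group_left (name, vendor) info{location="city"}
```

Here on (id) restricts matching to the shared id label, and group_left (...) copies the listed labels from info onto the result. The stored temperature series keeps only the id label, so a change to name or location only creates a new info series.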

Cool, thx for the answer!

Some additional questions:

  • How do you decide which labels go into the “info” metric and which into the “real” metrics? Is it based on how often they change or how frequently they are used, for example?
  • Is there any performance impact if we do these joins on “info” metrics?
  • Is it possible to predefine these joins in the datasource configuration, so that users can explore the metrics more easily without needing to be aware of the associated “info” metrics?

  • I would say avoid label duplication, except for the id label.
  • Of course it is slower, because there is additional computation. But you would have to be superhuman to notice a difference of a few ms in the browser.
  • No, that doesn’t make sense unless you are able to enforce certain labels. In the real world everyone uses different labels and conventions, so a predefined join may cause more problems.

“I would say avoid label duplication, except for the id label”

But how do you decide whether a label should go into the “real” or the “info” metric?
Or do you always create an “info” metric by default and put all labels that could change over time there?

temperature{id="sensorId", name="name"}
pressure{id="sensorId", name="name"}

That’s duplication of the name, so it should be:

temperature{id="sensorId"}
pressure{id="sensorId"}
info{id="sensorId", name="name"}

Use OTEL properly (name will be a resource attribute) together with a metric storage that follows the OTEL spec, and info metrics will be created automatically, e.g. Grafana Cloud:
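As a rough sketch of how such an automatically created info metric could be queried: in the OTEL-to-Prometheus mapping, resource attributes are typically exposed on a target_info metric identified by the job and instance labels. The label name sensor_name below is hypothetical (standing in for a resource attribute such as sensor.name), and the exact label names depend on how your storage translates attributes.

```promql
# Copy the (hypothetical) sensor_name resource attribute from target_info
# onto the temperature series of the same job and instance.
temperature * on (job, instance) group_left (sensor_name) target_info
```

This is the same join pattern as with a hand-written info metric, only keyed on job and instance instead of id.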

Now I understand it! Thx a lot!