Loki labels parsing with Grafana Cloud and native OTel endpoint

Hi all,

I’m trying to send logs to Loki using the native OTel endpoint on Grafana Cloud through OTelcol-contrib.

The Grafana Cloud documentation states that some well-known Resource fields should be automatically converted into labels. In my case I started with service.name. Despite setting it correctly (see below), all log entries get a label service_name set to unknown_service and a metadata entry service_name_extracted with the value that I set in the collector. Additional Resource fields (e.g. deployment.environment) that are supposed to be converted into labels are also ignored (or rather: converted into metadata entries).

In this section I’m reading the logs from suricata (but it could be any app, obviously) directly on the instance, parsing the lines, and adding some of the well-documented Resource fields:

receivers:
  filelog/suricata:
    include: [ /var/log/suricata/fast.log ]
    operators:
      - type: regex_parser
        regex: '^(?P<timestamp>\d{2}/\d{2}/\d{4}-\d{2}:\d{2}:\d{2}\.\d{6})  \[\*\*\] (?P<message>.*) \[\*\*\] \[Classification: (?P<classification>.*)\] \[Priority: (?P<priority>\d*)\] {(?<network_transport>[[:upper:]]*)} (?<source_address>.*):(?<source_port>.*) -> (?<destination_address>.*):(?<destination_port>.*)$'
        timestamp:
          parse_from: attributes.timestamp
          layout_type: strptime
          layout: '%m/%d/%Y-%T.%f'
        severity:
          parse_from: attributes.priority
          overwrite_text: true
          preset: none
          mapping:
            error: 1
            warn: 2
            info: 3
            debug:
              - min: 4
                max: 10
      - type: add
        field: resource.service.name
        value: suricata
      - type: add
        field: resource.service.namespace
        value: none
      - type: add
        field: resource.deployment.environment
        value: test

This generates the data structure I expected (output from the debug exporter in the OTel Collector):

Nov 04 16:24:28 HostName otelcol-contrib[40612]: Resource SchemaURL: https://opentelemetry.io/schemas/1.6.1
Nov 04 16:24:28 HostName otelcol-contrib[40612]: Resource attributes:
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> service: Map({"name":"suricata","namespace":"none"})
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> deployment: Map({"environment":"test"})
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> host.name: Str(HostName)
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> os.type: Str(linux)
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> host.id: Str(2c57138c02ea4cdca12de4c3a5974802)
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> host.arch: Str(arm64)
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> host.ip: Slice(["00.00.00.00","0000:0000:0000:0000::0000","0000::0000:0000:0000:0000"])
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> os.description: Str(AlmaLinux 9.4 (Seafoam Ocelot) (Linux HostName 6.6.31-20240529.v8.2.el9 #1 SMP PREEMPT Mon Jun 24 08:28:31 EDT 2024 aarch64))
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> host.cpu.model.name: Str(Cortex-A72)
Nov 04 16:24:28 HostName otelcol-contrib[40612]: ScopeLogs #0
Nov 04 16:24:28 HostName otelcol-contrib[40612]: ScopeLogs SchemaURL:
Nov 04 16:24:28 HostName otelcol-contrib[40612]: InstrumentationScope
Nov 04 16:24:28 HostName otelcol-contrib[40612]: LogRecord #0
Nov 04 16:24:28 HostName otelcol-contrib[40612]: ObservedTimestamp: 2024-11-04 16:24:05.720851364 +0000 UTC
Nov 04 16:24:28 HostName otelcol-contrib[40612]: Timestamp: 2024-11-04 16:24:05.565465 +0000 UTC
Nov 04 16:24:28 HostName otelcol-contrib[40612]: SeverityText: WARN
Nov 04 16:24:28 HostName otelcol-contrib[40612]: SeverityNumber: Warn(13)
Nov 04 16:24:28 HostName otelcol-contrib[40612]: Body: Str([1:2100498:7] GPL ATTACK_RESPONSE id check returned root)
Nov 04 16:24:28 HostName otelcol-contrib[40612]: Attributes:
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> classification: Str(Potentially Bad Traffic)
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> source: Map({"address":"108.157.4.26","port":"80"})
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> destination: Map({"address":"00.00.00.00","port":"00000"})
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> log.file.name: Str(fast.log)
Nov 04 16:24:28 HostName otelcol-contrib[40612]:      -> network: Map({"transport":"TCP"})
Nov 04 16:24:28 HostName otelcol-contrib[40612]: Trace ID:
Nov 04 16:24:28 HostName otelcol-contrib[40612]: Span ID:
Nov 04 16:24:28 HostName otelcol-contrib[40612]: Flags: 0
Nov 04 16:24:28 HostName otelcol-contrib[40612]:         {"kind": "exporter", "data_type": "logs", "name": "debug"}

But the only service_name label value available in Grafana when exploring the Loki data is still unknown_service, even though all the data is actually present on each log entry as metadata.

I also tried using Attributes instead of Resource, just in case, but it did not help.

Has anybody encountered the same issue? Am I missing some crucial setting somewhere?

Many thanks in advance to anybody with ideas on how to solve it :slight_smile:


I have the same problem.


Your config does not create an attribute with the key service.name. It creates an attribute service whose value is a map, which is a more complex data type.

In a field expression, . is a delimiter that lets you traverse into nested keys of the attributes, resource, or body on the entry.

Try changing the add operator:

      - type: add
        field: resource["service.name"]
        value: suricata

It works for me. I use the tcplog receiver with an add operator, and I also didn’t read the docs at first :smiley:


@jangaraj you are right, using a resource attribute with the literal key "service.name" does it (and it also works for a few other fields like "service.namespace" and "deployment.environment").
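For anyone landing here later, this is roughly what the three add operators from my original config look like with the bracket syntax. It is only a sketch of the relevant part of the receiver; the regex_parser operator is unchanged and left out for brevity.

```yaml
receivers:
  filelog/suricata:
    include: [ /var/log/suricata/fast.log ]
    operators:
      # The regex_parser operator from the original config goes here, unchanged.
      # Bracket syntax creates a flat key "service.name" on the resource
      # instead of a nested map service -> name.
      - type: add
        field: resource["service.name"]
        value: suricata
      - type: add
        field: resource["service.namespace"]
        value: none
      - type: add
        field: resource["deployment.environment"]
        value: test
```

With this, the debug exporter should list service.name, service.namespace and deployment.environment as separate string attributes rather than as service and deployment maps.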

That being said, the OTel specifications describe service as its own type (rather than a bunch of "service.something" strings) so I would consider this either a bug in the Grafana Cloud OTLP receiver endpoint or an intentional workaround for a bug somewhere else (say e.g. the k8s processor has a bug causing the service information to be added as a bunch of independent strings rather than a map, and the endpoint was designed to work around it).

I’ll post here again if I find something out or end up opening a bug report somewhere :slight_smile:


IMHO the Grafana Cloud implementation is correct. From the operator docs:

If a field contains a dot in it, a field can alternatively use bracket syntax for traversing through a map. For example, to select the key k8s.cluster.name on the entry’s body, you can use the field body["k8s.cluster.name"].
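The same rule applies to log record attributes, not only to the resource. As a hedged sketch: assuming the parsed fields are renamed with move operators after the regex_parser (the original post does not show that step, so the operators below are an assumption), bracket syntax on the target keeps keys such as source.address flat instead of producing a source map.

```yaml
operators:
  # Assumption: these move operators run after the regex_parser, which
  # produced attributes named source_address and source_port.
  # attributes.source.address would create a nested map source -> address;
  # attributes["source.address"] creates a single flat key instead.
  - type: move
    from: attributes.source_address
    to: attributes["source.address"]
  - type: move
    from: attributes.source_port
    to: attributes["source.port"]
```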