Error sending logs with loki.resource.labels

I would like to know how to make certain OpenTelemetry resource attributes show up as labels in Grafana Cloud when sending logs via OTLP and configuring the OpenTelemetry Java Agent through the OTEL_* environment variables. I researched how to set the loki.resource.labels resource attribute to achieve this, but could not get it to work.

Our Java service starts with the OpenTelemetry Java Agent version 1.26.0 and (among others) the following environment variables:

  • OTEL_LOGS_EXPORTER=otlp
  • OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
  • OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-eu-west-0.grafana.net/otlp
  • OTEL_EXPORTER_OTLP_HEADERS set to contain the credentials
  • OTEL_RESOURCE_ATTRIBUTES set to service.name=foo-service,service.namespace=staging,service.version=commit-sha,service.instance.id=something-random
    • and a few container.* and k8s.* attributes according to the spec

The logs show up fine in our Grafana Cloud instance and have the default exporter, instance, job and level labels set as expected.

I want to add more labels and follow the documentation in:

I set OTEL_RESOURCE_ATTRIBUTES to service.name=foo-service,[...],loki.resource.labels=service.name%2Cservice.namespace (%2C is the percent-encoded comma; the middle part is shortened to [...] for brevity in this post) and restarted the service, but Grafana Cloud then stopped showing any logs. I get these messages in my service’s console logs:

[otel.javaagent 2023-05-13 06:41:17:069 +0000] [OkHttp https://otlp-gateway-prod-eu-west-0.grafana.net/...] WARN io.opentelemetry.exporter.internal.okhttp.OkHttpExporter - Failed to export logs. Server responded with HTTP status code 400. Error message: Unable to parse response body, HTTP status message:

Setting loki.resource.labels=service.name (single attribute, no special encoding) has the same effect.
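
For reference, here is a minimal sketch of how the percent-encoded OTEL_RESOURCE_ATTRIBUTES value described above can be assembled. The function name build_resource_attributes is hypothetical and only for illustration. It assumes that only the values need encoding, because the variable itself is a comma-separated list of key=value pairs.

```python
from urllib.parse import quote


def build_resource_attributes(attrs):
    """Join key=value pairs with commas, percent-encoding each value.

    Commas separate the pairs, so a comma inside a value (as in
    loki.resource.labels) has to be written as %2C.
    """
    return ",".join(
        f"{key}={quote(str(value), safe='')}" for key, value in attrs.items()
    )


if __name__ == "__main__":
    print(
        build_resource_attributes(
            {
                "service.name": "foo-service",
                "loki.resource.labels": "service.name,service.namespace",
            }
        )
    )
```

The printed string can be exported as OTEL_RESOURCE_ATTRIBUTES. Dots and hyphens are left untouched by quote, so only the comma inside the loki.resource.labels value changes.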

Changing loki.resource.labels to a non-existent resource attribute, for example loki.resource.labels=service_name, allows for the logs to flow to Grafana Cloud again. No additional labels are set, which is expected, since the resource attribute does not actually exist, but I found it noteworthy that a non-existent label does not trigger the HTTP status code 400 error.

Turning on OpenTelemetry Java Agent debug logging (cf. the open-telemetry/opentelemetry-java-instrumentation repository on GitHub) does not reveal further information. In particular, it does not show the body of the response that the OkHttp / OTLP HTTP exporter was unable to parse.

Any help on how to achieve my goal or how to debug this further (e.g. how to see the HTTP response body that cannot be parsed) is appreciated.
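
One possible way to look at the raw response body is to send a single hand-built log record outside the agent and print whatever the server returns. The sketch below is an assumption-laden illustration, not a verified procedure. It assumes the gateway accepts OTLP/HTTP with JSON encoding at the /v1/logs path and uses HTTP basic auth with the instance ID and token. The function names build_logs_payload and post_logs are hypothetical.

```python
import base64
import json
import time
import urllib.error
import urllib.request


def build_logs_payload(resource_attrs, message):
    """Build a minimal OTLP/JSON logs request with one log record."""
    return {
        "resourceLogs": [
            {
                "resource": {
                    "attributes": [
                        {"key": key, "value": {"stringValue": value}}
                        for key, value in resource_attrs.items()
                    ]
                },
                "scopeLogs": [
                    {
                        "logRecords": [
                            {
                                # 64-bit integers are strings in OTLP/JSON
                                "timeUnixNano": str(time.time_ns()),
                                "severityText": "INFO",
                                "body": {"stringValue": message},
                            }
                        ]
                    }
                ],
            }
        ]
    }


def post_logs(endpoint, instance_id, token, payload):
    """POST the payload and return (status, body), also for error responses."""
    credentials = base64.b64encode(f"{instance_id}:{token}".encode()).decode()
    request = urllib.request.Request(
        endpoint.rstrip("/") + "/v1/logs",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode(errors="replace")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the error object still carries the body
        return err.code, err.read().decode(errors="replace")
```

Calling post_logs with the OTLP endpoint, the credentials, and a payload whose resource attributes include loki.resource.labels should return the status code together with any error text the server sends, which the exporter log line above does not show.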

To debug this further, I set up an OpenTelemetry Collector (using the opentelemetry-collector chart from the open-telemetry/opentelemetry-helm-charts repository on GitHub) and enabled debug logs (cf. troubleshooting.md in the open-telemetry/opentelemetry-collector repository on GitHub) with the following values.yml:

mode: "deployment"
config:
  extensions:
    zpages:
      endpoint: localhost:55679 # DEBUG
    basicauth/otlp:
      client_auth:
        username: "${instance_id}"
        password: "${publisher_token}"
  processors:
    resource:
      attributes:
        - action: insert
          key: loki.resource.labels
          value: service.name,service.namespace,service.version,deployment.environment
  exporters:
    logging:
      verbosity: detailed # DEBUG
    otlphttp:
      auth:
        authenticator: basicauth/otlp
      endpoint: "${otlp_url}"
  service:
    telemetry:
      logs:
        level: "debug" #DEBUG
    extensions:
      - zpages # DEBUG
      - health_check
      - memory_ballast
      - basicauth/otlp
    pipelines:
      metrics:
        receivers:
          - otlp
        exporters:
          - logging
          - otlphttp
      traces:
        receivers:
          - otlp
        exporters:
          - logging
          - otlphttp
      logs:
        receivers:
          - otlp
        processors:
          - resource
          - memory_limiter
          - batch
        exporters:
          - logging # DEBUG
          - otlphttp

But that did not provide any additional information:

2023-05-14T13:54:07.659Z	debug	otlphttpexporter@v0.77.0/otlp.go:127	Preparing to make HTTP request	{"kind": "exporter", "data_type": "logs", "name": "otlphttp", "url": "https://otlp-gateway-prod-eu-west-0.grafana.net/otlp/v1/logs"}
2023-05-14T13:54:08.026Z	error	exporterhelper/queued_retry.go:402	Exporting failed. The error is not retryable. Dropping data.	{"kind": "exporter", "data_type": "logs", "name": "otlphttp", "error": "Permanent error: error exporting items, request to https://otlp-gateway-prod-eu-west-0.grafana.net/otlp/v1/logs responded with HTTP Status Code 400", "dropped_items": 1}
go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send
	go.opentelemetry.io/collector/exporter@v0.77.0/exporterhelper/queued_retry.go:402
go.opentelemetry.io/collector/exporter/exporterhelper.(*logsExporterWithObservability).send
	go.opentelemetry.io/collector/exporter@v0.77.0/exporterhelper/logs.go:135
go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1
	go.opentelemetry.io/collector/exporter@v0.77.0/exporterhelper/queued_retry.go:206
go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).StartConsumers.func1
	go.opentelemetry.io/collector/exporter@v0.77.0/exporterhelper/internal/bounded_memory_queue.go:58

However, I found several resources in the context of OpenTelemetry and Loki indicating that Loki labels cannot contain dots (.):

I ended up with this OpenTelemetry Collector values.yml for the Helm Chart:

fullnameOverride: "collector"
mode: "deployment"
config:
  extensions:
    basicauth/otlp:
      client_auth:
        username: "${instance_id}"
        password: "${publisher_token}"
  processors:
    resource:
      attributes:
        - action: insert
          key: service_name
          from_attribute: service.name
        - action: insert
          key: service_namespace
          from_attribute: service.namespace
        - action: insert
          key: service_version
          from_attribute: service.version
        - action: insert
          key: deployment_environment
          from_attribute: deployment.environment
        - action: insert
          key: loki.resource.labels
          value: service_name,service_namespace,service_version,deployment_environment
  exporters:
    otlphttp:
      auth:
        authenticator: basicauth/otlp
      endpoint: "${otlp_url}"
  service:
    extensions:
      - health_check
      - memory_ballast
      - basicauth/otlp
    pipelines:
      metrics:
        receivers:
          - otlp
        exporters:
          - otlphttp
      traces:
        receivers:
          - otlp
        exporters:
          - otlphttp
      logs:
        receivers:
          - otlp
        processors:
          - resource
          - memory_limiter
          - batch
        exporters:
          - otlphttp

Note how the resource processor creates service_name from service.name and how I then use the former in loki.resource.labels.
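
The renaming pattern above can also be generated rather than written by hand. The following is a minimal sketch under the assumption that Loki label names follow the Prometheus pattern [a-zA-Z_][a-zA-Z0-9_]*. The helper names to_label_name and build_resource_actions are hypothetical, and the output mirrors the attributes list of the resource processor shown in the values.yml.

```python
import re


def to_label_name(attribute):
    """Replace every character that is not a letter, digit or underscore."""
    name = re.sub(r"[^a-zA-Z0-9_]", "_", attribute)
    # Assumed rule: a label name must not start with a digit
    if name and name[0].isdigit():
        name = "_" + name
    return name


def build_resource_actions(attributes):
    """Return resource processor actions: one copy per attribute, then the hint."""
    actions = [
        {"action": "insert", "key": to_label_name(attr), "from_attribute": attr}
        for attr in attributes
    ]
    actions.append(
        {
            "action": "insert",
            "key": "loki.resource.labels",
            "value": ",".join(to_label_name(attr) for attr in attributes),
        }
    )
    return actions


if __name__ == "__main__":
    for action in build_resource_actions(
        ["service.name", "service.namespace", "service.version", "deployment.environment"]
    ):
        print(action)
```

Each printed dictionary corresponds to one entry under processors.resource.attributes, with the last one carrying the underscore names in loki.resource.labels.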

The logs now show up in Loki with the correct labels. 🙂
