I set up OTLP ingestion from the OTel Collector to Loki (v3.3.1), but Loki rejects the requests with HTTP 400. The collector logs:
2025-01-23T06:55:10.535Z error internal/queue_sender.go:92 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "logs", "name": "otlphttp", "error": "not retryable error: Permanent error: rpc error: code = InvalidArgument desc = error exporting items, request to http://loki-v3-distributor.opentelemetry.svc.cluster.local:3100/otlp/v1/logs responded with HTTP Status Code 400", "dropped_items": 180}
OTel Collector configuration:
exporters:
  otlp:
    endpoint: ${tempo_endpoint}
    tls:
      insecure: true
  otlphttp:
    endpoint: "http://${loki_endpoint}/otlp"
    tls:
      insecure: true
...
service:
  pipelines:
    logs:
      receivers:
        - otlp
        - tcplog
      processors:
        - memory_limiter
        - batch
        - transform
        - filter
        - attributes
        - resource
      exporters:
        - otlphttp
        - debug
I added this to the Loki config:
limits_config:
  allow_structured_metadata: true
Generally, it looks good. Check the Loki logs, and check the debug exporter output for anomalies (e.g. too many attributes, too long a body, …).
Check the collector metrics as well (see "OpenTelemetry Collector | Grafana Labs").
Maybe some requests succeed and some do not, which would mean a single log record is able to poison the whole batch.
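To make the debug exporter output detailed enough for that check, its verbosity can be raised. This is a minimal sketch, assuming the debug exporter is otherwise left at its defaults (that part of the original config is elided):

```yaml
exporters:
  debug:
    # "detailed" prints every log record with its resource and log attributes,
    # which makes oversized or attribute-less records easier to spot.
    verbosity: detailed
```

This output is verbose, so it is probably best enabled only while troubleshooting.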
I found these logs in the distributor:
{"bodySize":"1.3 MB","caller":"push.go:162","contentEncoding":"gzip","contentType":"application/x-protobuf","entries":8102,"entriesSize":"13 MB","level":"debug","mostRecentLagMs":1964,"msg":"push request parsed","org_id":"fake","path":"/otlp/v1/logs","streamLabelsSize":"0 B","streams":1,"structuredMetadataSize":"12 MB","totalSize":"13 MB","ts":"2025-01-24T10:22:24.655547912Z"}
{"caller":"manager.go:50","component":"distributor","details":"error at least one label pair is required per stream","level":"error","msg":"write operation failed","org_id":"fake","path":"write","ts":"2025-01-24T10:22:24.655640125Z"}
{"caller":"logging.go:124","level":"debug","msg":"POST /otlp/v1/logs (400) 167.356149ms","orgID":"fake","ts":"2025-01-24T10:22:24.659037822Z"}
Do you have any idea how to fix it?
Yes, define a resource attribute, e.g. service.name, for each log, for example with the resource processor. Each log will then carry that attribute/label, so Loki's requirement of at least one label pair per stream will be satisfied.
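A minimal sketch of such a resource processor, assuming the value my-service is only a placeholder and that Loki's default OTLP settings promote service.name to a stream label (the "streamLabelsSize":"0 B" in the distributor log suggests no resource attribute was being promoted before):

```yaml
processors:
  resource:
    attributes:
      # "insert" adds the attribute only when it is missing,
      # so logs that already carry service.name keep their own value.
      - key: service.name
        value: my-service   # hypothetical placeholder, pick your own name
        action: insert
```

The resource processor is already listed in the logs pipeline above, so only its attributes section would need this entry.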