I am sending logs to my self-hosted Loki 3.2.x through the OpenTelemetry Collector Helm chart (using the otel/opentelemetry-collector-contrib image) with this config:
config:
  exporters:
    debug:
      verbosity: basic
    otlphttp:
      endpoint: http://xxx.yyy
      tls:
        insecure: true
    otlphttp/loki:
      endpoint: http://loki.xxx.yyy/otlp
      tls:
        insecure: true
  receivers:
    jaeger: null
    zipkin: null
  service:
    extensions:
      - health_check
    pipelines:
      logs:
        receivers:
          - otlp
        exporters:
          - otlphttp/loki
          - debug
        processors:
          - memory_limiter
          - batch
          - resource
      traces:
        receivers:
          - otlp
        exporters:
          - otlphttp
          #- debug
        processors:
          - memory_limiter
          - batch
          - resource
      metrics:
        receivers:
          - otlp
        exporters:
          #- debug
        processors:
          - memory_limiter
          - batch
          - resource
  processors:
    resource:
      attributes:
        - key: k8s.cluster.name
          value: ${environment_short}-aks
          action: upsert
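One way to see exactly which resource and log attributes reach the exporter (and thus end up as Loki labels or structured metadata) is to raise the debug exporter's verbosity. A minimal tweak to the config above:

```yaml
exporters:
  debug:
    # 'detailed' prints the full resource/log attributes per record,
    # which can reveal an attribute with an empty or otherwise
    # non-identifier name before it ever reaches Loki.
    verbosity: detailed
```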
The data seems to arrive in Loki without any obvious issues, but reading it back from my self-hosted Grafana results in this error for the majority of the data:
{
  "results": {
    "A": {
      "error": "failed to parse series labels to categorize labels: 1:2: parse error: unexpected \"=\" in label set, expected identifier or \"}\"",
      "errorSource": "downstream",
      "status": 500
    }
  }
}
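For context, Loki parses series labels with the Prometheus label grammar, so a parse error at position 1:2 (right after the opening `{`) suggests a label set whose very first label name is empty or otherwise not a valid identifier, e.g. `{="value"}`. A minimal sketch of that validity check; the regex is the standard Prometheus label-name pattern, and the example label sets are hypothetical:

```python
import re

# Prometheus/Loki label names must match this pattern. A label set whose
# first name is empty or starts with a digit produces exactly the reported
# error: 'unexpected "=" in label set, expected identifier or "}"'.
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

def invalid_label_names(labels: dict) -> list:
    """Return the label names Loki's series parser would reject."""
    return [name for name in labels if not LABEL_NAME_RE.fullmatch(name)]

# Hypothetical label sets for illustration:
print(invalid_label_names({"service_name": "web"}))  # -> []
print(invalid_label_names({"": "x", "0bad": "y"}))   # -> ['', '0bad']
```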
For some reason, certain logs work as expected, while the majority of my log entries seem to end up with corrupted data/labels. I am not quite sure what is going on.
Sample data from a successful Grafana query against data ingested via OTLP:
Common labels: {"ConnectionId":"0HN9MSQJGRG0I","detected_level":"debug","k8s_cluster_name":"d-aks","k8s_container_name":"testteam1-web","k8s_deployment_name":"testteam1-web","k8s_namespace_name":"testteam1","k8s_node_name":"aks-workerpool2-39622677-vmss000000","k8s_pod_name":"testteam1-web-6445867b4-6hf6p","k8s_replicaset_name":"testteam1-web-6445867b4","service_instance_id":"testteam1.testteam1-web-6445867b4-6hf6p.testteam1-web","service_name":"testteam1-web","service_version":"17-01-2025.981","severity_number":"5","severity_text":"Debug","telemetry_sdk_language":"dotnet","telemetry_sdk_name":"opentelemetry","telemetry_sdk_version":"1.11.0"}
Line limit: "1000 (4 displayed)"
Total bytes processed: "1.28 kB"
2025-01-17 12:05:03.720 Connection id "0HN9MSQJGRG0I" stopped.
2025-01-17 12:05:03.718 Connection id "0HN9MSQJGRG0I" disconnecting.
2025-01-17 12:05:03.715 Connection id "0HN9MSQJGRG0I" sending FIN because: "The Socket transport's send loop completed gracefully."
2025-01-17 12:05:03.714 Connection id "0HN9MSQJGRG0I" received FIN.
^ Just to show that ingestion and querying work just fine for some of the log lines.
When running a query against the "corrupted" data, the result panel first renders briefly, then is stuck loading for a while before it ends with the 500 parse error shown above (screenshots not included here).
I have been successfully using Promtail for Loki ingestion for years without similar issues. I am not sure how to go about debugging this or what the issue might be. The OpenTelemetry Collector does not show any errors or warnings in its debug log output.
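To narrow down which stream carries the bad label set, one option is to query Loki's series API directly and inspect the raw label sets it returns. A small sketch that just builds the request URL (the host and matcher are the hypothetical values from the config and sample data above; the URL can then be fetched with curl or a browser):

```python
from urllib.parse import urlencode

def series_url(base: str, matcher: str) -> str:
    """Build a Loki /loki/api/v1/series URL for a given stream matcher.

    Fetching this URL returns the label sets Loki actually stored, which
    can then be checked against the Prometheus label grammar to find the
    stream that trips the parser.
    """
    return f"{base}/loki/api/v1/series?{urlencode({'match[]': matcher})}"

# Hypothetical host/matcher taken from the config and sample data above:
print(series_url("http://loki.xxx.yyy", '{service_name="testteam1-web"}'))
```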