Thanks @tonyswumac for the suggestion!
You were absolutely right - checking the actual configuration mounted in the pod revealed the issue. Here’s what I discovered:
Problem Diagnosis
Current Observed Behavior
Case 1: With stage.match blocks present
- Helm chart deployment fails with this error:
key " \"duration\" ]\n }\n\n // For logs where the label is empty" has no value (cannot end with ,)
Case 2: Without stage.match blocks
- Chart deploys successfully
- BUT pods fail to start with multiple syntax errors:
Error: /etc/alloy/config.alloy:369:21: missing ',' in field list
Error: /etc/alloy/config.alloy:371:8: expected =, got .
Error: /etc/alloy/config.alloy:371:9: cannot use a block as an expression
[... more syntax errors ...]
Generated ConfigMap Truncation
The generated ConfigMap shows severe truncation of the Alloy configuration. The file abruptly cuts off mid-block:
# What should be a complete stage.json block:
stage.json {
expressions = {
app = ""
app_version = ""
cloud_provider = ""
duration = ""
# ... more fields should be here
}
}
# Instead, it gets truncated to:
stage.json {
expressions = {
app = ""
# Destination: logs-service (loki) # ← Jumps directly to destination config!
otelcol.exporter.loki "logs_service" {
forward_to = [loki.write.logs_service.receiver]
}
Technical Details
Complete "working" (ie generating a not working either configmap) configuration (without stage.match blocks)
// Extract container name from __meta_docker_container_name label and add as label
discovery.relabel "task_analysis" {
targets = discovery.kubernetes.pods.targets
rule {
source_labels = ["__meta_kubernetes_pod_name"]
regex = "engine-task-controller-.*"
action = "keep"
}
// Ensure the "task_analysis" label is added if it doesn't exist
rule {
action = "replace"
target_label = "action"
replacement = "task_analysis"
}
// Ensure the "cluster" label is added if it doesn't exist
rule {
action = "replace"
target_label = "cluster"
replacement = env("CLUSTER_NAME")
}
}
loki.source.kubernetes "engine_task_controller_analysis" {
targets = discovery.relabel.task_analysis.output
forward_to = [loki.process.task_analysis_json_extraction.receiver]
}
loki.process "task_analysis_json_extraction" {
// Parse the JSON first
stage.json {
expressions = {
app = "",
app_version = "",
cloud_provider = "",
duration = "",
graphical = "",
namespace = "",
reconcile_id = "",
state = "",
task_id = "",
region = "",
ressource_type = "",
}
}
stage.labels {
values = {
has_duration = "duration",
namespace = "namespace",
app_version = "app_version",
graphical = "graphical",
state = "state",
circle_app = "app",
cloud_provider = "cloud_provider",
ressource_type = "ressource_type",
}
}
stage.structured_metadata {
values = {
reconcile_id = "reconcile_id",
task_id = "task_id",
duration = "duration",
region = "region",
}
}
stage.label_drop {
values = [ "has_duration" , "duration" ]
}
stage.timestamp {
source = "ts"
format = "RFC3339"
}
forward_to = [loki.write.logs_service.receiver]
}
Full truncated ConfigMap output
apiVersion: v1
data:
config.alloy: |-
// Feature: Node Logs
declare "node_logs" {
argument "logs_destinations" {
comment = "Must be a list of log destinations where collected logs should be forwarded to"
}
loki.relabel "journal" {
// copy all journal labels and make the available to the pipeline stages as labels, there is a label
// keep defined to filter out unwanted labels, these pipeline labels can be set as structured metadata
// as well, the following labels are available:
// - boot_id
// - cap_effective
// - cmdline
// - comm
// - exe
// - gid
// - hostname
// - machine_id
// - pid
// - stream_id
// - systemd_cgroup
// - systemd_invocation_id
// - systemd_slice
// - systemd_unit
// - transport
// - uid
//
// More Info: https://www.freedesktop.org/software/systemd/man/systemd.journal-fields.html
rule {
action = "labelmap"
regex = "__journal__(.+)"
}
rule {
action = "replace"
source_labels = ["__journal__systemd_unit"]
replacement = "$1"
target_label = "unit"
}
// the service_name label will be set automatically in loki if not set, and the unit label
// will not allow service_name to be set automatically.
rule {
action = "replace"
source_labels = ["__journal__systemd_unit"]
replacement = "$1"
target_label = "service_name"
}
forward_to = [] // No forward_to is used in this component, the defined rules are used in the loki.source.journal component
}
loki.source.journal "worker" {
path = "/var/log/journal"
format_as_json = false
max_age = "8h"
relabel_rules = loki.relabel.journal.rules
labels = {
job = "integrations/kubernetes/journal",
instance = sys.env("HOSTNAME"),
}
forward_to = [loki.process.journal_logs.receiver]
}
loki.process "journal_logs" {
stage.static_labels {
values = {
// add a static source label to the logs so they can be differentiated / restricted if necessary
"source" = "journal",
// default level to unknown
level = "unknown",
}
}
// Attempt to determine the log level, most k8s workers are either in logfmt or klog formats
// check to see if the log line matches the klog format (https://github.com/kubernetes/klog)
stage.match {
// unescaped regex: ([IWED][0-9]{4}\s+[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+)
selector = "{level=\"unknown\"} |~ \"([IWED][0-9]{4}\\\\s+[0-9]{2}:[0-9]{2}:[0-9]{2}\\\\.[0-9]+)\""
// extract log level, klog uses a single letter code for the level followed by the month and day i.e. I0119
stage.regex {
expression = "((?P<level>[A-Z])[0-9])"
}
// if the extracted level is I set INFO
stage.replace {
source = "level"
expression = "(I)"
replace = "INFO"
}
// if the extracted level is W set WARN
stage.replace {
source = "level"
expression = "(W)"
replace = "WARN"
}
// if the extracted level is E set ERROR
stage.replace {
source = "level"
expression = "(E)"
replace = "ERROR"
}
// if the extracted level is I set INFO
stage.replace {
source = "level"
expression = "(D)"
replace = "DEBUG"
}
// set the extracted level to be a label
stage.labels {
values = {
level = "",
}
}
}
// if the level is still unknown, do one last attempt at detecting it based on common levels
stage.match {
selector = "{level=\"unknown\"}"
// unescaped regex: (?i)(?:"(?:level|loglevel|levelname|lvl|levelText|SeverityText)":\s*"|\s*(?:level|loglevel|levelText|lvl)="?|\s+\[?)(?P<level>(DEBUG?|DBG|INFO?(RMATION)?|WA?RN(ING)?|ERR(OR)?|CRI?T(ICAL)?|FATAL|FTL|NOTICE|TRACE|TRC|PANIC|PNC|ALERT|EMERGENCY))("|\s+|-|\s*\])
stage.regex {
expression = "(?i)(?:\"(?:level|loglevel|levelname|lvl|levelText|SeverityText)\":\\s*\"|\\s*(?:level|loglevel|levelText|lvl)=\"?|\\s+\\[?)(?P<level>(DEBUG?|DBG|INFO?(RMATION)?|WA?RN(ING)?|ERR(OR)?|CRI?T(ICAL)?|FATAL|FTL|NOTICE|TRACE|TRC|PANIC|PNC|ALERT|EMERGENCY))(\"|\\s+|-|\\s*\\])"
}
// set the extracted level to be a label
stage.labels {
values = {
level = "",
}
}
}
// Only keep the labels that are defined in the `keepLabels` list.
stage.label_keep {
values = ["instance","job","level","name","unit","service_name","source"]
}
forward_to = argument.logs_destinations.value
}
}
node_logs "feature" {
logs_destinations = [
loki.write.logs_service.receiver,
]
}
// Feature: Pod Logs
declare "pod_logs" {
argument "logs_destinations" {
comment = "Must be a list of log destinations where collected logs should be forwarded to"
}
discovery.relabel "filtered_pods" {
targets = discovery.kubernetes.pods.targets
rule {
source_labels = ["__meta_kubernetes_namespace"]
action = "replace"
target_label = "namespace"
}
rule {
source_labels = ["__meta_kubernetes_pod_name"]
action = "replace"
target_label = "pod"
}
rule {
source_labels = ["__meta_kubernetes_pod_container_name"]
action = "replace"
target_label = "container"
}
rule {
source_labels = ["__meta_kubernetes_namespace", "__meta_kubernetes_pod_container_name"]
separator = "/"
action = "replace"
replacement = "$1"
target_label = "job"
}
// set the container runtime as a label
rule {
action = "replace"
source_labels = ["__meta_kubernetes_pod_container_id"]
regex = "^(\\S+):\\/\\/.+$"
replacement = "$1"
target_label = "tmp_container_runtime"
}
// make all labels on the pod available to the pipeline as labels,
// they are omitted before write to loki via stage.label_keep unless explicitly set
rule {
action = "labelmap"
regex = "__meta_kubernetes_pod_label_(.+)"
}
// make all annotations on the pod available to the pipeline as labels,
// they are omitted before write to loki via stage.label_keep unless explicitly set
rule {
action = "labelmap"
regex = "__meta_kubernetes_pod_annotation_(.+)"
}
// explicitly set service_name. if not set, loki will automatically try to populate a default.
// see https://grafana.com/docs/loki/latest/get-started/labels/#default-labels-for-all-users
//
// choose the first value found from the following ordered list:
// - pod.annotation[resource.opentelemetry.io/service.name]
// - pod.label[app.kubernetes.io/name]
// - k8s.pod.name
// - k8s.container.name
rule {
action = "replace"
source_labels = [
"__meta_kubernetes_pod_annotation_resource_opentelemetry_io_service_name",
"__meta_kubernetes_pod_label_app_kubernetes_io_name",
"__meta_kubernetes_pod_container_name",
]
separator = ";"
regex = "^(?:;*)?([^;]+).*$"
replacement = "$1"
target_label = "service_name"
}
// set resource attributes
rule {
action = "labelmap"
regex = "__meta_kubernetes_pod_annotation_resource_opentelemetry_io_(.+)"
}
rule {
source_labels = ["__meta_kubernetes_pod_annotation_k8s_grafana_com_logs_job"]
regex = "(.+)"
target_label = "job"
}
rule {
source_labels = ["__meta_kubernetes_pod_label_app_kubernetes_io_name"]
regex = "(.+)"
target_label = "app_kubernetes_io_name"
}
}
discovery.kubernetes "pods" {
role = "pod"
selectors {
role = "pod"
field = "spec.nodeName=" + sys.env("HOSTNAME")
}
}
discovery.relabel "filtered_pods_with_paths" {
targets = discovery.relabel.filtered_pods.output
rule {
source_labels = ["__meta_kubernetes_pod_uid", "__meta_kubernetes_pod_container_name"]
separator = "/"
action = "replace"
replacement = "/var/log/pods/*$1/*.log"
target_label = "__path__"
}
}
local.file_match "pod_logs" {
path_targets = discovery.relabel.filtered_pods_with_paths.output
}
loki.source.file "pod_logs" {
targets = local.file_match.pod_logs.targets
forward_to = [loki.process.pod_logs.receiver]
}
loki.process "pod_logs" {
stage.match {
selector = "{tmp_container_runtime=~\"containerd|cri-o\"}"
// the cri processing stage extracts the following k/v pairs: log, stream, time, flags
stage.cri {}
// Set the extract flags and stream values as labels
stage.labels {
values = {
flags = "",
stream = "",
}
}
}
stage.match {
selector = "{tmp_container_runtime=\"docker\"}"
// the docker processing stage extracts the following k/v pairs: log, stream, time
stage.docker {}
// Set the extract stream value as a label
stage.labels {
values = {
stream = "",
}
}
}
// Drop the filename label, since it's not really useful in the context of Kubernetes, where we already have cluster,
// namespace, pod, and container labels. Drop any structured metadata. Also drop the temporary
// container runtime label as it is no longer needed.
stage.label_drop {
values = [
"filename",
"tmp_container_runtime",
]
}
stage.structured_metadata {
values = {
"k8s_pod_name" = "k8s_pod_name",
"pod" = "pod",
}
}
// Only keep the labels that are defined in the `keepLabels` list.
stage.label_keep {
values = ["app_kubernetes_io_name","container","instance","job","level","namespace","service_name","service_namespace","deployment_environment","deployment_environment_name","k8s_namespace_name","k8s_deployment_name","k8s_statefulset_name","k8s_daemonset_name","k8s_cronjob_name","k8s_job_name","k8s_node_name"]
}
forward_to = argument.logs_destinations.value
}
}
pod_logs "feature" {
logs_destinations = [
loki.write.logs_service.receiver,
]
}
// Extract container name from __meta_docker_container_name label and add as label
discovery.relabel "task_analysis" {
targets = discovery.kubernetes.pods.targets
rule {
source_labels = ["__meta_kubernetes_pod_name"]
regex = "engine-task-controller-.*"
action = "keep"
}
// Ensure the "task_analysis" label is added if it doesn't exist
rule {
action = "replace"
target_label = "action"
replacement = "task_analysis"
}
// Ensure the "cluster" label is added if it doesn't exist
rule {
action = "replace"
target_label = "cluster"
replacement = env("CLUSTER_NAME")
}
}
loki.source.kubernetes "engine_task_controller_analysis" {
targets = discovery.relabel.task_analysis.output
forward_to = [loki.process.task_analysis_json_extraction.receiver]
}
loki.process "task_analysis_json_extraction" {
// Parse the JSON first
stage.json {
expressions = {
app = ""
// Destination: logs-service (loki)
otelcol.exporter.loki "logs_service" {
forward_to = [loki.write.logs_service.receiver]
}
loki.write "logs_service" {
endpoint {
url = "https://loki-prod.orchestrator.circledental.cloud/loki/api/v1/push"
basic_auth {
username = convert.nonsensitive(remote.kubernetes.secret.logs_service.data["username"])
password = remote.kubernetes.secret.logs_service.data["password"]
}
tls_config {
insecure_skip_verify = false
}
min_backoff_period = "500ms"
max_backoff_period = "5m"
max_backoff_retries = "10"
}
external_labels = {
"cluster" = "XXX",
"k8s_cluster_name" = "XXX",
}
}
remote.kubernetes.secret "logs_service" {
name = "logs-service-grafana-k8s-monitoring"
namespace = "default"
}
kind: ConfigMap
metadata:
annotations:
meta.helm.sh/release-name: grafana-k8s-monitoring
meta.helm.sh/release-namespace: default
creationTimestamp: "2025-06-02T14:40:27Z"
labels:
app.kubernetes.io/managed-by: Helm
name: grafana-k8s-monitoring-alloy-logs
namespace: default
resourceVersion: "19925"
uid: 03b128f9-e1af-4a30-9b71-e9e3baf53cee
Analysis & Questions
The configuration is being corrupted somewhere in the template processing pipeline:
- OpenTofu's file() function → Helm's --set mechanism → ConfigMap generation (sketched below)
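For context, this is roughly how the configuration is injected today. It is a simplified sketch: the resource name, the task_analysis.alloy file name, and the alloy-logs.extraConfig values key are placeholders, not the chart's verified schema.
resource "helm_release" "monitoring" {
  name       = "grafana-k8s-monitoring"
  repository = "https://grafana.github.io/helm-charts"
  chart      = "k8s-monitoring"
  namespace  = "default"

  # Placeholder key: whichever chart value is meant to carry the extra Alloy config.
  set {
    name  = "alloy-logs.extraConfig"
    # The whole multi-line Alloy snippet is read with file() and then has to survive
    # Helm's --set parsing, which treats commas and dots specially.
    value = file("${path.module}/task_analysis.alloy")
  }
}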
Possible causes:
- String escaping issues in the Helm template processing
- Size limitations in Helm's --set parameter handling
- Special character conflicts (quotes, newlines, etc.) between OpenTofu and Helm
- YAML parsing issues when complex multi-line strings are processed
Request for Help
Has anyone encountered similar truncation issues when passing complex Alloy configurations through Helm charts?
Specific questions:
- Is there a better way to inject large configuration blocks into Helm charts?
- How can I get this configuration working, first without stage.match and then with stage.match, so my workflow runs end to end?
- Should I use ConfigMap files instead of --set parameters? (see the sketch after this list)
- Are there known limitations with OpenTofu's file() function and Helm integration?
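To make the values-vs-set question concrete, the alternative I have in mind would look roughly like this. Again a sketch: the alloy-logs.extraConfig key and the file name are placeholders I have not validated against the chart.
resource "helm_release" "monitoring" {
  name       = "grafana-k8s-monitoring"
  repository = "https://grafana.github.io/helm-charts"
  chart      = "k8s-monitoring"
  namespace  = "default"

  # Pass the Alloy snippet through a rendered values document instead of --set,
  # so it travels as a YAML scalar and never goes through --set's escaping rules.
  values = [
    yamlencode({
      "alloy-logs" = {
        extraConfig = file("${path.module}/task_analysis.alloy") # placeholder key and file name
      }
    })
  ]
}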
Any insights on the proper way to handle this template → Helm → ConfigMap workflow would be greatly appreciated!