Hi there,
I’m having trouble extracting the log level from messages using a Loki processing pipeline. I based my config on the solution from “Extract log level via regex and set it as a label”, but it looks like the regex stage doesn’t match anything.
The logs come from JupyterHub single-user pods running in an on-premises Kubernetes cluster. Below is a simplified version of my configuration. I removed unrelated parts for clarity, but I can share the full config if needed.
livedebugging {
  enabled = true
}

discovery.kubernetes "jupyterhub" {
  ...
}

discovery.relabel "jupyterhub_logs" {
  targets = discovery.kubernetes.jupyterhub.targets
  ...
}

loki.source.kubernetes "jupyterhub_logs" {
  targets    = discovery.relabel.jupyterhub_logs.output
  forward_to = [loki.process.jupyterhub_logs.receiver]
}

loki.process "jupyterhub_logs" {
  stage.regex {
    expression = `\[(?P<level>[DIWE])`
  }

  stage.template {
    source   = "level_cleansed"
    template = `{{ default "I" .level }}`
  }

  stage.labels {
    values = { "level" = "level_cleansed" }
  }

  stage.static_labels {
    values = {
      cluster = "non-prod",
    }
  }

  forward_to = [loki.write.local.receiver]
}

loki.write "local" {
  ...
}
Here are some example log lines:
[I 2025-05-07 14:13:01.123 ServerApp] 200 GET /user/john%20doe/api/contents...
[W 2025-05-07 14:13:05.456 ServerApp] Slow response for GET /user/...
[E 2025-05-07 14:13:07.789 ServerApp] Failed to authenticate user...
[D 2025-05-07 14:13:12.345 ServerApp] Debugging request handler for...
I always get the default value “I” from the template stage, so I believe the regex doesn’t match anything. If I remove the default value from the template, the level label is missing completely.
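As a sanity check outside the pipeline, here is a minimal Python sketch that runs the same expression against the sample lines above. It assumes Python's re module treats this simple pattern the same way as the RE2 engine used by the regex stage; the names LEVEL_RE, SAMPLE_LINES and extract_level are made up for this illustration and are not part of my config.

```python
import re

# Same expression as in stage.regex. Assumption: for a pattern this simple,
# Python's `re` behaves like the RE2 engine used by the pipeline.
LEVEL_RE = re.compile(r"\[(?P<level>[DIWE])")

# The example log lines from this post, copied verbatim.
SAMPLE_LINES = [
    "[I 2025-05-07 14:13:01.123 ServerApp] 200 GET /user/john%20doe/api/contents...",
    "[W 2025-05-07 14:13:05.456 ServerApp] Slow response for GET /user/...",
    "[E 2025-05-07 14:13:07.789 ServerApp] Failed to authenticate user...",
    "[D 2025-05-07 14:13:12.345 ServerApp] Debugging request handler for...",
]


def extract_level(line):
    """Return the captured level letter, or None if the expression does not match."""
    match = LEVEL_RE.search(line)
    return match.group("level") if match else None


# One entry per sample line, in order.
levels = [extract_level(line) for line in SAMPLE_LINES]
print(levels)
```

In this isolated check the expression matches all four sample lines, so the pattern itself may not be the culprit. That only tests the regex, though, not what the regex stage actually receives or how the later stages read the extracted value.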
Any idea what could be wrong with the regex or the pipeline? Thanks in advance for your help.