How to keep only log lines that match a given regexp in Alloy and send them to Loki

I’m currently setting up Alloy to behave as Promtail in order to ship logs to Loki.

For the time being, my Alloy configuration (converted from Promtail) looks really simple:

logging {
  level  = "info"
  format = "logfmt"
}

local.file_match "system" {
	path_targets = [{
		__path__    = "/var/log/myapp-*.log",
		sync_period = "5s",
	}]
}

loki.source.file "system" {
	targets               = local.file_match.system.targets
	forward_to            = [loki.write.default.receiver]
}

loki.write "default" {
	endpoint {
		url = "<loki_endpoint>"
	}
	external_labels = {}
}

Unfortunately, the application's logs are not consistent: some lines come from another application that uses a different format:

create exclusive lock for repository
load indexes
check all packs
34 additional files were found in the repo, which likely contain duplicate data.
This is non-critical, you can run `restic prune` to correct this.
check snapshots, trees and blobs
[1:31] 100.00%  63 / 63 snapshots
no errors were found
2025/03/04 00:00:00 INFO  profile 'default': starting 'backup'
2025/03/04 00:51:51 INFO  profile 'default': finished 'backup'
2025/03/04 00:51:51 INFO  profile 'default': checking repository consistency
using temporary cache in /tmp/restic-check-cache-3813532076
create exclusive lock for repository
load indexes
check all packs
34 additional files were found in the repo, which likely contain duplicate data.
This is non-critical, you can run `restic prune` to correct this.
check snapshots, trees and blobs
[1:25] 100.00%  64 / 64 snapshots
no errors were found

The lines I want to keep match the following regexp:
^(\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2}) (\D*) (.*)$

I would like to:

  • filter out all the lines that do not match the regexp
  • turn the 3 capturing groups into labels
    • group 1: timestamp
    • group 2: log_level
    • group 3: message

Is there a way I can achieve that?
Unfortunately, the Alloy documentation is a bit overwhelming for a newcomer…

I would recommend not turning the timestamp and log message into labels (see Label best practices | Grafana Loki documentation). I also personally don’t like altering the original log lines, so the examples below reflect that preference.

Your logic in loki.process will probably be something like this:

  1. Regex parse the log, get timestamp and log_level
  2. Use a match block to find logs whose timestamp is empty, and drop them.
  3. Use a match block to find logs whose timestamp is not empty, then set the entry timestamp and the log_level label.

So it would be something like this (not tested):

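loki.process "local_p" {
  forward_to = [<FORWARD>]

  // Lines that do not match leave both extracted values unset.
  // \S+ stops at the first space, so only the level (e.g. INFO) is captured.
  stage.regex {
    expression = `^(?P<parsed_timestamp>\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2}) (?P<parsed_log_level>\S+)`
  }

  // stage.match selectors are LogQL stream selectors, which match on labels
  // rather than extracted values, so promote parsed_timestamp to a temporary label.
  stage.labels {
    values = {
      parsed_timestamp = "",
    }
  }

  // No parsed_timestamp label means the regex did not match: drop the line.
  stage.match {
    selector            = "{parsed_timestamp!~\".+\"}"
    action              = "drop"
    drop_counter_reason = "format_mismatch"
  }

  stage.match {
    selector = "{parsed_timestamp=~\".+\"}"

    stage.timestamp {
      source = "parsed_timestamp"
      format = "2006/01/02 15:04:05" // Go layout matching 2025/03/04 00:00:00
    }

    stage.labels {
      values = {
        log_level = "parsed_log_level",
      }
    }
  }

  // Remove the temporary label so the timestamp does not end up as a Loki label.
  stage.label_drop {
    values = ["parsed_timestamp"]
  }
}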
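
To plug this into the configuration from the question, the file source has to forward to the processing component, and the processing component forwards to the writer. A minimal, untested sketch, assuming the component names used above (loki.source.file "system", loki.process "local_p", loki.write "default") are kept as they are:

```alloy
// Sketch only: component names are taken from the question and the answer.
loki.source.file "system" {
	targets    = local.file_match.system.targets
	// Send raw lines to the processing pipeline instead of straight to Loki.
	forward_to = [loki.process.local_p.receiver]
}

loki.process "local_p" {
	// This takes the place of the <FORWARD> placeholder.
	forward_to = [loki.write.default.receiver]

	// ... stages as shown above ...
}
```

The local.file_match and loki.write blocks should not need any change for this.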