I’ve been trying to ingest my http logs from haproxy. I’ve setup a regex that should supposedly be able to parse that and create seperate labels for all the fields. I’m sure I’m dumb and have missed something, but here is an example of the haproxy log.
2025-02-10T21:20:16.231041-07:00 ark-lb-02 haproxy[10698]: 10.69.200.80:37206 [10/Feb/2025:21:20:16.217] httpsfront~ Emby/ark-emby-02 0/0/0/12/13 200 31118 - - ---- 19/18/0/0/0 0/0 {node|emby.mysticturtles.com} "GET /Shows/1489112/Seasons HTTP/1.1"
Here is my config.alloy
livedebugging {
enabled = true
}
logging {
level = "debug"
format = "logfmt"
}
local.file_match "haproxy_files" {
path_targets = [{"__path__" = "/var/log/haproxy.log"}]
sync_period = "5s"
}
loki.source.file "log_scrape_haproxy" {
targets = local.file_match.haproxy_files.targets
forward_to = [loki.process.extract_haproxy_logs.receiver]
tail_from_end = true
}
loki.process "extract_haproxy_logs" {
stage.regex {
expression = `^(?P<timestamp>[\d\-T\:\.]+-[\d\:\+]+)\s+(?P<hostname>[\w\-\d]+)\s+(?P<process>[\w\-\d]+)\[(?P<pid>\d+)\]:\s+(?P<client_ip>[\d\.]+):(?P<client_port>\d+)\s+\[(?P<request_time>[^\]]+)\]\s+(?P<frontend>[\w\-~]+)\s+(?P<backend>[\w\-\/]+)\s+(?P<timing>(?:-?\d+\/)+-?\d+)\s+(?P<status_code>-?\d+)\s+(?P<response_size>\d+)\s+(?P<unknown1>[-\w]*)\s+(?P<unknown2>[-\w]*)\s+(?P<tcp_flags>[\w\-]+)\s+(?P<connection_info>[\d\/]+)\s+(?P<queue_info>[\d\/]+)\s+{(?P<host_url>[^}]+)}\s+"(?P<http_method>\w+)\s+(?P<url>[^\s]+)\s+(?P<http_version>HTTP\/[\d\.]+)"`
}
forward_to = [loki.write.grafana_loki.receiver]
}
loki.write "grafana_loki" {
endpoint {
url = "URL_FOR_LOKI"
}
}
I do see the logs inside Loki, but I only see the filename label.
I have been testing the regex https://re2js.leopard.in.ua/ and it does seem to work and select all the fields correctly