Hello, I have this loki.process:
loki.process "syslog" {
  forward_to = [loki.write.syslog.receiver]
  stage.logfmt {
    mapping = {src_ip = "src_ip"}
  }
  stage.logfmt {
    mapping = {dst_ip = "dst_ip"}
  }
  stage.logfmt {
    mapping = {dst_port = "dst_port"}
  }
  stage.logfmt {
    mapping = {log_subtype = "log_subtype"}
  }
  stage.drop {
    source = "msg"
    value = "failed to decode logfmt"
  }
  // Extract src components
  stage.regex {
    expression = "src=(?P<src_ip>\\d+\\.\\d+\\.\\d+\\.\\d+):(?P<src_port>\\d+)(?::(?P<src_if>[^:\\s]+))?(?::(?P<src_fqdn>[^:\\s]+))?"
  }
  // Extract dst components
  stage.regex {
    expression = "dst=(?P<dst_ip>\\d+\\.\\d+\\.\\d+\\.\\d+):(?P<dst_port>\\d+)(?::(?P<dst_if>[^:\\s]+))?(?::(?P<dst_fqdn>[^:\\s]+))?"
  }
  // Remove src=... and dst=... from the message
  stage.replace {
    expression = "src=[^ ]+"
    replace = ""
  }
  stage.replace {
    expression = "dst=[^ ]+"
    replace = ""
  }
  stage.drop {
    expression = "dst=[^ ]+"
  }
  /*
  // Set labels
  stage.labels {
    values = {
      src_ip = "",
      dst_ip = "",
      src_port = "",
      dst_port = "",
      src_if = "",
      dst_if = "",
      src_fqdn = "",
      dst_fqdn = "",
    }
  }
  */
  // Drop empty labels
  stage.match {
    selector = "{src_if=\"\"}"
    stage.label_drop {
      values = ["src_if"]
    }
  }
  stage.match {
    selector = "{dst_if=\"\"}"
    stage.label_drop {
      values = ["dst_if"]
    }
  }
  stage.match {
    selector = "{src_fqdn=\"\"}"
    stage.label_drop {
      values = ["src_fqdn"]
    }
  }
  stage.match {
    selector = "{dst_fqdn=\"\"}"
    stage.label_drop {
      values = ["dst_fqdn"]
    }
  }
  //stage.match {
  // selector = "{dst=~\".+\"}"
  // action = "drop"
  //}
  /*
  stage.match {
    selector = "{dst_fqdn=\"\"}"
    stage.label_drop {
      values = ["dst_fqdn"]
    }
  }
  stage.match {
    selector = "{src_if=\"\"}"
    action = "drop"
  }
  stage.match {
    selector = "{dst_if=\"\"}"
    action = "drop"
  }
  stage.match {
    selector = "{src_fqdn=\"\"}"
    action = "drop"
  }
  stage.match {
    selector = "{dst_fqdn=\"\"}"
    action = "drop"
  }
  */
  stage.label_drop {
    values = ["src", "dst", "src_if", "dst_if", "src_fqdn", "dst_fqdn", "src_ip", "dst_ip", "dst_port", "log_subtype"]
  }
I have tried in vain to delete the labels dst and src, but even dropping empty labels such as dst_if does not work. I have tried several things without success.
Is it possible that stage.structured_metadata, although it is at the end, is processed before the delete actions, so that they have no effect?
Here is an example log (everything works so far except for the deletion):
Jan 3 13:45:36 192.168.5.1 id=firewall sn=000SERIAL time="2007-01-03 14:48:06" fw=1.1.1.1 pri=6 c=262144 m=98 msg="Connection Opened" n=23419 src=2.2.2.2:36701:WAN dst=1.1.1.1:50000:WAN proto=tcp/50000
Tony, thank you for your quick reply.
Yes, you were right, something was cut off incorrectly.
The task is actually simple.
After parsing dst and src with regex, the dst and src labels from the log line should be deleted, as well as the empty labels that are created when parsing with regex, e.g. dst_if or dst_fqdn.
How can I achieve this?
I have tried different variants without success. One could think that loki.process does not run the stages in order, but that the values are transferred to stage.structured_metadata immediately after they are extracted, before they are actually deleted. This is just a guess.
loki.process "syslog" {
  forward_to = [loki.write.syslog.receiver]
  /*
  stage.logfmt {
    mapping = {src = "src"}
  }
  stage.logfmt {
    mapping = {dst = "dst"}
  }
  */
  stage.logfmt {
    mapping = {src_ip = "src_ip"}
  }
  stage.logfmt {
    mapping = {dst_ip = "dst_ip"}
  }
  stage.logfmt {
    mapping = {dst_port = "dst_port"}
  }
  stage.logfmt {
    mapping = {log_subtype = "log_subtype"}
  }
  stage.drop {
    source = "msg"
    value = "failed to decode logfmt"
  }
  // Extract src components
  stage.regex {
    expression = "src=(?P<src_ip>\\d+\\.\\d+\\.\\d+\\.\\d+):(?P<src_port>\\d+)(?::(?P<src_if>[^:\\s]+))?(?::(?P<src_fqdn>[^:\\s]+))?"
  }
  // Extract dst components
  stage.regex {
    expression = "dst=(?P<dst_ip>\\d+\\.\\d+\\.\\d+\\.\\d+):(?P<dst_port>\\d+)(?::(?P<dst_if>[^:\\s]+))?(?::(?P<dst_fqdn>[^:\\s]+))?"
  }
  // Remove src=... and dst=... from the message
  stage.replace {
    expression = "src=[^ ]+"
    replace = ""
  }
  stage.replace {
    expression = "dst=[^ ]+"
    replace = ""
  }
  stage.drop {
    expression = "dst=[^ ]+"
  }
  /*
  // Set labels
  stage.labels {
    values = {
      src_ip = "",
      dst_ip = "",
      src_port = "",
      dst_port = "",
      src_if = "",
      dst_if = "",
      src_fqdn = "",
      dst_fqdn = "",
    }
  }
  */
  // Drop empty labels
  stage.match {
    selector = "{src_if=\"\"}"
    stage.label_drop {
      values = ["src_if"]
    }
  }
  stage.match {
    selector = "{dst_if=\"\"}"
    stage.label_drop {
      values = ["dst_if"]
    }
  }
  stage.match {
    selector = "{src_fqdn=\"\"}"
    stage.label_drop {
      values = ["src_fqdn"]
    }
  }
  stage.match {
    selector = "{dst_fqdn=\"\"}"
    stage.label_drop {
      values = ["dst_fqdn"]
    }
  }
  //stage.match {
  // selector = "{dst=~\".+\"}"
  // action = "drop"
  //}
  /*
  stage.match {
    selector = "{dst_fqdn=\"\"}"
    stage.label_drop {
      values = ["dst_fqdn"]
    }
  }
  stage.match {
    selector = "{src_if=\"\"}"
    action = "drop"
  }
  stage.match {
    selector = "{dst_if=\"\"}"
    action = "drop"
  }
  stage.match {
    selector = "{src_fqdn=\"\"}"
    action = "drop"
  }
  stage.match {
    selector = "{dst_fqdn=\"\"}"
    action = "drop"
  }
  */
  stage.label_drop {
    values = ["src", "dst", "src_if", "dst_if", "src_fqdn", "dst_fqdn", "src_ip", "dst_ip", "dst_port", "log_subtype"]
  }
  // Add structured metadata
  stage.structured_metadata {
    values = {
      src_ip = "src_ip",
      dst_ip = "dst_ip",
      dst_port = "dst_port",
      src_if = "src_if",
      dst_if = "dst_if",
      dst_fqdn = "dst_fqdn",
      src_fqdn = "src_fqdn",
    }
  }
}
I tried your configuration with the sample log provided, and it wasn’t working for me at all. Your configuration also seems to be a bit too complicated. Several things:
- You don’t actually need all the logfmt stages; your sample log doesn’t have any of those fields (src_ip, dst_ip, dst_port, log_subtype).
- You already commented out the labels stage, so none of the label_drop stages after it do anything useful.
- You have stages that replace src and dst in the log line. My personal preference has always been to not alter the original logs unless there is a very good reason to do so, so I’d recommend against this.
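To illustrate the second point, here is a minimal sketch (not the original poster's full config) of how these stages seem to interact. It assumes, as I understand the loki.process documentation, that stage.regex writes its named groups into the shared extracted map rather than into the label set, that stage.label_drop only removes labels, and that stage.structured_metadata reads its values from the extracted map. The component name "example" is a placeholder, and the regex is shortened for readability.

```alloy
loki.process "example" {
  forward_to = [loki.write.syslog.receiver]

  // Named capture groups land in the extracted map, not in the label set.
  stage.regex {
    expression = `dst=(?P<dst_ip>\d+\.\d+\.\d+\.\d+):(?P<dst_port>\d+)`
  }

  // Acts on labels only. dst_ip was never promoted with stage.labels,
  // so under the assumptions above this stage has nothing to remove.
  stage.label_drop {
    values = ["dst_ip"]
  }

  // Looks up dst_ip in the extracted map, so the value is still attached.
  stage.structured_metadata {
    values = {
      dst_ip = "dst_ip",
    }
  }
}
```

If that reading is right, the stages do run in order; the simplest way to keep a key out of structured metadata is to leave it out of the values map of stage.structured_metadata.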
This is what worked for me. I tested using the following sample log (one fabricated to test the fqdn and if regex):
Jan 3 13:45:36 192.168.5.1 id=firewall sn=000SERIAL time="2007-01-03 14:48:06" fw=1.1.1.1 pri=6 c=262144 m=98 msg="Connection Opened" n=23419 src=2.2.2.2:36701:WAN dst=1.1.1.1:50000:WAN proto=tcp/50000
Jan 3 13:45:36 192.168.5.1 id=firewall sn=000SERIAL time="2007-01-03 14:48:06" fw=1.1.1.1 pri=6 c=262144 m=98 msg="Connection Opened" n=23419 src=2.2.2.2:36701:WAN:some_fqdn dst=1.1.1.1:50000:WAN:some_fqdn proto=tcp/50000
Config:
loki.process "process_logs" {
  forward_to = [<FORWARDER>]
  stage.regex {
    expression = `src=(?P<src_ip>\d+\.\d+\.\d+\.\d+):(?P<src_port>\d+)(?::(?P<src_if>[^:\s]+))?(?::(?P<src_fqdn>[^:\s]+))?`
  }
  stage.regex {
    expression = `dst=(?P<dst_ip>\d+\.\d+\.\d+\.\d+):(?P<dst_port>\d+)(?::(?P<dst_if>[^:\s]+))?(?::(?P<dst_fqdn>[^:\s]+))?`
  }
  // Add structured metadata
  stage.structured_metadata {
    values = {
      src_ip = "src_ip",
      dst_ip = "dst_ip",
      dst_port = "dst_port",
      src_if = "src_if",
      dst_if = "dst_if",
      dst_fqdn = "dst_fqdn",
      src_fqdn = "src_fqdn",
    }
  }
}
Thank you for the good analysis.
I have one more thing to say. Apart from the example log, which comes from a SonicWall, there are also logs from Sophos firewalls that contain exactly these labels, hence the mapping with logfmt. As already mentioned, it works with both log types (SonicWall and Sophos) except for the deletion problem. I would like to understand why, regardless of whether you manipulate or delete the original log data (I agree with you there), you cannot delete these labels.
device_name="SFW" timestamp="2025-04-17T15:37:34+0200" device_model="X" device_serial_id="1111111" log_id="010202601001" log_type="Firewall" log_component="Invalid Traffic" log_subtype="Denied" log_version=1 severity="Information" fw_rule_id="N/A" nat_rule_id="0" fw_rule_type="NETWORK" ether_type="IPv4 (0x0800)" src_ip="15.17.234.131" src_country="R1" dst_ip="18.19.8.73" dst_country="R1" protocol="TCP" src_port=40620 dst_port=5274 hb_status="No Heartbeat" message="Could not associate packet to any connection." app_resolved_by="Signature" app_is_cloud="FALSE" qualifier="New" log_occurrence="1"
Do you have a reliable way to differentiate between those two logs? If so, I’d recommend parsing just enough to determine which is which, setting a label, and then parsing them separately with stage.match.
For example:
<do some parsing to set firewall_type>
stage.match {
  selector = "{firewall_type=\"sonicwall\"}"
  <REST OF STAGES>
}
stage.match {
  selector = "{firewall_type=\"sophos\"}"
  <REST OF STAGES>
}
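As a variation on the outline above, here is a hedged sketch that skips the intermediate firewall_type label and lets stage.match tell the two formats apart with a LogQL line filter. It rests on assumptions that are not confirmed in this thread: that every entry already carries a label such as job="syslog" (a hypothetical label, substitute one your source really sets), that the device_name= field seen in the Sophos sample never appears in SonicWall lines, and that values extracted inside a match block remain available to the stages after it. The component name "split_by_firewall" is also a placeholder.

```alloy
loki.process "split_by_firewall" {
  forward_to = [loki.write.syslog.receiver]

  // Sophos: lines containing device_name= are already logfmt.
  stage.match {
    selector = `{job="syslog"} |= "device_name="`

    stage.logfmt {
      mapping = {
        src_ip   = "src_ip",
        dst_ip   = "dst_ip",
        dst_port = "dst_port",
      }
    }
  }

  // SonicWall: everything without device_name= is parsed with the regexes.
  stage.match {
    selector = `{job="syslog"} != "device_name="`

    stage.regex {
      expression = `src=(?P<src_ip>\d+\.\d+\.\d+\.\d+):(?P<src_port>\d+)(?::(?P<src_if>[^:\s]+))?(?::(?P<src_fqdn>[^:\s]+))?`
    }
    stage.regex {
      expression = `dst=(?P<dst_ip>\d+\.\d+\.\d+\.\d+):(?P<dst_port>\d+)(?::(?P<dst_if>[^:\s]+))?(?::(?P<dst_fqdn>[^:\s]+))?`
    }
  }

  // Shared final step for both formats.
  stage.structured_metadata {
    values = {
      src_ip   = "src_ip",
      dst_ip   = "dst_ip",
      dst_port = "dst_port",
    }
  }
}
```

The line filter keeps each parser away from lines it cannot handle, which should also avoid logfmt decode errors on the SonicWall format.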
Thanks again for the help.
Unfortunately, there is no unique identifier in the log files of the different firewalls. To me this looks very much like a bug. Regardless of whether it makes sense to manipulate the labels, it must be possible to delete them as in my example.
My example is really very straightforward and should not be a problem for Alloy.
I see that your other log has a field device_name="SFW". Can this be used as the differentiator?
If not, you can simply pass both logs through both the logfmt and regex stages, and the empty keys won’t get set in structured metadata.
Config I used:
stage.logfmt {
  mapping = {
    dst_ip = "dst_ip",
    dst_port = "dst_port",
    src_ip = "src_ip",
  }
}
stage.regex {
  expression = `src=(?P<src_ip>\d+\.\d+\.\d+\.\d+):(?P<src_port>\d+)(?::(?P<src_if>[^:\s]+))?(?::(?P<src_fqdn>[^:\s]+))?`
}
stage.regex {
  expression = `dst=(?P<dst_ip>\d+\.\d+\.\d+\.\d+):(?P<dst_port>\d+)(?::(?P<dst_if>[^:\s]+))?(?::(?P<dst_fqdn>[^:\s]+))?`
}
// Add structured metadata
stage.structured_metadata {
  values = {
    src_ip = "src_ip",
    dst_ip = "dst_ip",
    dst_port = "dst_port",
    src_if = "src_if",
    dst_if = "dst_if",
    dst_fqdn = "dst_fqdn",
    src_fqdn = "src_fqdn",
  }
}
Result:
If this is not your intent please share what outcome you’d like to see.
Hi Tony,
In your second example (SonicWall) the label dst_fqdn is present, but this is not always the case. If the label is missing in the log, an empty one is automatically generated by parsing with regex. How do I deal with this?
There is nothing inherently wrong with sending an empty label to Loki. Loki will just ignore a label if it has an empty value.