Getting metrics out of logs when the logline is a list of values

Bit of a loki newb here, wondering if what I am trying to do is even possible. Given a log line like this:
spamd[181303]: spamd: result: . 3 - DKIMWL_WL_MED,DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,HTTP_ESCAPED_HOST,RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_MSPIKE_H2,SHOPIFY_IMG_NOT_RCVD_SFY,SNF4SA,SONIC_BX_A2,SONIC_CA,SPF_HELO_NONE,URI_HEX scantime=1.2,size=64270,user=erg,uid=99,required_score=5.0,rhost=157.131.224.146,raddr=157.131.224.146,rport=49320,mid=vE-Z7BXnR-myBlPxIRcQsg@geopod-ismtpd-33,autolearn=disabled,shortcircuit=no

I want to create a metric that would allow me to see things like “Which rule has been hit the most in a time period”. The rules in this case are the comma-separated values like RCVD_IN_MSPIKE_H2 or URI_HEX. Seems like an incrementing counter, but I am not sure how to get that list of rules into the extract map, or how to then create a metric whose name comes from these extracted values?
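To make the goal concrete, here is a minimal Python sketch of the counting step done outside of Loki. It assumes the rule list is always the whitespace-delimited token that follows `result: <flag> <score> - `, as in the sample line above; the regex and the function names are illustrative, not part of promtail or spamd.

```python
import re
from collections import Counter

# Assumed layout: "result: <flag> <score> - <RULE,RULE,...> <key=value,...>"
RESULT_RE = re.compile(r"result: \S+ -?\d+(?:\.\d+)? - (\S+) ")


def extract_rules(line):
    """Return the rule names found in one spamd result line, or [] if none."""
    match = RESULT_RE.search(line)
    return match.group(1).split(",") if match else []


def count_rules(lines):
    """Tally how often each rule appears across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        counts.update(extract_rules(line))
    return counts
```

Calling `count_rules(lines).most_common(10)` on the lines from a given time window would then list the most frequently hit rules for that window.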

I actually ran into something similar recently, and I wasn’t able to find a solution. What I ended up doing was using logstash’s split function to split one log line into multiple lines. For example, assuming RCVD_IN_MSPIKE_H2 = rule1,rule2, the logline:

{...}RCVD_IN_MSPIKE_H2="rule1,rule2",{...}rhost=157.131.224.146,{...}

will then become:

```
{...}RCVD_IN_MSPIKE_H2="rule1",{...}rhost=157.131.224.146,{...}
{...}RCVD_IN_MSPIKE_H2="rule2",{...}rhost=157.131.224.146,{...}
```

There is the obvious implication of a large increase in log volume (every line turns into one line per value), but without a better solution this was all I could do.
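For anyone who wants to see the splitting idea without setting up logstash, here is a rough Python equivalent. It is only a sketch of the transformation described above, not how logstash implements it, and it assumes the field has the quoted `key="v1,v2"` shape from the example; `split_field` is a made-up helper name.

```python
import re


def split_field(line, key):
    """Return one copy of `line` per comma-separated value in key="v1,v2".

    Lines that do not contain the field are returned unchanged.
    """
    pattern = re.compile(re.escape(key) + r'="([^"]*)"')
    match = pattern.search(line)
    if match is None:
        return [line]
    head, tail = line[:match.start()], line[match.end():]
    # Rebuild the line once per value, leaving every other field untouched.
    return [f'{head}{key}="{value}"{tail}' for value in match.group(1).split(",")]
```

A line carrying N values comes back as N lines, which is where the extra log volume comes from.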

Couple of considerations I’d add:

  1. It’s still valuable, in my opinion, to maintain the “source of truth”, meaning I’d still keep the original log intact in a separate log stream (maybe use a different filename label or something).

  2. Given that the split logs exist solely for aggregation, you may want to strip out the fields that aren’t of interest, which could offset the increased log volume somewhat.
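Both considerations can be combined in one pass. The sketch below is an assumption-laden illustration: the stream names `raw` and `rules` are hypothetical, `rhost` is just an example of a field worth keeping, and the regex assumes the spamd line layout from the original question.

```python
import re

# Assumed layout: "result: <flag> <score> - <RULE,RULE,...> <key=value,...>"
RULES_RE = re.compile(r"result: \S+ -?\d+(?:\.\d+)? - (\S+) ")
RHOST_RE = re.compile(r"\brhost=([^,\s]+)")


def fan_out(line):
    """Return (stream, line) pairs: the untouched original plus slim per-rule lines."""
    records = [("raw", line)]  # consideration 1: keep the source of truth intact
    rules = RULES_RE.search(line)
    if rules is None:
        return records
    rhost = RHOST_RE.search(line)
    host = rhost.group(1) if rhost else "unknown"
    for rule in rules.group(1).split(","):
        # consideration 2: keep only the fields needed for aggregation
        records.append(("rules", f"rule={rule} rhost={host}"))
    return records
```

Each pair could then be shipped with the stream name as a label value, so the slim lines never mix with the originals.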

If someone has a more elegant solution I’d love to hear it too.

If that is the case, I will most likely write something in Perl to use as an exporter for the metrics. Splitting the logline would for sure solve my first problem when it comes to creating the extract map, but I think I would still need to define each metric in the config. Since I want a separate counter for each rule, I think I would need to define 1k+ individual metrics in the promtail config. I’ll look into logstash though.
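One way a custom exporter could avoid defining 1k+ metrics is to expose a single counter and carry the rule name as a label. The sketch below is in Python rather than Perl and only renders the Prometheus text exposition format from a tally; the metric name `spamd_rule_hits_total` and the label name `rule` are assumptions, and the HTTP serving part is left out.

```python
from collections import Counter


def render_metrics(counts, name="spamd_rule_hits_total"):
    """Render rule tallies as one labelled counter in Prometheus text format."""
    lines = [f"# TYPE {name} counter"]
    for rule, hits in sorted(counts.items()):
        # One sample per rule; the rule name is a label value, not a metric name.
        lines.append(f'{name}{{rule="{rule}"}} {hits}')
    return "\n".join(lines) + "\n"
```

With this shape, "which rule was hit the most" becomes a query over one metric grouped by the `rule` label. The trade-off is that every distinct rule creates its own series, so 1k+ rules means 1k+ series for that metric.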
