Extract logs with labels, and also extract metrics with different labels

Hello,

I’ve been through the docs, trawled the forum and searched the web, and I feel I must be being a bit dim… I have logs from a pfSense firewall being sent via syslog, with Loki + Grafana all working fine. I’m new to this and really pleased with the investment in learning so far, so I decided to add some complexity and have come unstuck.

I’m parsing the logs and extracting GeoIP data (working fine), then refining down the labels to send with the log (also working fine). But when I added a metrics stage at the end, the metrics appeared with all of those labels set. I’ve been scratching my head for two days trying to figure out how to create metrics with either no labels or, preferably, just a couple of labels. I can have metrics with no labels, but then the log entry doesn’t get sent; or the log does get sent and the metrics carry a load of labels.

I’m hoping someone here can help me with a couple of questions, and sincere apologies in advance if this has been answered on the forum and I have failed to find the answer!

  • can I have a single match selector, with a regex that pulls named groups out into the extracted map, a geoip lookup to add labels (labeldropping the unneeded geoip labels), and then a couple of metrics created from other named groups in the regex?
  • is it possible to have two separate match entries in the promtail config that use the same selector, each with its own stages? (roughly what I have in mind is sketched just below)
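To make the second question concrete, the (untested) shape I have in mind is something like this: a first match block that only creates the counter, in the hope that it picks up just the labels the entry has at that point, and a second match block with the same selector holding the full extraction and labelling for the log line. The trimmed-down regex in the first block is purely illustrative.

    pipeline_stages:
      # block 1: metrics only, created before any extra labels are attached
      - match:
          selector: '{job="syslog", application="filterlog"}'
          stages:
            - regex:
                # illustrative, cut-down expression that only captures the decision
                expression: ',(?P<logreason>(deny|permit|match)),(?P<decision>(pass|block|reject)),'
            - metrics:
                decisions_total:
                  type: Counter
                  description: "total number of firewall decisions"
                  prefix: firewall_
                  max_idle_duration: 24h
                  source: decision
                  config:
                    match_all: true
                    action: inc
      # block 2: same selector again, carrying the full regex, timestamp, geoip,
      # labeldrop and labels stages from the config below, so the log entry
      # still goes out with its labels
      - match:
          selector: '{job="syslog", application="filterlog"}'
          stages:
            - regex:
                expression: '<the full filterlog expression from the config below>'
            # ...followed by the timestamp, geoip, labeldrop and labels stages as below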

I’ve tried different stage orders, dropping labels, etc., but maybe there is a trick to this that I’m too inexperienced to appreciate?

Thanks in advance for any guidance on this. It doesn’t feel like I’m trying to do anything magical (parse a log line, extract geo data, clean up labels and create a couple of metrics), but maybe I’m asking too much of Loki/Promtail?

Config snip (I know the regex has more groups than are used here):

    pipeline_stages:
      - match:
          selector: '{job="syslog", application="filterlog"}'
          pipeline_name: extractor
          stages:
            - regex:
                expression: '^(?s)\<\d+\>\d+ (?P<logtime>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\+\d{2}:\d{2}) pfSense filterlog \d+ - - (?P<rulenumber>\d*),(?P<subrulenumber>\d*),(?P<anchor>[a-zA-Z0-9]*),(?P<tracker>\d*),(?P<interface>[a-zA-Z0-9_\-\.]+),(?P<logreason>(deny|permit|match)),(?P<decision>(pass|block|reject)),(?P<direction>(in|out)),(?P<ipversion>[46]),(?P<tos>0x\d+),(?P<ecn>[x0-9]*),(?P<ttl>\d+),(?P<id>\d+),(?P<offset>\d+),(?P<flags>[a-zA-Z]*),(?P<protocolid>(6|17)),(?P<protocol>(tcp|udp)),(?P<length>\d+),(?P<sourceip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}),(?P<destinationip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}),(?P<sourceport>\d+),(?P<destinationport>\d+),(?P<datalength>\d*),?(?P<tcpflags>[A-Z]+)?,?(?P<seq>\d+).*$'
            - timestamp:
                format: RFC3339
                source: logtime
            - geoip: # populates the geoip_ labels, so labeldrop those of no interest...
                db: /usr/local/var/loki/GeoLite2-City.mmdb
                db_type: city
                source: sourceip
            - labeldrop:
                - geoip_postal_code
                - geoip_timezone
                - geoip_subdivision_name
                - geoip_subdivision_code
            - labels:
                rulenumber:
                interface:
                decision:
                direction:
                sourceip:
                destinationport:
            - metrics:
                decisions_total:
                  type: Counter
                  description: "total number of firewall decisions"
                  prefix: firewall_
                  max_idle_duration: 24h
                  source: decision
                  config:
                    match_all: true
                    action: inc
                pass_total:
                  type: Counter
                  description: "total number of firewall pass decisions"
                  prefix: firewall_
                  max_idle_duration: 24h
                  source: decision
                  config:
                    value: pass
                    action: inc
                block_total:
                  type: Counter
                  description: "total number of firewall block decisions"
                  prefix: firewall_
                  max_idle_duration: 24h
                  source: decision
                  config:
                    value: block
                    action: inc

Example metric:

    # HELP firewall_decisions_total total number of firewall decisions
    # TYPE firewall_decisions_total counter
    firewall_decisions_total{application="filterlog",decision="pass",destinationport="25",direction="in",facility="local0",geoip_city_name="Ashburn",geoip_continent_code="NA",geoip_continet_name="North America",geoip_country_name="United States",geoip_location_latitude="xxx",geoip_location_longitude="xxx",hostname="pfSense",interface="ix3",job="syslog",level="informational",procid="49139",rulenumber="141",source="remote",sourceip="xxxx"}

Cheers - Robert…

I have not tried to use promtail for metrics yet, so I may be speaking out of ignorance.

In general for Loki you want to have as few labels as possible, and you want to make sure the potential values of the labels are bounded (see Best practices | Grafana Loki documentation). With that in mind, using sourceip as a label is most likely a bad idea, but on the other hand I recognize the need for it for metrics.

I propose trying a different approach. Focus on sending clean logs from promtail to Loki, and then use the Loki ruler’s recording rules to continuously compute metrics from those logs and remote-write them to Prometheus. I think that would let your promtail pipeline worry only about logs, while you tweak the Loki query for the metrics you want in Prometheus.
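As a rough, untested sketch of the shape I mean (it assumes `decision` is kept as a stream label and that the ruler is configured with remote_write to your Prometheus):

    # loki ruler rule group; the file lives wherever your ruler storage points,
    # e.g. under the "fake" tenant directory in a single-tenant setup
    groups:
      - name: firewall
        interval: 1m
        rules:
          - record: firewall:decisions:rate1m
            expr: sum by (decision) (rate({job="syslog", application="filterlog"}[1m]))

The recorded series should then arrive in Prometheus with only the labels you aggregate by, so the label clean-up question moves out of the promtail pipeline entirely.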

Thanks for taking the time to reply. I get the cardinality thing, but my issue isn’t really about that; I’m explicitly trying not to put labels on the metrics…

In that case perhaps someone more familiar with the metrics part of promtail can comment. My suggestion may still be worth a try.

I’m also confused about the best way to combine GeoIP and firewall log data. I’m using similar geoip stages to do the decoration, keeping source_ip, geoip_location_latitude and geoip_location_longitude as fields on my logs, converting them to numbers and using them in the world map panel just fine.
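For reference, the sort of query feeding my map panel is roughly this (assuming the geoip values end up as labels on the stream, as in the example metric earlier in the thread), with a Grafana field-type transformation turning the latitude/longitude strings into numbers:

    sum by (geoip_location_latitude, geoip_location_longitude, geoip_city_name) (
      count_over_time({job="syslog", application="filterlog"}[$__range])
    )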

Now, every few days, my promtail clients get 429 responses from Loki saying there are too many requests for tenant/user “fake”. If I try to set different tenant_ids per client or per job, either Loki or promtail fails at some point. It also sounds like these items aren’t supposed to be extracted into fields/labels, yet that seems to be exactly what the geoip promtail stage is for?!
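For the per-client / per-job tenant attempts, this is roughly what I have been trying (the tenant names here are just placeholders):

    # promtail: one tenant for everything this client sends
    clients:
      - url: http://loki:3100/loki/api/v1/push
        tenant_id: firewall

    # ...or per job, using the tenant stage inside that job's pipeline
    pipeline_stages:
      - tenant:
          value: firewall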