How to rewrite label values?

We use promtail to scrape the systemd journal and label the logs by systemd unit. On some systems, a lot of session-<some-string>.scope units with very few logs are generated, which causes a lot of nearly empty log chunks to be created. How can we rewrite the label value based on a regex, so that unit=session-abc123xyz.scope becomes unit=session.scope?

This should be possible with the pipeline stages in promtail.

The regex stage can operate on incoming labels if you put the label key in the source field.

Use a new named capture group in your regex for just the part you want to keep.

Then use the labels stage to set the value of the initial label to the name of the named capture group from the regex:

  - regex:
      source: unit
      expression: (?P<new_unit>some regex to capture session-absfdsfds.scope into session.scope)
  - labels:
      unit: new_unit

Something like this. I wrote it quickly, so it might have errors.

As I'm thinking about this, it also might not work that well, because that regex would be hard to write, especially if you have a lot of different unit values.
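
To make that limitation concrete, here is a rough sketch of the regex approach with an actual expression filled in. The expression is an assumption for illustration, not something taken from a tested config. A named capture group can only return a contiguous piece of the input, so this would rewrite unit=session-abc123xyz.scope to unit=session rather than unit=session.scope:

```yaml
pipeline_stages:
  # Assumed expression: capture only the literal "session" prefix of
  # values shaped like session-<something>.scope.
  - regex:
      source: unit
      expression: '^(?P<new_unit>session)-.*\.scope$'
  # Copy the captured value back onto the unit label.
  - labels:
      unit: new_unit
```

Units that do not match the expression should produce no new_unit value, so their unit label would be left as it is, but that is worth checking against your promtail version.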

Another approach would be to use the match stage with a selector like {unit=~"session.*"}.

Then within the nested stages of that match stage:

  - template:
      source: new_unit
      template: "session.scope"
  - labels:
      unit: new_unit

I think this might work better.


The second attempt works great, thanks a lot! Full job config for anyone looking for a working solution:

- job_name: journal
  journal:
    labels:
      job: systemd-journal
  relabel_configs:
  - source_labels:
    - __journal__systemd_unit
    target_label: unit
  pipeline_stages:
  - match:
      selector: '{unit=~"session-.*\\.scope"}'
      stages:
      - template:
          source: merged_scope_unit
          template: session.scope
      - labels:
          unit: merged_scope_unit
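
A possibly simpler variant, offered as an untested sketch: promtail also has a static_labels stage, which sets a label to a fixed value. Assuming it overwrites an existing label of the same name in your promtail version, the template and labels pair inside the match stage could collapse into a single stage:

```yaml
pipeline_stages:
  - match:
      selector: '{unit=~"session-.*\\.scope"}'
      stages:
        # Untested assumption: static_labels replaces the existing unit value.
        - static_labels:
            unit: session.scope
```

If the label does not change with this variant, the template plus labels version above is the one confirmed to work in this thread.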