Dynamic parsing in Promtail

Hi. We are moving away from DataDog and there are still several rough edges in progress. The major features are already in place, but some minor ones are still missing.

We are getting NGINX JSON logs from the containers. We have this particular common field “args”, which holds the query-string portion of the HTTP URL.

i.e.: v=11&from=52.33877%2C4.87277&to=52.33812%2C4.8687&modes=wa_wal&onlyAllowModes=wa_wal&ws=1&includeStops=true

In DD we were getting this nice parsing out of the box:

I am trying to pull off a similar structure, but I am struggling, and I am starting to think there is no way to do it using Promtail/Loki alone?

Can it be detected on the fly, converted to JSON (or metadata?), and added back to the entry?
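To illustrate, using the sample args value above, the goal would be to expand the string into something like the following (a hypothetical target shape, not actual DD output; here the `%2C` escapes are shown decoded to commas, though keeping them encoded would also be fine):

```json
{
  "args": {
    "v": "11",
    "from": "52.33877,4.87277",
    "to": "52.33812,4.8687",
    "modes": "wa_wal",
    "onlyAllowModes": "wa_wal",
    "ws": "1",
    "includeStops": "true"
  }
}
```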

I can easily define this regex for instance:


but it cannot be applied in the regex pipeline stage, since that stage only works with named groups.

Is there any other way to achieve this dynamically? In fact, Promtail is quite short on dynamic label/field handling; the only related feature we are using is the labelmap relabel config. Not much else you can do, apparently?

Hi. Without knowing exactly what you hope to achieve, I want to say that dynamic labels with Loki can lead to performance issues.

Loki is not fully indexed storage. Parsing of logs happens at query time.

Also, what exactly is the difference between the Loki json parser format and DD? To me, the screenshot looks like a “normal” JSON-parsed log. Is that something the DD agent inserts into the log? Or do your logs in Loki also contain it, but it is displayed differently in the Grafana UI?

Hi, thanks for the quick answer.
I have changed the title since it was misleading. You are right, I am not trying to pull off dynamic labels, just to create dynamic JSON content so that it can later be parsed and treated as labels on the fly.

The DD screenshot shows how the out-of-the-box NGINX integration treats the args field. Logs are sent in raw JSON format, and args is just one more string field in the JSON entry, same as here. But there we get an NGINX integration that deals with those logs, parses them, and processes them with a multi-stage parser handling URLs, user agent, etc. In one of those steps this sub-JSON is created.

My original idea was to achieve something like that at the Promtail stage, and then just send the modified final JSON result to Loki, so there is no need for extra processing on the fly all the time. But I cannot find a way to parse that JSON field, obtain those params dynamically, and create a JSON structure to attach back to the original log line before sending it to Loki.

I see. I have not tried to do anything like that, so I’m of no help there, unfortunately. I guess you can’t configure Nginx to do it for you? Maybe with $args.

Apologies if I’m way off. I always try to have applications produce logs as close to the final format I want, if at all possible. Getting logging pipelines to parse logs correctly can be frustrating.


Yes, in fact the JSON field I want to parse comes from NGINX’s $args. There is also $arg_name, which I could use, but I would need to know beforehand which fields I want to log, and it would create an empty field for each arg not present in a given log line. Still, I agree with you on getting the application as close as possible to the final log format. But NGINX is the one causing us lots of trouble (not even mentioning NGINX’s primitive error logs here).

One in-between solution (a bit better than defining each $arg_name) would be to parse some specific fields using my above-mentioned regex


but with named groups for each arg I want to keep, and from that generate a static JSON structure. Not ideal, but maybe the closest I can get.
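As a rough sketch of that idea (untested; the regex and template stages are standard Promtail stages, but the extracted field names like `from_arg` and the output key `args_json` are made up here, and one regex stage per kept argument is assumed):

```yaml
pipeline_stages:
  - json:
      expressions:
        args: args
  # One regex stage per argument to keep (the regex stage requires named groups)
  - regex:
      source: args
      expression: '(?:^|&)from=(?P<from_arg>[^&]*)'
  - regex:
      source: args
      expression: '(?:^|&)to=(?P<to_arg>[^&]*)'
  # Assemble a static JSON structure from the extracted values
  - template:
      source: args_json
      template: '{"from":"{{ .from_arg }}","to":"{{ .to_arg }}"}'
```

The obvious downside is the one the thread already notes: every argument has to be listed statically, so anything not anticipated is dropped.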

Another approach could be to do it on the fly at query time. I am not a big fan of this, but maybe there is a way to get it done dynamically there. It also seems that the whole concept here revolves around this powerful idea of parsing on the fly.
A basic query I was playing with was:

{job="nginx", instance="my-server"} | json | __error__ = "" | args =~ ".* from=.*" | label_format selected_arg_label="{{ regexReplaceAll `(.*)from=(?P<from_group>[^&]+)(.*)` .args `${from_group}`}}"

This is probably a poor implementation for getting the from arg field and creating a label out of it on the fly. Help here is appreciated too; there must be a cleaner way to achieve this.
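One possibly cleaner variant (an untested sketch; it relies on the fact that LogQL parsers run against the raw log line, which here still contains the args string inside the JSON) would be a regexp parser with a named capture group instead of label_format:

```logql
{job="nginx", instance="my-server"}
  | json
  | __error__ = ""
  | regexp `from=(?P<from_arg>[^&"]+)`
  | from_arg != ""
```

The `[^&"]+` class stops at either the next argument separator or the closing quote of the JSON string; lines without a from argument are dropped by the final label filter.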

I think this is an interesting problem. It’s easy enough to parse with the pattern or regexp filter if the order of the arguments in the URL is static, but it isn’t (and should not be expected to stay static), so the parsing has to be dynamic.

Try something like this: we know the url field almost fits the logfmt parser, but not quite, because the separator is & instead of a space. What if we replace it first? Something like:

| json | line_format `{{ .url | replace "&" " " }}` | logfmt

See example here: LogQL Analyzer | Grafana Loki documentation
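To make this concrete with the sample args value from earlier (the query above uses `.url`; substitute whatever field actually holds the query string, e.g. `.args`), the replace turns the string into space-separated key=value pairs that logfmt can then extract. Note the values stay URL-encoded, e.g. `%2C` is not decoded:

```text
before: v=11&from=52.33877%2C4.87277&to=52.33812%2C4.8687&modes=wa_wal&onlyAllowModes=wa_wal&ws=1&includeStops=true
after:  v=11 from=52.33877%2C4.87277 to=52.33812%2C4.8687 modes=wa_wal onlyAllowModes=wa_wal ws=1 includeStops=true
logfmt then extracts: v="11", from="52.33877%2C4.87277", to="52.33812%2C4.8687", modes="wa_wal", onlyAllowModes="wa_wal", ws="1", includeStops="true"
```

One caveat with this trick: it would break for argument values that themselves contain literal spaces or `=` characters, though URL-encoded query strings normally don’t.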

Nice, that is slick! I never looked at logfmt as a possibility here, but it works like a charm.

Still a shame I cannot pull this off in Promtail, but nevertheless this works perfectly and does what I need.

Thanks a lot guys!

I think you can do roughly the same thing in Promtail, actually. Something like (mock logic):

  - json:

  - template:
      source: url
      template: '{{ Replace .Value "&" " " }}'

  - logfmt:
      source: url

  - labels:
      # set labels here

Do note that not all the arguments in your url would make good labels. I would consider perhaps turning them into structured metadata (disclaimer: I haven’t had the opportunity to explore this yet; see structured_metadata | Grafana Loki documentation).

Hi there. Thanks for the follow up.
I have been playing around a bit with your suggested Promtail pipeline.

To make the template work I had to add a -1; otherwise I think it uses a default of 0 and ends up replacing nothing.

Then, reading the docs, I am supposed to be able to define the logfmt stage without the mapping, so that all available fields get extracted. Well, I could not manage to achieve that. All I get is:

“invalid logfmt stage config: logfmt mapping is required”

A bug, maybe? Or am I missing something? I tried to define an empty/null mapping field, but no luck so far.

In any case, even getting those values on the fly is only half of the work. I don’t want to create labels (which, in the best scenario, have to be statically defined?); I’d like to put them back as JSON fields in the message (or as structured metadata, as you mentioned above; I haven’t tried that yet either). That would be my desired configuration for this case.
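For reference, a consolidated sketch of the Promtail side with the -1 fix applied, and an explicit logfmt mapping as a workaround for the “mapping is required” error (untested; the stage names are standard Promtail stages, the listed argument names are just examples, and the structured_metadata stage requires a Loki setup with structured metadata enabled):

```yaml
pipeline_stages:
  - json:
      expressions:
        args: args
  # Replace every & with a space; the final -1 means "replace all occurrences"
  - template:
      source: args
      template: '{{ Replace .Value "&" " " -1 }}'
  # Workaround for "logfmt mapping is required": list the wanted args explicitly
  - logfmt:
      source: args
      mapping:
        from:
        to:
        modes:
  # Attach the extracted values as structured metadata instead of labels
  - structured_metadata:
      from:
      to:
```

This still isn’t fully dynamic (the mapping is static), but it keeps high-cardinality values out of the label set while making them queryable.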