If the logs are JSON, why do you use regex in your stages? Could you not use the json stage directly, something like this:
```yaml
# This stage is only going to run if the scraped target has a label of
# "name" with a value of "jaeger-agent".
- match:
    selector: '{name="jaeger-agent"}'
    stages:
      # The JSON stage reads the log line as a JSON string and extracts
      # the "level" field from the object for use in further stages.
      - json:
          expressions:
            level: level
```
Just using the json stage will not turn your http_status into an int. It is already valid JSON as a string, and parsing it through any JSON parser (e.g. jq) will keep http_status as a string. You should be able to fix that in the regex expression in your pipeline stage.
I would have to see your regex. You would have to find a way to leave out the double quotes around the value of http_status. Numeric data types in JSON are not quoted.
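For what it's worth, if you are rebuilding the line with a template stage, something like this might work. A rough, untested sketch — the field names (`line`, `message`) are hypothetical, and I'm assuming a Promtail version where previously extracted keys are available inside template expressions:

```yaml
# Untested sketch: assumes an earlier regex stage already extracted
# "http_status" and "message" into the extracted map.
- template:
    source: line   # hypothetical key to hold the rebuilt log line
    # Leave {{ .http_status }} unquoted so it serializes as a JSON
    # number rather than a quoted string.
    template: '{"http_status": {{ .http_status }}, "message": "{{ .message }}"}'
# Replace the log line with the rebuilt JSON.
- output:
    source: line
```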
That is annoying… I've been reading the docs to see if there is a way to drop the double quotes but didn't find anything. I'm definitely no regex specialist.
Just putting it out there: why do we still stick to the Apache log format for Nginx?? I know you can customize it to whatever you want, and that is what I will do going forward. Can't decide between JSON and logfmt. Sorry, rant over…
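If anyone goes that route, something like this log_format should make Nginx emit native JSON with the status unquoted at the source, which would sidestep the whole problem. A sketch off the top of my head, not battle-tested — escape=json needs nginx 1.11.8+, and the field names are just examples:

```nginx
# Sketch only: $status and $body_bytes_sent expand to bare numbers,
# so they come out as JSON numbers, not quoted strings.
log_format json_access escape=json
  '{'
    '"time_local":"$time_local",'
    '"remote_addr":"$remote_addr",'
    '"request":"$request",'
    '"http_status":$status,'
    '"body_bytes_sent":$body_bytes_sent'
  '}';

access_log /var/log/nginx/access.log json_access;
```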
Not entirely sure how to do it tbh. I have not needed to deal with this in Promtail or Loki yet.
Did a quick test in Grafana, and you could leave all the parsing to query time.
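Something like this, assuming a selector like {job="nginx"} — the json parser extracts the fields at query time, and LogQL converts the label value for the numeric comparison, so it doesn't matter that http_status is a string in the log line:

```logql
{job="nginx"} | json | http_status >= 400
```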
This stage uses the Go JSON unmarshaler, which means non-string types like numbers or booleans will be unmarshaled into those types. The extracted data can hold non-string values, and this stage does not do any type conversions; downstream stages will need to perform correct type conversion of these values as necessary. Please refer to the template stage for how to do this.
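If I read that right, going the other way — a numeric field back to a string so it can be used as a label (label values must be strings) — would look something like this. My guess at it from the docs, not tested:

```yaml
# Untested sketch: the json stage leaves http_status as a number in
# the extracted map, and the template stage renders it back to a string.
- json:
    expressions:
      http_status: http_status
- template:
    source: http_status
    template: '{{ .Value }}'
# Promote the (now string) extracted value to a label.
- labels:
    http_status:
```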
Again, I don’t know exactly how it works, as I have not needed to do this myself yet.