We recently upgraded to Kubernetes 1.24 and Promtail is no longer gathering logs in JSON format
I’m guessing this is due to dockershim being removed in 1.24, so the container runtime no longer writes the JSON file format Docker used to.
The symptom is that piping logs through the json filter results in an error:
{namespace=~".+"} | json
produces
pipeline error: 'JSONParserErr' for series: '{__error__="JSONParserErr", <<REDACTED>>, stream="stderr"}'. Use a label filter to intentionally skip this error. (e.g | __error__!="JSONParserErr"). To skip all potential errors you can match empty errors.(e.g __error__="") The label filter can also be specified after unwrap. (e.g | unwrap latency | __error__="" )
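For reference, the skip-the-error workaround the message suggests is just a label filter on the same query:
{namespace=~".+"} | json | __error__!="JSONParserErr"
That hides the lines that fail to parse, but obviously doesn’t get their fields extracted.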
I tried using the template stage to output JSON, but it seems template only produces a string, even if that string looks like JSON.
Per @dylanguedes1, Promtail does not support converting non-JSON logs to JSON. Since we are using EKS, keeping dockershim is not an option in 1.24 - see Amazon EKS ended support for Dockershim - Amazon EKS. I’m going to have to drop the json filter in favor of pattern in my dashboards and hope that in time Promtail might add this feature. I will link back a feature request once I create it.
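For anyone landing here later, a pattern query looks something like this - the line format and placeholder names below are made up and have to match whatever your containers actually log:
{namespace=~".+"} | pattern "<ts> <level> <msg>"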
What is your current Promtail pipeline_stages config?
We are not using Kubernetes v1.24 yet, so I have not faced this issue, but I would still expect container logs to keep their standard Docker log format.
And as far as I know, Loki always just stores plain strings. If your logs are in JSON or logfmt format, then you can parse them using the built-in parsers at query time. Of course Promtail can parse logs to extract fields to assign values to labels etc. but the logs stored in Loki are plain strings (except for the labels in the index).
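For example, if a container logs logfmt lines, something like this works entirely at query time (the namespace and level values here are just illustrative):
{namespace="myapp"} | logfmt | level="error"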
Based on the error message, it looks like the error logs (the stderr stream) are not structured JSON. You should be able to see that with a query like {namespace=~".+", stream="stderr"} without adding the | json parser.
The pipeline stages config I landed on (in the Helm values file) just uses cri:
pipelineStages:
- cri: {}
However, I’m not doing anything more with it, so I don’t think it really changes much. Tailing the container logs directly confirms they are not in JSON format at all. I was hoping that Promtail would be able to capture groups with a regex or similar and emit true JSON for Loki to parse with the json filter. I did try using template to produce a structured, JSON-formatted string, but Loki still failed to parse this with the json filter.
The pipeline stages I used to try outputting JSON-formatted strings looked roughly like this:
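(The regex expression and field names below are placeholders rather than the exact config.)
pipelineStages:
  - cri: {}
  # capture groups from the raw line - the expression here is a placeholder
  - regex:
      expression: '^(?P<level>\w+)\s+(?P<msg>.*)$'
  # build a JSON-looking string from the captured groups
  # note: quotes or backslashes inside msg would still break the JSON
  - template:
      source: message
      template: '{"level":"{{ .level }}","msg":"{{ .msg }}"}'
  # replace the log line with that string
  - output:
      source: message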
However, this seemed to make no difference. I assumed that was because it was still a string rather than JSON under the hood, but there may be more to it, such as the timestamp not being included.
As these are Kubernetes logs I would definitely try using docker instead of cri as your first stage. That will not magically turn your logs into JSON but it gives you the basic log, time and stream fields parsed.
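In the Helm values that would just be:
pipelineStages:
  # parses Docker json-file lines and pulls out log, stream and time
  - docker: {}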
From years of working on the team that looks after the centralised logging platform we use, I am a firm believer in having the application code format the logs into whatever format they should be. It might be technically possible to do it in the agent or some later ingestion stage, depending on what system you use, but that is much less efficient and more prone to parsing errors than instrumenting logging in the application code.
Maybe I’m misunderstanding how cri and docker are used with Kubernetes. For example, I was under the impression that the docker stage was specifically for Docker-formatted logs, which I assumed would go away once dockershim was removed. CRI, on the other hand, is a log format standardized for use with any container runtime - sourced from Logging Architecture | Kubernetes - so whether I’m using Docker or something else, it should all funnel into CRI with Kubernetes.
I admit that having Promtail generate JSON from raw logs is a workaround for what is ultimately a design decision by the Kubernetes authors. Maybe the question becomes whether this use case - preserving the stability of dashboards and monitoring built on JSON - outweighs the inefficiency and the risk of parsing errors. In our case it might not; we only have a couple of dashboards and alerts that use the json filter. However, considering that more and more people will be upgrading to K8s 1.24, there may be others with more entrenched usage of this filter.
I think you are right regarding the cri vs. docker stage. As I mentioned, this is not something I have encountered yet. Good to know what to prepare for.
Now, after reading up a bit on CRI, the format differences make sense: a CRI log line carries the same information as Docker’s json-file output - timestamp, stream and message - but as a plain space-delimited line rather than JSON. So it looks like cri is the one to use.
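For comparison, the raw lines on disk look roughly like this (timestamp and message made up). Docker’s json-file driver writes
{"log":"some log message\n","stream":"stderr","time":"2022-09-01T12:00:00.000000000Z"}
while a CRI runtime such as containerd writes
2022-09-01T12:00:00.000000000Z stderr F some log message
which is why the json filter has nothing to parse once dockershim is gone.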