JSONParserError on json formatter

sarasensible · February 7, 2023, 2:17pm

We recently upgraded to Kubernetes 1.24 and Promtail is no longer gathering logs in JSON format
I’m guessing this is due to Dockershim being deprecated in 1.24 such that Docker may no longer be outputting the JSON file format it used to.

The symptom is that outputting logs to the json formatter results in an error:

{namespace=~".+"} | json

produces

pipeline error: 'JSONParserErr' for series: '{__error__="JSONParserErr", <<REDACTED>>, stream="stderr"}'. Use a label filter to intentionally skip this error. (e.g | __error__!="JSONParserErr"). To skip all potential errors you can match empty errors.(e.g __error__="") The label filter can also be specified after unwrap. (e.g | unwrap latency | __error__="" )

I tried using template to output JSON but it seems that template only outputs strings, even though it might look like JSON.

      - cri: {}
      - regex:
          expression: "^(?P<content>.*)$"
      - template:
          source: content
          template: '{ "log": "{{ .Value }}" }'
      - output:
          source: content

How can I get the json formatter working again?

sarasensible · February 7, 2023, 9:53pm

Per @dylanguedes1 promtail does not support the feature of converting non-JSON logs to JSON. Since we are using EKS configuring our Kubernetes to use dockershim is not an option in 1.24 - see Amazon EKS ended support for Dockershim - Amazon EKS . I’m going to have to drop using the json filter in favor of pattern in my dashboards and hope that in time promtail might add this feature. I will link back a feature request once I create it.

sarasensible · February 8, 2023, 4:08pm

Feature request Promtail - Pipeline stage to convert non-JSON formatted logs to JSON formatted · Issue #8465 · grafana/loki · GitHub

b0b · February 9, 2023, 1:16pm

What is your current Promtail pipeline_stages config?

We are not using Kubernetes v1.24 yet so I have not faced this issue but I would still expect that container logs would still keep their standard Docker log format

And as far as I know, Loki always just stores plain strings. If your logs are in JSON or logfmt format, then you can parse them using the built-in parsers at query time. Of course Promtail can parse logs to extract fields to assign values to labels etc. but the logs stored in Loki are plain strings (except for the labels in the index).

Based on the error message, it looks like error logs (the stderr stream) is not structured JSON. You should be able to see that with a query like {namespace=~".+", stream="stderr"} without adding the | json parser.

sarasensible · February 9, 2023, 3:03pm

@b0b Thanks so much for the response!

The pipeline stages config I landed on (from the Helm values file) just uses cri

    pipelineStages:
      - cri: {}

however I’m not outputting anything from it so I don’t think this really changes much. Based on tailing the logs from the containers they are not outputting in JSON format at all. I was hoping that Promtail would be able to capture the groups based on a regex or similar and output them in a true JSON format for Loki to parse with the json filter. I did try using template to output to a structured JSON formatted string, however Loki still failed to parse this with the json filter.

The pipeline stages I used to try outputting JSON formatted strings:

      - cri: {}
      - regex:
          expression: "^(?P<content>.*)$"
      - template:
          source: content
          template: '{ "log": "{{ .Value }}" }'
      - output:
          source: content

however this seemed to make no difference. I assumed it was because it was still a string rather than JSON under the hood, but there may be more to it, such as the timestamp not being included.

Any guidance would be greatly appreciated.

b0b · February 9, 2023, 3:51pm

As these are Kubernetes logs I would definitely try using docker instead of cri as your first stage. That will not magically turn your logs into JSON but it gives you the basic log, time and stream fields parsed.

From years of working in the team looking after the centralised logging platform we use, I am a firm believer in having the application code format the logs into whatever format they should be. It might be technically possible to do in the agent or some later ingestion stage, depending on what system you use, but that is so much less efficient and prone to parsing errors than instrumenting logging in the application code.

sarasensible · February 9, 2023, 5:49pm

Hmmm, I tried using the docker stage and the logs look less cleanly formatted than with the cri stage.

With docker stage:

2023-02-09T17:36:12.693584297Z stdout F e[32mINFOe[0m:     2023-02-09 17:36:12,693 :: 10.0.44.236:44434 - "e[1mGET /docs HTTP/1.1e[0m" e[32m200 OKe[0m

With cri stage:

e[32mINFOe[0m:     2023-02-09 17:36:24,373 :: 10.1.34.41:42946 - "e[1mGET /docs HTTP/1.1e[0m" e[32m200 OKe[0m

Maybe I’m misunderstanding how cri and docker are used with Kubernetes. For example, I was under the impression that the docker stage was specifically for use with docker-formatted logs which I assumed would go away once dockershim was deprecated. Whereas with CRI this is a log format that is standardized for use with any container runtime - sourced from Logging Architecture | Kubernetes - such that no matter whether I’m using docker or something else it should all funnel into CRI with Kubernetes.

I admit that having promtail generate JSON from raw logs is a workaround for what is ultimately a design decision on the part of the Kubernetes authors. Maybe the question becomes whether this use case of preserving stability of dashboards and monitoring built on JSON outweighs the risk of inefficiency and possible parsing errors. In our case it might not - we only have a couple of dashboards and alerts that use the json filter. However considering that more and more people will be upgrading to K8s 1.24 there may be others who have more entrenched usage of this filter.

b0b · February 10, 2023, 8:30am

I think you are right regarding the cri vs. docker stage. As I mentioned, this is not something I have encountered yet. Good to know what to prepare for.

Now, after reading up a bit on CRI, the output differences make perfect sense. The cri output is the same as the docker output but without time and stream. So looks like cri is the one to use.

system · February 10, 2024, 8:30am

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
What is the correct way to parse json logs in loki, promtail Grafana Loki	7	23523	August 14, 2023
Promtail stages docker and multiline Grafana Loki regex	2	4252	February 17, 2023
Why is the `json` parser failing? Grafana Loki	6	8515	September 17, 2022
Loki Cannot parse Json correct but is valid Grafana Loki	17	5450	December 15, 2023
Promtail not parsing logs Grafana Loki promtail	4	1857	February 3, 2024

JSONParserError on json formatter

Related topics