Loki cannot parse JSON correctly, but the JSON is valid

Hi, I am getting the error “error”: “JSONParserErr” when I try to parse JSON. I checked that the JSON is valid, and I cannot find any other logs.

{
	"agent": "-",
	"endpoint": "/",
	"env": "test",
	"filename": "/var/log/nginx/access.log",
	"hostname": "test1",
	"http_method": "GET",
	"http_status": "200",
	"ip": "1.1.1.1",
	"job": "nginx",
	"response_time": "10"
}

Promtail regex

pipeline_stages:
    - match:
        selector: '{job="nginx"}'
        stages:
        - regex:
            expression: '...'
    - labels:
        ip:

Hi @xristoforosdeme ,

at least on my machine, jq also has problems parsing that JSON.

$ jq .
{
  "agent": "-",
  "env": "live",
  "filename": "/var/log/nginx/access.log",
  "hostname": "<>",
  "job": "nginx",
}
parse error: Expected another key-value pair at line 7, column 1

The problem is the trailing comma after "job": "nginx". Removing that makes jq happy.
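To illustrate the point, any strict JSON parser rejects a trailing comma the same way jq does; here is a quick check with Python's standard json module:

```python
import json

# A trailing comma after the last key-value pair is not valid JSON.
try:
    json.loads('{"job": "nginx",}')
except json.JSONDecodeError as err:
    print("parse error:", err)

# Without the trailing comma, the same object parses fine.
print(json.loads('{"job": "nginx"}'))
```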


How is this data generated?

Sorry, I removed some labels because they are sensitive. I edited the post to remove the ‘,’ on the last line.

I have Promtail running on the servers, and it sends logs to Loki.

Promtail has some regex stages.

Example:


pipeline_stages:
    - match:
        selector: '{job="nginx"}'
        stages:
        - regex:
            expression: '...'
    - labels:
        ip:

Can you validate your JSON here?

Or use Notepad++ with the JSON plugin.

I tested it and it shows as valid. I will update the post with the correct JSON and just remove the values.

Are your JSON logs coming in as a single line, or multiple lines like in your original post?

One line. I just copied it formatted to be more readable.

If the logs are JSON, why do you use regex in the stages? Could you not use json directly, something like this:

  # This stage is only going to run if the scraped target has a label of
  # "name" with a value of "jaeger-agent".
  - match:
      selector: '{name="jaeger-agent"}'
      stages:
      # The JSON stage reads the log line as a JSON string and extracts
      # the "level" field from the object for use in further stages.
      - json:
          expressions:
            level: level

From Pipelines | Grafana Loki documentation

The logs are not in JSON format; they are just plain text logs. The output from Grafana is JSON. Am I wrong about how to use the JSON format in Grafana?

From the OP, when you say “when I try to parse JSON”, what exactly does that mean?

This happens when you query your logs in Grafana Explore? Could you share your LogQL query, that would make it easier to see what is happening.

Original output from Grafana with:

{env="test", job="nginx"}

{ "agent": "-", "endpoint": "/", "env": "test", "filename": "/var/log/nginx/access.log", "hostname": "test1", "http_method": "GET", "http_status": "200", "ip": "1.1.1.1", "job": "nginx", "response_time": "10" }

Output with {env="test", job="nginx"} | json:

{ "__error__": "JSONParserErr", "agent": "-", "endpoint": "/", "env": "test", "filename": "/var/log/nginx/access.log", "hostname": "test1", "http_method": "GET", "http_status": "200", "ip": "1.1.1.1", "job": "nginx", "response_time": "10" }

My issue is this, but I just need to convert string values like http_status to integers. I also don’t understand why I have the issue with JSON.

Just using the json parser will not turn your http_status into an int. It is already valid JSON as a string, and parsing it through any JSON parser (e.g. jq) will keep http_status as a string. You should be able to fix that in your regex expression in your pipeline stage.
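To see this concretely, here is a quick check (Python used here instead of jq, but any JSON parser behaves the same way): parsing succeeds, yet the quoted values remain strings.

```python
import json

line = '{"http_status": "200", "response_time": "10"}'
parsed = json.loads(line)

# The values parse successfully but stay strings:
# JSON parsing never converts the string "200" into the number 200.
print(type(parsed["http_status"]))     # <class 'str'>
print(parsed["http_status"] == 200)    # False
print(parsed["http_status"] == "200")  # True
```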

You should be able to fix that in your regex expression in your pipeline stage.

But how can I fix it to convert it to an integer? I tried \d+, but it is still a string.

I would have to see your regex. You would have to find a way to leave out the double quotes around the value of http_status. Numeric data types in JSON are not quoted.

The logs don’t have double quotes.

Regex

^(?P<ip>([a-z0-9]+|\.|:)*) (?P<response_time>\d+) .*] "(?P<http_method>[A-Z]+) (?P<endpoint>\/.*) .*\" (?P<http_status>\d+) \d+ "(?P<agent>.*)" .*

Logs

1.1.1.1 10 - [15/Dec/2022:08:01:34 +0000] "GET /api HTTP/1.1" 200 893 "https://testing.test/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"
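For what it's worth, running that regex against the sample line (Python's re accepts the same named-group syntax as Go's RE2 for this expression) shows that every capture group comes out as a string, which is why http_status ends up quoted in the JSON output:

```python
import re

# The regex from the Promtail pipeline stage above, verbatim.
pattern = re.compile(
    r'^(?P<ip>([a-z0-9]+|\.|:)*) (?P<response_time>\d+) .*] '
    r'"(?P<http_method>[A-Z]+) (?P<endpoint>\/.*) .*\" '
    r'(?P<http_status>\d+) \d+ "(?P<agent>.*)" .*'
)

line = ('1.1.1.1 10 - [15/Dec/2022:08:01:34 +0000] "GET /api HTTP/1.1" 200 893 '
        '"https://testing.test/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
        'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"')

m = pattern.match(line)
# Regex captures are always strings; \d+ restricts what matches,
# it does not change the captured value's type.
print({k: type(v).__name__ for k, v in m.groupdict().items()})
print(m.group("http_status"))  # '200' (a string, not an int)
```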

That is annoying… I tried reading the docs to see if there is a way to drop the double quotes, but did not find anything. I’m definitely no regex specialist.

Just putting it out there, why do we still stick to Apache log format for Nginx?? I know you can customize it to whatever you want and that is what I will do going forward. Can’t decide on json or logfmt. Sorry, rant over…

Not entirely sure how to do it tbh. I have not needed to deal with this in Promtail or Loki yet.

Did a quick test in Grafana, and you could leave all the parsing to query time.

Example log

1.240.104.50 - - [15/Dec/2022:13:42:17 +0000] "GET / HTTP/2.0" 401 0 "https://1.19.226.215/" "Blackbox Exporter/0.20.0" 0 0.014 [system-monitoring-alertmanager-main-9093] [] 1.203.181.161:443 0 0.012 401 dfb7289acf1c01310c29450c8f93d664

This could be parsed in Grafana with:

{namespace="system-ingress"} | pattern "<ip> - - <_> \"<method> <path> <proto>\" <http_status> <rest>" | http_status > 400
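Applied to the nginx log format shown earlier in the thread, the same query-time approach could look something like this (a sketch only; the pattern tokens are my guesses based on the sample line and the regex above, untested):

```logql
{env="test", job="nginx"}
  | pattern `<ip> <response_time> - [<_>] "<http_method> <endpoint> <_>" <http_status> <_> "<referer>" "<agent>"`
  | http_status >= 400
```

The numeric comparison works because LogQL label filters coerce the extracted value to a number at query time, which sidesteps the string-vs-integer problem in the pipeline stage.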

What you need is described here

This stage uses the Go JSON unmarshaler, which means non-string types like numbers or booleans will be unmarshaled into those types. The extracted data can hold non-string values and this stage does not do any type conversions; downstream stages will need to perform correct type conversion of these values as necessary. Please refer to the template stage for how to do this.

Again, I don’t know exactly how it works, as I have not needed to do this myself yet.