Using W3C Extended Log File Format With "Pattern" Parser

Greetings:

I am new to Loki and am trying to ingest some web server logs that are formatted in W3C Extended Log File format. My particular instance is tab-delimited. Here is an example of a log line:

192.168.0.20	2022-01-26	16:19:36	GET	/wls-exporter/metrics	200	0.003

Unfortunately, I am unable to add keys to this data, so Loki is unable to parse out the fields automatically with logfmt. I read about the Pattern parser in the documentation, but it doesn’t seem to work to get individual values out of my entries. Here are the patterns I’ve tried so far:

"<client-ip> <date> <time> <method> <uri> <status> <resp-time>"
"<client-ip> <_> <date> <_> <time> <_> <method> <_> <uri> <_> <status> <_> <resp-time>"
"<_> <client-ip> <date> <time> <method> <uri> <status> <resp-time>"

The first entry was the closest, but it’s putting all values into the “client_ip” field, which isn’t what I want:

client_ip	192.168.0.20 2022-01-26 16:19:36 GET /wls-exporter/metrics 200 0.003 

Can someone point me toward a way of getting the pattern parser to split the fields up, or explain what I’m doing wrong?

Thanks for the help!

I did manage to solve this issue.

To get the fields to split properly you need to specify the delimiter in the pattern parser (in this case \t):

pattern "<client_ip>\t<date>\t<time>\t<method>\t<uri>\t<status>\t<resp_time>"

This leaves the line-ending character (a carriage return, \r, in my case) on the end of the last field, however, so it needs to be stripped using label_format if you want to use that value in a metric query:

| label_format resp_time=`{{trimAll "\r" .resp_time }}`

After that everything was working as expected:

quantile_over_time(0.99, {job="demo_access_log"} !="wls-exporter" | pattern "<client_ip>\t<date>\t<time>\t<method>\t<uri>\t<status>\t<resp_time>" | label_format resp_time=`{{trimAll "\r" .resp_time }}` | unwrap resp_time | __error__="" [24h]) by (job)
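
To see what the tab-delimited pattern and the trim step are doing, here is a minimal Python sketch of the same idea outside Loki. It is only an illustration, not how Loki implements the parser: the helper names (parse_w3c_line, trim_cr) are hypothetical, and the sample line assumes the log uses CRLF line endings so that a "\r" is left on the last field.

```python
# Sample line from the post. The trailing "\r" is an assumption
# (what would be left over if the file uses CRLF line endings).
LINE = "192.168.0.20\t2022-01-26\t16:19:36\tGET\t/wls-exporter/metrics\t200\t0.003\r"

# Field names, in the same order as the captures in the pattern expression.
FIELDS = ["client_ip", "date", "time", "method", "uri", "status", "resp_time"]


def parse_w3c_line(line):
    """Split a tab-delimited line into named fields (hypothetical helper)."""
    return dict(zip(FIELDS, line.split("\t")))


def trim_cr(value):
    """Strip carriage returns from both ends, similar in spirit to trimAll "\\r"."""
    return value.strip("\r")


raw = parse_w3c_line(LINE)
cleaned = {name: trim_cr(value) for name, value in raw.items()}

# repr() makes the stray carriage return visible before and after trimming.
print(repr(raw["resp_time"]))
print(repr(cleaned["resp_time"]))
```

Splitting on a space instead of "\t" would return the whole line as a single value here, which is one way to picture why the space-separated patterns did not split the fields.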

Also, it appears that you can’t use the “-” character in a field name with the pattern parser. I only started getting results once I switched to “_” in the field names. I’m not sure whether that’s something I was doing wrong in the query or expected behavior, but it’s something to keep in mind.
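
One possible explanation, offered as an assumption rather than confirmed behavior: Prometheus-style label names may only contain letters, digits and underscores and may not start with a digit, and the pattern parser's capture names may follow the same rule. The sketch below checks candidate field names against that rule before they go into a pattern. The helper names are hypothetical and nothing here calls Loki itself.

```python
import re

# Prometheus-style label name rule: a letter or underscore first,
# then letters, digits or underscores. Applying it to pattern-parser
# capture names is an assumption, not verified against Loki.
LABEL_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_valid_label_name(name):
    """Return True if the whole name matches the label name rule."""
    return LABEL_NAME.fullmatch(name) is not None


def suggest_label_name(name):
    """Replace disallowed characters with "_" (hypothetical helper)."""
    cleaned = re.sub(r"[^a-zA-Z0-9_]", "_", name)
    # A leading digit is not allowed, so prefix an underscore in that case.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


for candidate in ["client-ip", "resp-time", "client_ip"]:
    print(candidate, is_valid_label_name(candidate), suggest_label_name(candidate))
```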

Thanks to everyone who viewed the post!