Log analytics performance: is my LogQL query optimized?

Loki works like a charm for log analysis; however, when it comes to building analytics dashboards for statistics, I run into performance problems.
I may be asking too much of it, considering I'm running the Ingester, Querier, and Query Frontend on the same machine, but I'm wondering whether my queries could be optimized.

My use case is pretty simple: I have several nodes serving API requests, each producing nginx logs. The nginx log format is tweaked to capture request details (including the body, which is the most storage-intensive part) and geolocation data (from MaxMind).

But the data volume is not that huge; consider these figures over a 24 h time range:

  • log size = 19 GB
  • log lines = 21 million

But a query like the following, which seems optimized to me, takes a very long time and causes many of my panels to time out:

sum by (geoip2_data_country_code) (count_over_time({job=~"nginx",node_name=~"node1|node2|node3|node4|node5|node6"} | json | geoip2_data_country_code != "" | __error__="" [$__interval]))

Over a 12-hour range, it takes 20.2 s to respond.
Details below:

Total request time: 20.2 s
Number of queries: 1
Total number of rows: 80291

Data source stats:
  • Summary: bytes processed per second = 527 MB/s
  • Summary: lines processed per second = 552185
  • Summary: total bytes processed = 10.2 GB
  • Summary: total lines processed = 10654645
  • Summary: exec time = 19.3 s
  • Ingester: total reached = 64
  • Ingester: total chunks matched = 17
  • Ingester: total batches = 3057
  • Ingester: total lines sent = 1559731
  • Ingester: head chunk bytes = 0 B
  • Ingester: head chunk lines = 0
  • Ingester: decompressed bytes = 0 B
  • Ingester: decompressed lines = 0
  • Ingester: compressed bytes = 0 B
  • Ingester: total duplicates = 0

Is there any way to reduce this time by optimizing the query?
If not, what's the best infrastructure setup to make it respond faster?

Thanks for your help!

Are there log lines that do not contain geoip2_data_country_code? If so, putting |= "geoip2_data_country_code" before the json parser should help: only the logs that actually contain geoip2_data_country_code would then have to be parsed.
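For example, a sketch reusing the matchers and interval from your query (untested, adjust as needed):

sum by (geoip2_data_country_code) (count_over_time({job=~"nginx",node_name=~"node1|node2|node3|node4|node5|node6"} |= "geoip2_data_country_code" | json | geoip2_data_country_code != "" | __error__="" [$__interval]))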

If all logs contain geoip2_data_country_code, then another possibility would be to filter out the empty values with a line filter like != `"geoip2_data_country_code": ""` (if that is how they are logged when empty).
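That version could look like this (again only a sketch — the exact filter string depends on how the JSON is actually serialized, e.g. whether there is a space after the colon):

sum by (geoip2_data_country_code) (count_over_time({job=~"nginx",node_name=~"node1|node2|node3|node4|node5|node6"} != `"geoip2_data_country_code": ""` | json | geoip2_data_country_code != "" | __error__="" [$__interval]))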

The main thing would be to pass as few lines to the json parser as possible.

What does your infrastructure look like in terms of resource allocation? How many readers / queriers do you have, and how much resource is allocated to them?

Thanks for the answers.

@b0b the geoloc data is in the JSON, so unfortunately I can't filter on it.
It doesn't change much anyway; most of the lines actually have it filled.

@tonyswumac I simply have one reader and one querier, on the same server. Resource allocation is not limited; I'm running the binaries directly.

I don't personally run Loki on a single instance, but in general, if you are trying to optimize for a single Loki instance, you want to write as few chunks to storage as you reasonably can (while obviously not keeping chunks in memory for too long either), and you want your queries to be split as little as possible. This usually gives you the biggest gain.
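As a rough illustration, these are the configuration knobs involved (a sketch with illustrative values, not recommendations — chunk settings live under ingester, and in recent Loki versions the query-splitting interval is set in limits_config):

ingester:
  chunk_idle_period: 2h        # flush chunks that stop receiving logs; avoids many tiny chunks
  max_chunk_age: 2h            # upper bound on how long a chunk can stay in memory
  chunk_target_size: 1572864   # target compressed chunk size in bytes (~1.5 MB)

limits_config:
  split_queries_by_interval: 24h   # larger interval = fewer sub-queries per request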

Please see my comments in another thread: Loki full CPU for a long time and need restart to work again - #6 by tonyswumac. There are some general recommendations there.


With Loki, you parse logs at query time. Logs are just plain text strings until you add e.g. the | json parser in your query, so there is nothing stopping you from using a line filter before parsing. Put the filter string inside backticks and you don't even have to worry about escaping any characters.
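For instance, these two line filters are equivalent (using a simplified selector; the exact JSON formatting of empty values is an assumption):

{job=~"nginx"} != "\"geoip2_data_country_code\": \"\""
{job=~"nginx"} != `"geoip2_data_country_code": ""`

The backtick form is treated as a raw string, so the embedded double quotes need no escaping.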