Loki works like a charm for log analysis; however, when it comes to building analytics dashboards for statistics, I run into performance problems.
I may be asking too much of it, considering I'm running the Ingester, Querier and Query Frontend on the same machine, but I'm wondering whether my queries could be optimized.
My use case is pretty simple: I have several nodes serving API requests and producing nginx logs. The nginx log format is tweaked a bit to capture request details (including the body, which is the most storage-intensive part) and geolocation (from MaxMind).
But the data volume is not that huge; consider this over a 24 h time range:
log size = 19 GB
log lines = 21 million
But a query like the following, which seems optimized to me, takes a very long time and causes many of my panels to time out:
```
sum by (geoip2_data_country_code) (count_over_time({job=~"nginx",node_name=~"node1|node2|node3|node4|node5|node6"} | json | geoip2_data_country_code != "" | __error__="" [$__interval]))
```
Over a 12-hour range, it takes 20.2 s to respond.
Details below.

Total request time = 20.2 s
Number of queries = 1
Total number rows = 80291

Data source stats:
Summary: bytes processed per second = 527 MB/s
Summary: lines processed per second = 552185
Summary: total bytes processed = 10.2 GB
Summary: total lines processed = 10654645
Summary: exec time = 19.3 s
Ingester: total reached = 64
Ingester: total chunks matched = 17
Ingester: total batches = 3057
Ingester: total lines sent = 1559731
Ingester: head chunk bytes = 0 B
Ingester: head chunk lines = 0
Ingester: decompressed bytes = 0 B
Ingester: decompressed lines = 0
Ingester: compressed bytes = 0 B
Ingester: total duplicates = 0
Is there any way to reduce this time by optimizing the query?
If not, what's the best infra setup to get it to respond faster?
Are there log lines that do not contain `geoip2_data_country_code`? If so, putting `|= "geoip2_data_country_code"` before the `json` parser should help: only logs that actually contain `geoip2_data_country_code` would then have to be parsed.
If all logs contain `geoip2_data_country_code`, then another possibility would be to filter out the empty values with a line filter like != `"geoip2_data_country_code": ""` (if that is how they are logged when empty).
The main thing would be to pass as few lines to the json parser as possible.
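For example, the query from the original post could be rewritten roughly like this (a sketch only: the exact empty-value string depends on how your nginx JSON renders the field, so adjust it to what actually appears in the log lines):

```
sum by (geoip2_data_country_code) (
  count_over_time(
    {job=~"nginx", node_name=~"node1|node2|node3|node4|node5|node6"}
      # keep only lines that mention the field at all
      |= "geoip2_data_country_code"
      # and drop lines where it is logged as empty (adjust to your exact formatting)
      != `"geoip2_data_country_code": ""`
      | json
      | geoip2_data_country_code != ""
      | __error__ = "" [$__interval]
  )
)
```

The line filters run before the `json` parser, so the expensive parsing only happens on lines that can actually contribute to the result.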
I don't personally run Loki on a single instance, but in general, if you are trying to optimize a single Loki instance, you want to write as few chunks to storage as you reasonably can (without keeping chunks in memory for too long, obviously), and you want your queries to be split as little as possible. That usually gives you the biggest gain.
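As a rough illustration of the knobs involved (the values are placeholders, not recommendations, and depending on your Loki version `split_queries_by_interval` lives under `limits_config` or under `query_range`):

```
ingester:
  chunk_target_size: 1572864   # aim for fuller chunks so fewer objects land in storage
  chunk_idle_period: 30m       # but don't hold an idle chunk in memory forever
  max_chunk_age: 2h            # hard cap on how long a chunk may stay open

limits_config:
  split_queries_by_interval: 30m   # a larger interval means fewer sub-queries per request
```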
With Loki you parse logs at query time; logs are just plain text strings until you add e.g. the `| json` parser in your query. There is nothing stopping you from using a line filter before parsing the logs, and if you put the filter string inside backticks you don't even have to worry about escaping any characters.
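For instance, these two line filters are equivalent (the string is just an illustration; use whatever literal text actually appears in your log lines):

```
# with backticks, no escaping needed
{job="nginx"} |= `"geoip2_data_country_code": "FR"` | json

# the same filter with double quotes requires escaping every inner quote
{job="nginx"} |= "\"geoip2_data_country_code\": \"FR\"" | json
```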