I want to calculate standard deviations of some CSV data stored in Loki:
logcli instant-query '
stddev_over_time(
  {job=....}
    | regexp `^(?P<hhm>09:[345]|1[0-5])[^,]*,MinMaxAvg,(?P<category>[^,]{4})[^,]*,(?P<name>[^,]+),(?P<count>[^,]+)`
    | unwrap count [4d]
) by (job, category, name)
'
However, with larger amounts of data it just fails with EOF:
....+unwrap+avg+%5B4d%5D+%29+by+%28name%2C+hour%29%0A++++++++&time=1726569532005034620": EOF
2024/09/17 06:39:35 Query failed: run out of attempts while querying the server
How can I “parallelize” the standard deviation calculation over multiple days?
I am bad at statistics, but as I understand it I have to get variances and counts for each day:
count=() variance=()
for i in $(seq 0 3); do
  count[i]=$(logcli instant-query "count_over_time( $QUERY offset ${i}d )")
  variance[i]=$(logcli instant-query "stdvar_over_time( $QUERY offset ${i}d )")
done
standard_deviation= # ?? how to calculate this ??
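Here is a rough sketch of what I think the combination step would look like, assuming each per-day query has already been reduced to a single number (the numbers below are made up). As far as I understand, combining per-day variances also needs the per-day means (e.g. from avg_over_time), not just the counts: the overall variance is sum(n_i*(v_i + m_i^2))/N - M^2, with N = sum(n_i) and M = sum(n_i*m_i)/N.

# Rough sketch (made-up numbers): combine per-day count/mean/variance into one stddev.
count=(1200 1100 1300 1250)      # hypothetical per-day sample counts (count_over_time)
mean=(4.2 4.5 3.9 4.1)           # hypothetical per-day means (avg_over_time)
variance=(0.30 0.25 0.40 0.35)   # hypothetical per-day population variances (stdvar_over_time)

# Overall variance: V = sum(n_i*(v_i + m_i^2))/N - M^2, N = sum(n_i), M = sum(n_i*m_i)/N
standard_deviation=$(
  for i in "${!count[@]}"; do
    echo "${count[$i]} ${mean[$i]} ${variance[$i]}"
  done | awk '{ n += $1; s += $1*$2; q += $1*($3 + $2*$2) }
              END { m = s/n; printf "%.6f\n", sqrt(q/n - m*m) }'
)
echo "overall stddev over 4 days: $standard_deviation"

If Loki's stddev_over_time behaves like the Prometheus function of the same name (population standard deviation), this should match what a single query over the full range would return.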
Thank you
How heavy are your logs?
- In order to parallelize queries in Loki you’d need to deploy either in simple scalable mode or distributed mode, see Loki deployment modes | Grafana Loki documentation
- Then you’ll want to configure the query frontend, and the queriers to connect to it (see the sketch after this list), see Query frontend example | Grafana Loki documentation
After that you can scale the queriers up and down depending on your needs.
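As a rough illustration of the second point (not taken from the linked page, so double-check against your Loki version): the queriers point at the query frontend through the frontend_worker block, where the address below is a placeholder for your query-frontend’s gRPC host and port.

frontend_worker:
  # placeholder address; use your query-frontend's gRPC listen address
  frontend_address: query-frontend:9095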
Hi @tonyswumac, hope you are well!
How heavy are your logs?
If I query all the logs line by line, it’s about 30MB of logs.
There are a lot of logs; the regexp and the additional filtering reduce this down from gigabytes of logs.
In order to parallelize queries
I do not care about speed; I need the query to execute at all. Right now Loki just closes the connection with Query failed: run out of attempts while querying the server. I tried increasing query_timeout: 5m, but it still closes the connection after around 1 minute. Is there anything else I can increase, any other limits_config setting in Loki?
scale the queriers up and down depending on your needs.
In my company I have one machine for Loki, and I will only ever have one machine. Because of how Loki is constructed and how you recommend S3, from time to time I contemplate running a local MinIO instance just so that Loki works better with S3.
Right now Loki uses the local file system, and it can use as much CPU as it wants. I am running Memcached instances to use more memory: in normal operation Loki itself only wants to use about 2GB of the 120GB on the machine, while memcached-chunk uses 20GB (still around 90GB free).
But CPU is not the issue; I/O is the bottleneck, I will not move away from this machine, and with S3 the I/O would be the same problem on the same disk.
Performance, however, is not an issue: I just want the query to execute at all, and I can wait 10 minutes for it.
This is not a use case for us, so I am kinda guessing a bit as well. Here are some things you can try:
- Increase all timeout-related settings, such as:
http_config.idle_conn_timeout
limits_config.query_timeout
server.http_server_read_timeout
server.http_server_read_header_timeout
server.http_server_idle_timeout
You can hit the /config endpoint to print out all configuration options, and search for timeout (see the example after this list).
- If you have a lot of CPUs, try tweaking parallelism so you have more queries running at the same time. For example:
querier:
  max_concurrent: 20

limits_config:
  split_queries_by_interval: 2h
Try to give it more concurrency, while increasing the time the query is split by, and see if you can reach a happy number.
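Regarding the /config endpoint mentioned above: to find all of these settings at once, you can dump the effective configuration and filter it, assuming Loki’s HTTP port is the default 3100 (adjust host and port to your setup):

# print the running configuration and search it for timeout settings
curl -s http://localhost:3100/config | grep -i timeout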