I want to calculate standard deviations of some CSV data stored in Loki:
logcli instant-query '
stddev_over_time(
  {job=....}
    | regexp `^(?P<hhm>09:[345]|1[0-5])[^,]*,MinMaxAvg,(?P<category>[^,]{4})[^,]*,(?P<name>[^,]+),(?P<count>[^,]+)`
    | unwrap count [4d]
) by (job, category, name)
'
However, with larger amounts of data it just fails with EOF:
....+unwrap+avg+%5B4d%5D+%29+by+%28name%2C+hour%29%0A++++++++&time=1726569532005034620": EOF
2024/09/17 06:39:35 Query failed: run out of attempts while querying the server
How can I “parallelize” the standard deviation calculation over multiple days?
I am bad at statistics, but as I understand it I have to get variances and counts for each day:
count=() variance=()
for i in $(seq 0 3); do
  count[i]=$(logcli instant-query "count_over_time( $QUERY offset ${i}d )")
  variance[i]=$(logcli instant-query "stdvar_over_time( $QUERY offset ${i}d )")
done
standard_deviation= # ?? how to calculate this ??
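Here is a rough sketch of what I think the combination step would look like, assuming each per-day query has already been reduced to a single number (the numbers below are made up). As far as I understand, combining per-day variances also needs the per-day means (e.g. from avg_over_time), not just the counts: the overall variance is sum(n_i*(v_i + m_i^2))/N - M^2, with N = sum(n_i) and M = sum(n_i*m_i)/N.

# Rough sketch (made-up numbers): combine per-day count/mean/variance into one stddev.
count=(1200 1100 1300 1250)      # hypothetical per-day sample counts (count_over_time)
mean=(4.2 4.5 3.9 4.1)           # hypothetical per-day means (avg_over_time)
variance=(0.30 0.25 0.40 0.35)   # hypothetical per-day population variances (stdvar_over_time)

# Overall variance: V = sum(n_i*(v_i + m_i^2))/N - M^2, N = sum(n_i), M = sum(n_i*m_i)/N
standard_deviation=$(
  for i in "${!count[@]}"; do
    echo "${count[$i]} ${mean[$i]} ${variance[$i]}"
  done | awk '{ n += $1; s += $1*$2; q += $1*($3 + $2*$2) }
              END { m = s/n; printf "%.6f\n", sqrt(q/n - m*m) }'
)
echo "overall stddev over 4 days: $standard_deviation"

If Loki's stddev_over_time behaves like the Prometheus function of the same name (population standard deviation), this should match what a single query over the full range would return.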
Thank you
How heavy are your logs?
- In order to parallelize queries in Loki you’d need to deploy either in simple scalable mode or distributed mode, see Loki deployment modes | Grafana Loki documentation
- Then you’ll want to configure the query frontend, and the queriers to connect to it (see the sketch after this list), see Query frontend example | Grafana Loki documentation
After that you can scale the queriers up and down depending on your needs.
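As a rough illustration of the second point (not taken from the linked page, so double-check against your Loki version): the queriers point at the query frontend through the frontend_worker block, where the address below is a placeholder for your query-frontend’s gRPC host and port.

frontend_worker:
  # placeholder address; use your query-frontend's gRPC listen address
  frontend_address: query-frontend:9095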
Hi @tonyswumac, hope you are well!
How heavy are your logs?
If I query all the logs line by line, it’s about 30MB of logs.
There are a lot of logs; the regexp and the additional filtering reduce this down from gigabytes of logs.
In order to parallelize queries
I do not care about speed; I need the query to execute at all. Right now Loki just closes the connection with Query failed: run out of attempts while querying the server. I tried increasing query_timeout: 5m, but it still closes the connection after around 1 minute. Is there anything else I can increase, any other limits_config setting in Loki?
scale the queriers up and down depending on your needs.
In my company I have one machine for Loki, and I will only ever have one machine. Because of how Loki is constructed and how you recommend S3, from time to time I contemplate running a local MinIO instance just so that Loki works better with S3.
Right now Loki uses the local file system, and it can use as much CPU as it wants. I am running Memcached instances to use more memory: in normal operation Loki itself only wants to use about 2GB of the 120GB on the machine, while memcached-chunk uses 20GB (still around 90GB free).
But CPU is not the issue; I/O is the bottleneck, I will not move away from this machine, and with S3 the I/O would be the same problem on the same disk.
Performance, however, is not an issue: I just want the query to execute at all, and I can wait 10 minutes for it.
This is not a use case for us, so I am kinda guessing a bit as well. Here are some things you can try:
- Increase all timeout-related settings, such as:
http_config.idle_conn_timeout
limits_config.query_timeout
server.http_server_read_timeout
server.http_server_read_header_timeout
server.http_server_idle_timeout
You can hit the /config endpoint to print out all configuration options, and search for timeout (see the example after this list).
- If you have a lot of CPUs, try tweaking parallelism so you have more queries running at the same time. For example:
querier:
  max_concurrent: 20

limits_config:
  split_queries_by_interval: 2h
Try to give it more concurrency, while increasing the time the query is split by, and see if you can reach a happy number.
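Regarding the /config endpoint mentioned above: to find all of these settings at once, you can dump the effective configuration and filter it, assuming Loki’s HTTP port is the default 3100 (adjust host and port to your setup):

# print the running configuration and search it for timeout settings
curl -s http://localhost:3100/config | grep -i timeout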