Queries Process Too Many Bytes

I used logcli to run a 1-hour metric query. The logcli --stats output states that the query processed 5.0GB of uncompressed bytes.

Store.DecompressedBytes: 5.0GB

This doesn’t make sense. My cluster as a whole ingests around 300KB per second, so even if the query pulled every single chunk ingested in the 1-hour range I’m querying, it would only need to process 1.08GB (300KB/s times 3,600 seconds). On top of that, I’m filtering the logs by the filename label.
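
For reference, the upper bound quoted above works out as follows. This is only a sketch of the arithmetic; it assumes decimal units (1KB = 1,000 bytes, 1GB = 10^9 bytes), and the variable names are made up for illustration.

```python
# Back-of-the-envelope check of the numbers in this post.
# Assumption: decimal units (1KB = 1,000 bytes, 1GB = 1e9 bytes).
INGEST_RATE_BYTES_PER_S = 300 * 1_000   # ~300KB/s for the whole cluster
QUERY_RANGE_S = 60 * 60                 # 1-hour query range
REPORTED_BYTES = 5.0e9                  # Store.DecompressedBytes from --stats

# Most the query could read if it pulled every chunk ingested in the range.
upper_bound_bytes = INGEST_RATE_BYTES_PER_S * QUERY_RANGE_S
overshoot = REPORTED_BYTES / upper_bound_bytes

print(f"upper bound: {upper_bound_bytes / 1e9:.2f}GB")
print(f"reported is {overshoot:.1f}x the upper bound")
```

In other words, the reported figure is roughly 4.6 times what the whole cluster ingested during that hour.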

Creating Grafana panels from Loki metric queries is unusable without caching because each query takes forever. I think a good place to start tuning is understanding why each query processes so much data.

I’m using boltdb-shipper as the index store and S3 as the chunk store. I suspected a problem with how index files are saved, namely that they might be pointing to larger-than-desired sets of data, so I added a compactor instance in an attempt to prevent duplicate chunks from being loaded by queries. This did help: the same query now reports processing 3.6GB of uncompressed bytes.

Store.DecompressedBytes: 3.6GB

Any idea how exactly chunks are loaded by queriers? Or any general thoughts on how to decrease query times?

We are running Loki on VMs: 2 query frontends, 3 queriers (with 12 CPUs each), 4 ingesters, 2 distributors, and 1 compactor.

auth_enabled: false

server:
  graceful_shutdown_timeout: 5s
  grpc_server_max_concurrent_streams: 1000
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600
  http_listen_port: 3100
  http_server_idle_timeout: 120s
  http_server_write_timeout: 1m

distributor:
  ring:
    kvstore:
      store: memberlist

ingester:
  chunk_encoding: snappy
  chunk_block_size: 262144
  chunk_target_size: 4000000
  chunk_idle_period: 15m
  lifecycler:
    heartbeat_period: 5s
    join_after: 30s
    num_tokens: 512
    ring:
      heartbeat_timeout: 1m
      kvstore:
        store: memberlist
      replication_factor: 3
    final_sleep: 0s
  max_transfer_retries: 60

ingester_client:
  grpc_client_config:
    max_recv_msg_size: 67108864
  remote_timeout: 1s

frontend:
  compress_responses: true
  log_queries_longer_than: 5s
  max_outstanding_per_tenant: 1024

frontend_worker:
  frontend_address: frontend:9096
  grpc_client_config:
    max_send_msg_size: 104857600
  parallelism: 12

limits_config:
  enforce_metric_name: false
  ingestion_burst_size_mb: 10
  ingestion_rate_mb: 5
  ingestion_rate_strategy: local
  max_cache_freshness_per_query: 10m
  max_global_streams_per_user: 10000
  max_query_length: 12000h
  max_query_parallelism: 256
  max_streams_per_user: 0
  reject_old_samples: true
  reject_old_samples_max_age: 168h

querier:
  query_ingesters_within: 2h

query_range:
  align_queries_with_step: true
  max_retries: 5 
  parallelise_shardable_queries: true
  split_queries_by_interval: 30m

schema_config:
  configs:
    - from: 2020-06-10
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: test_
        period: 24h

compactor:
  working_directory: /opt/loki/compactor
  shared_store: s3

storage_config:
  aws:
    bucketnames: bucket_names
    endpoint: endpoint
    region: region
    access_key_id: mysecret_key_id
    secret_access_key: mysecret_access_key
    http_config:
      idle_conn_timeout: 90s
      response_header_timeout: 0s
      insecure_skip_verify: true
    s3forcepathstyle: true
  boltdb-shipper:
    active_index_directory: /opt/loki/boltdb-shipper-active
    cache_location: /opt/loki/boltdb-shipper-cache
    shared_store: s3
 
memberlist:
   abort_if_cluster_join_fails: false
   bind_addr:
     - the_bind_ip_address
   bind_port: 7946
   join_members:
     - ip_address:7946
     - of_all_the_loki_components:7946

Hi There,

I think I am in the same boat for now. You mentioned that you are running the Loki cluster on VMs. Is that a local build, or a K8s/dockerized environment?

We are running distributed Loki directly on the VMs (executing a binary).

We solved the slow query times in a few steps.

  1. We were saving logs in an on-premises, S3-compatible object store. However, we were getting only around 30mb/s of download speed (this is simply what our hardware can deliver). That isn’t fast enough to run metric queries quickly, so we switched to storing data in a Cassandra cluster we deployed. Cassandra stores its data on disk, which gave us much better performance.

  2. This step relates directly to the question I posted. Loki builds up log data in memory inside chunks, and each chunk belongs to a log stream (a specific combination of labels). At query time Loki pulls the entire log stream, so it processes more data than is relevant to the query. The solution was to use a regex in Promtail to extract fields at send time, which yields more specific log streams (less data processed) and reduces the processing work needed at query time. The reason we didn’t do this from the start is that Loki recommends using static labels as much as possible, since extracting a field like an IP address can build up a lot of log streams in memory, leading to a big index and long query times across the cluster. In our use case, however, extracting very specific fields on very specific targets didn’t have a negative effect.
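
The trade-off in step 2 can be sketched with a little arithmetic. Loki creates one stream per unique combination of label values, so the worst-case number of streams is the product of the number of distinct values of each label. The label names and counts below are hypothetical and only meant to show the order of magnitude, not measurements from our cluster.

```python
from math import prod

def max_streams(distinct_values_per_label):
    """Worst case: every combination of label values actually occurs."""
    return prod(distinct_values_per_label.values())

# Hypothetical counts, not taken from the cluster in this thread.
static_labels = {"job": 1, "host": 10, "filename": 2}
with_method = {**static_labels, "http_method": 5}         # small, bounded field
with_client_ip = {**static_labels, "client_ip": 50_000}   # unbounded field

print(max_streams(static_labels))    # 20
print(max_streams(with_method))      # 100
print(max_streams(with_client_ip))   # 1000000
```

A bounded field multiplies the stream count by a small constant, while something like an IP address can multiply it by tens of thousands, which is why the choice of extracted fields matters so much.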

If you are a heavy dashboard user and are looking to use Loki mostly for this feature, I would recommend using the ELK stack instead. Loki can’t compete, at least in my experience, with ELK’s speed when it comes to metric queries.
The reason we went with Loki is that we are already heavily invested in Prometheus, so most of our observability dashboards use metrics from Prometheus, while Loki will mostly be used in the Explore tab in Grafana.

Hope this helps.

Thank you so much for the details.

I have a few points of confusion, if you do not mind checking them out:

  1. Regarding the 2nd point in your answer, does that mean that the more labels are created in the config file, the more chunks will be created in the store, since the number of combinations grows?
  2. I am also running a local build on a VM by following https://github.com/grafana/loki/tree/main/production/docker. The querier, ingester and distributor all run in the same Loki instance on 1 VM. How do you configure the number of each of them, for example 3 queriers, 4 ingesters (replication factor?) and 2 distributors?
  3. We are still in the research stage. ELK will be the alternative if Loki does not perform as expected. Thanks for the suggestion.

  1. Yes, exactly! Logs from a single host forwarding 2 files to Loki under 1 job name will be sent to 2 separate chunks in the ingesters, one per stream: {job="job_name", instance="instance_name", filename="filename1"} and {job="job_name", instance="instance_name", filename="filename2"}.
    We were collecting logs from an HTTP load balancer. The relevant VMs sending logs only had the following labels: job, host and filename (static labels). Due to poor performance, we used a regex in Promtail to extract the http_method field. This had 2 effects. First, we were able to query logs with a specific http_method, which reduced the amount of logs processed by a query (we queried only GET requests or only POST requests). Second, and this is the more obvious benefit, we no longer had to extract the field at query time, which decreased query execution time.
    Keep in mind that more chunks mean a larger index, which leads to slower cluster-wide query times, so you have to be careful about which fields you extract and how often you do so. Please refer to this article, which explains it better than I can: Best practices | Grafana Labs

  2. I think I misunderstood what you meant when you said local build. We have a separate VM for every Loki component: 3 ingester VMs, 3 querier VMs, 2 query-frontend VMs, 2 distributor VMs and 1 table-manager VM. We don’t have access to a K8s cluster to deploy Loki. On every VM we run a Linux service that executes the Loki binary with the proper target flag.

  3. I wouldn’t recommend using Loki without Prometheus. I think the main benefit of Loki for us was the seamless integration with the Prometheus ecosystem. Prometheus is an amazing piece of software and I would definitely recommend checking it out if you haven’t already.
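
As a rough illustration of point 1, here is a toy model in Python. It is not Loki’s actual storage code. It groups a few made-up load-balancer lines into streams by label set and compares how many bytes a label-matching query has to read with and without an extracted http_method label. The label values, log lines and function names are all hypothetical.

```python
from collections import defaultdict

# Made-up access-log lines from a single file on a single host.
BASE_LABELS = {"job": "lb", "host": "lb-1", "filename": "/var/log/access.log"}
LINES = [
    "GET /index.html 200",
    "GET /health 200",
    "POST /login 302",
    "GET /static/app.js 200",
    "POST /upload 201",
]

def build_streams(lines, extract_method):
    """Group lines into streams keyed by their full label set."""
    streams = defaultdict(list)
    for line in lines:
        labels = dict(BASE_LABELS)
        if extract_method:
            # Stand-in for a Promtail regex stage that promotes a field to a label.
            labels["http_method"] = line.split(" ", 1)[0]
        streams[frozenset(labels.items())].append(line)
    return streams

def bytes_read(streams, matchers):
    """Bytes in every stream whose labels satisfy all equality matchers."""
    total = 0
    for key, stream_lines in streams.items():
        labels = dict(key)
        if all(labels.get(name) == value for name, value in matchers.items()):
            total += sum(len(line) for line in stream_lines)
    return total

static = build_streams(LINES, extract_method=False)
extracted = build_streams(LINES, extract_method=True)

# Without the label, the whole stream is read and filtered at query time.
all_bytes = bytes_read(static, {"filename": "/var/log/access.log"})
# With the label, only the stream that matches is read.
post_bytes = bytes_read(extracted, {"http_method": "POST"})
print(len(static), len(extracted), all_bytes, post_bytes)
```

With only static labels there is a single stream and a query for POST requests has to read all of it. With the extracted label there are two streams and the same query reads only the POST one, at the cost of one extra stream in the index.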

Thanks.

Regarding the 2nd point, I deployed a new environment with 2 query-frontend VMs as well. I have one question: since the 2 frontends are running on different VMs, how should I set frontend_address in the config file?

This was actually a choke point for us.

Since we aren’t using K8s, we had to deploy another VM that acts as a load balancer (and a second one for failover). We use HAProxy as the load balancer.

We use 1 load balancer to distribute traffic from Promtail to the distributors and to the query frontends. You can configure a single HAProxy to distribute traffic to 2 backend groups according to the URL that was requested.

While we use round robin as the load-balancing method for our distributors, we distribute traffic to our frontends in an active-passive fashion (only 1 is active). No part of the documentation I could find supports this, so we might be wrong, but we found that with round robin between 2 frontends a single query uses only half of the queriers’ available cores. The connected-workers metric on each frontend also showed that half of the cluster’s querier cores were connected to each frontend. Since we have a fairly small deployment, we decided we want every core available to every query, so we went the active-passive way. I guess the case is different for larger deployments.
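
The numbers behind that observation can be sketched as follows. This rests on two assumptions that are not confirmed by the documentation: that each querier opens `parallelism` worker connections in total, and that the load balancer splits those connections evenly across the active frontends. The function name is made up for illustration.

```python
# Values from this thread: 3 queriers, frontend_worker.parallelism = 12.
QUERIERS = 3
PARALLELISM = 12

def workers_per_frontend(active_frontends):
    """Workers connected to each frontend, assuming an even split."""
    total_workers = QUERIERS * PARALLELISM
    return total_workers // active_frontends

print(workers_per_frontend(2))  # round robin across 2 frontends
print(workers_per_frontend(1))  # active-passive, 1 active frontend
```

Under those assumptions a query handled by one of 2 round-robin frontends can be spread over 18 workers at most, while with a single active frontend it can use all 36.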

Thanks for the reply.

I tried load balancing with nginx for the 2 frontends running on 2 VMs. I did not use downstream_url but frontend_worker instead, since there is a discussion in another topic suggesting that this approach could bring more parallelism.
However, the configuration is not as simple as expected. As you know, frontend_worker should point to the gRPC port of the frontend, so what about port 3100 for HTTP queries?

I posted this in another topic, https://community.grafana.com/t/how-to-configure-nginx-for-frontend-worker/52306. Your suggestions are very welcome.

By the way, I set up memberlist for the ingesters and distributors and configured nginx for them as well. That looks fine for now.
