Query goes from instant success to timeout failure at an arbitrary line limit under max_entries_limit_per_query

With a json parser attached to my query, it succeeds in about 100 ms with maxLines set to 4677, but fails at 4678. I have tried reading from several different log files to see if it is related to the data itself; the magic number stays the same, regardless of the number of bytes read, too.
However, if I remove the json parser from the query, I can get up to a different magic number: 8188.

At 8189, the same query fails with a timeout.

I have increased max_entries_limit_per_query to one million, which is why I can get over 5000 at all.
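To be clear about which knob is which: maxLines is the per-query line cap on the Grafana Loki data source, while max_entries_limit_per_query lives in Loki's limits_config. Roughly where each one sits (a sketch, not my exact files — the data source name, provisioning path, and values are just placeholders):

# Grafana data source provisioning, e.g. provisioning/datasources/loki.yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    jsonData:
      # the "Maximum lines" field in the data source settings UI
      maxLines: 5000

# Loki config, e.g. loki-config.yaml
limits_config:
  max_entries_limit_per_query: 100000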

My current setup is a Docker Compose network used for local development, which is why you see all my logs appear at once. I can provide config files as needed.
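Roughly, the compose network looks like this (image tags, service names, and paths here are illustrative placeholders rather than my exact files):

services:
  loki:
    image: grafana/loki:2.9.4
    command: -config.file=/etc/loki/local-config.yaml
    ports:
      - "3100:3100"
    volumes:
      # the config posted further down in this thread
      - ./loki-config.yaml:/etc/loki/local-config.yaml

  promtail:
    image: grafana/promtail:2.9.4
    command: -config.file=/etc/promtail/config.yml
    volumes:
      - ./promtail-config.yaml:/etc/promtail/config.yml
      - ./logs:/var/log/app:ro

  grafana:
    image: grafana/grafana:10.4.2
    ports:
      - "3000:3000"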

We’re getting a little over 20k log lines an hour per file right now, and in general we want to write queries over much longer periods of time, so this is a problem.

Please share your Loki configuration.

Here it is. I know it is getting read. It’s 99% a copy of whatever the default was in the container, but I added the limits_config section:

# https://grafana.com/docs/loki/latest/configure/#limits_config
auth_enabled: false

server:
  http_listen_port: 3100

limits_config:
  max_entries_limit_per_query: 100000
  # max_query_series: 5000
  # ingestion_rate_mb: 10000
  # ingestion_burst_size_mb: 1000
  max_query_length: 0
  max_query_parallelism: 32

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093

# By default, Loki will send anonymous, but uniquely-identifiable usage and configuration
# analytics to Grafana Labs. These statistics are sent to https://stats.grafana.org/
#
# Statistics help us better understand how Loki is used, and they show us performance
# levels for most users. This helps us prioritize features and documentation.
# For more information on what's sent, look at
# https://github.com/grafana/loki/blob/main/pkg/usagestats/stats.go
# Refer to the buildReport method to see what goes into a report.
#
# If you would like to disable reporting, uncomment the following lines:
#analytics:
#  reporting_enabled: false

1. Try adjusting the message size:

server:
  # 100 MB (104857600 bytes)
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600

2. Check your container metrics and see if you observe CPU or memory pressure.

3. Check the logs and see if your Loki container is producing any error messages.

#1 worked. Do you know why? Before, I didn’t see any memory pressure or errors in the log other than the same one from above.

It’s essentially the configuration for the maximum message size used when Loki components communicate with each other over gRPC.
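For reference, in a single-file setup like yours those settings just sit in the existing server block (values here are the same illustrative 100 MB from above; the defaults are around 4 MB):

server:
  http_listen_port: 3100
  # both default to roughly 4 MB; raise them together
  grpc_server_max_recv_msg_size: 104857600  # 100 MB
  grpc_server_max_send_msg_size: 104857600  # 100 MB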

I now run into a different issue when pulling large amounts of data. I upped max_entries_limit_per_query to 9,999,999, but it seems the message size eventually taps out. Can I just keep increasing the max message size and the timeout to arbitrarily large numbers, or is what you gave me the magic maximum? Is there a better way around this problem, like splitting into multiple messages or something?

How large is large?

If you are querying a lot of data, you should consider scaling Loki to multiple containers / instances.
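If you do keep raising limits on a single instance instead of scaling out, these are roughly the knobs involved (a sketch assuming a recent Loki version; check the option names against the docs for your release, and treat the values as illustrative only):

server:
  grpc_server_max_recv_msg_size: 104857600   # gRPC message size between components
  grpc_server_max_send_msg_size: 104857600
  http_server_read_timeout: 300s             # HTTP timeouts for long-running queries
  http_server_write_timeout: 300s

limits_config:
  max_entries_limit_per_query: 100000        # lines returned per query
  query_timeout: 5m                          # per-query timeout (lives in limits_config in newer versions)

There is no single magic maximum, though; each bump just moves the point where a very large result set runs into memory, timeout, or message-size pressure, which is why spreading the work across more instances (or narrowing the queries) scales better.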

We are more concerned with trends and past occurrences, meaning many months.

We ultimately decided to move to having NLog insert directly into Postgres, and have Grafana read that instead. With Promtail/Loki we couldn’t get enough granularity in the database, since entries are mostly just a log line and a timestamp, so queries going back that far had to return a ton of results so that further filtering could be done in Grafana.

I know we are moving a bit away from the live-look observability these tools were intended for. It’s just easier to write semi-elegant queries and aggregations in SQL than in LogQL. But we still like Grafana’s alerting and visualizations.

(this is mostly a note for future devs running into these problems)