With a json parser attached to my query, it succeeds with 4677 lines in about 100ms, but with 4678 as maxLines it fails. I have tried reading from several different log files to see if the source file matters; the magic number stays the same, regardless of the number of bytes read too.
However, if I remove the json parser from the query, I can get up to a different magic number: 8188.
Failure at 8189:
I have increased max_entries_limit_per_query to one million, which is why I can get over the default of 5000.
My current setup is a Docker Compose network used for local development, which is why you see all my logs appear at once. I can provide config files as needed.
We’re getting a little over 20k log lines an hour per file right now, and in general we want to write queries over much longer periods of time, so this is a problem.
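For reference, the compose file is roughly the usual three-container setup sketched below; the service names, image tags, and paths are simplified placeholders, not my exact files:

services:
  loki:
    image: grafana/loki:2.9.0
    command: -config.file=/etc/loki/local-config.yaml
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yaml:/etc/loki/local-config.yaml
  promtail:
    image: grafana/promtail:2.9.0
    command: -config.file=/etc/promtail/config.yml
    volumes:
      - ./promtail-config.yaml:/etc/promtail/config.yml
      - ./logs:/var/log/app:ro
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    depends_on:
      - loki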
Please share your Loki configuration.
Here it is. I know it is getting read. It’s 99% a copy of whatever the default was in the container, but I added the limits_config section.
# https://grafana.com/docs/loki/latest/configure/#limits_config
auth_enabled: false

server:
  http_listen_port: 3100

limits_config:
  max_entries_limit_per_query: 100000
  # max_query_series: 5000
  # ingestion_rate_mb: 10000
  # ingestion_burst_size_mb: 1000
  max_query_length: 0
  max_query_parallelism: 32

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093

# By default, Loki will send anonymous, but uniquely-identifiable usage and configuration
# analytics to Grafana Labs. These statistics are sent to https://stats.grafana.org/
#
# Statistics help us better understand how Loki is used, and they show us performance
# levels for most users. This helps us prioritize features and documentation.
# For more information on what's sent, look at
# https://github.com/grafana/loki/blob/main/pkg/usagestats/stats.go
# Refer to the buildReport method to see what goes into a report.
#
# If you would like to disable reporting, uncomment the following lines:
#analytics:
#  reporting_enabled: false
1. Try adjusting the message size:

   server:
     # 100MB
     grpc_server_max_recv_msg_size: 1.048576e+08
     grpc_server_max_send_msg_size: 1.048576e+08

2. Check your container metrics and see if you observe CPU or memory pressure.

3. Check your logs and see if your Loki container is producing any error logs.
#1 worked. Do you know why? Before, I didn’t see any memory pressure or errors in the log other than the same one from above.
It’s essentially the configuration for the maximum message size used when Loki components communicate with each other.
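If you want to see the related knobs side by side, they sit in the server block, with a per-query timeout available in limits_config; the values below are illustrative examples, not recommendations:

server:
  # maximum gRPC message size for traffic between Loki components (100MB, written out)
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600
  # HTTP timeouts for long-running queries (example values)
  http_server_read_timeout: 300s
  http_server_write_timeout: 300s

limits_config:
  # upper bound on how long a single query may run (example value)
  query_timeout: 300s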
I now run into a different issue when pulling back large amounts of data. I upped max_entries_limit_per_query to 9,999,999, but it seems the message size eventually taps out. Can I just keep increasing the max message size and timeout to arbitrarily large numbers, or is what you gave me the magic maximum? Is there a better way around this problem, like multiple messages or something?
How large is large?
If you are querying a lot of data, you should consider scaling Loki to multiple containers / instances.
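One thing worth trying before scaling out is letting the query frontend split a long time-range query into smaller subqueries that run in parallel; a sketch with example values (tune to your data):

limits_config:
  # break one long query into subqueries covering this interval each
  split_queries_by_interval: 1h
  # how many of those subqueries may run at the same time
  max_query_parallelism: 32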
We are more concerned with trends and past occurrences, meaning many months.
We ultimately decided to have NLog insert directly into Postgres and have Grafana read that instead. With Promtail/Loki we couldn’t get enough granularity in the database, with records mostly being a log line and a timestamp, so queries going back that far had to return a ton of results so that more filtering could be done in Grafana.
I know we are moving a bit away from the live-look observability these tools were intended for. It’s just easier to write semi-elegant and/or aggregation queries in SQL than in LogQL. But we still like the alerting and visualizations in Grafana.
(this is mostly a note for future devs running into these problems)