We run a large Loki environment, collecting ~40TB of logs per day, and users often query 24 hours of logs at a time, which can return a log volume graph reporting billions of messages. Obviously this is time-consuming and expensive, so we have our environment set up to scale up and down dynamically while users are running these queries.
One of our biggest pain points is when the Grafana “Log Volume” graph simply doesn’t load, or loads only a tiny sliver of data and never fully populates. There is no feedback to the user about why this happens, and it erodes their trust in the system.
In a recent situation where this was occurring, I dug through the logs and found that our query-frontend pods were reporting:
ts=2024-07-18T18:34:50.379647467Z caller=spanlogger.go:109 middleware=QueryShard.astMapperware level=warn msg="failed mapping AST" err="context canceled" query="sum by (level) (count_over_time({k8s_namespace_name=\"istio-gateways\"} | drop __error__[1m]))"
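(Formatted for readability, the query from that log line is the log volume query Grafana generates, assuming the label name is k8s_namespace_name and was just split by a terminal line wrap in the original paste:

sum by (level) (
  count_over_time({k8s_namespace_name="istio-gateways"} | drop __error__[1m])
)
)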
I don’t get much other data beyond that, though. What can cause these queries to fail like this? Which component would be canceling the context at that point? Could it be Grafana itself?