The short version: my logs query usage has been climbing steadily for about 40 hours, and I can’t figure out what’s actually running the queries. grafanacloud_logs_instance_query_overage has been rising at a very consistent ~7.5/hr since roughly June 20, 22:00 UTC, and it’s now around 312. That’s strange, because I only ingest about 59 MiB of logs total, so the overage has to be coming almost entirely from query volume rather than from the amount of data stored.
What’s got me stuck is that I can’t find the queries. In the usage-insights request log ({instance_id="1611752"} |= "request timings"), the only thing hitting the instance since about June 22, 03:47 UTC is my three alert rules, each doing quick instant queries every 60s. (Before that there were some query_range queries under my own username, but they stop in the log at 03:47, and the usage kept right on climbing afterward, so they don’t seem to explain it.) Meanwhile, grafanacloud_logs_instance_query_bytes:rate5m on the querier is still sitting around 700–800M right now. So a couple of metrics say something is scanning a lot of data, but I can’t find any matching query in the log, and usage kept climbing all night while my laptop was asleep.
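As a rough sanity check on why the alert rules alone don’t seem to account for the querier metric, here is a back-of-the-envelope sketch in Python. It rests on two assumptions that are not confirmed anywhere above: that the rate5m metric is in bytes per second, and that in the worst case every alert evaluation scans all 59 MiB of ingested logs.

```python
# Back-of-the-envelope check using the figures quoted in this post.
MIB = 1024 ** 2

total_ingested_bytes = 59 * MIB   # total logs ingested
alert_rules = 3                   # alert rules running instant queries
evals_per_hour = 60               # one evaluation every 60s

# Assumption: worst case, every evaluation scans ALL ingested data.
alert_upper_bound_per_hour = alert_rules * evals_per_hour * total_ingested_bytes

# Assumption: the rate5m metric is bytes per second; 750M is the
# midpoint of the observed 700-800M range.
querier_bytes_per_second = 750e6
querier_bytes_per_hour = querier_bytes_per_second * 3600

ratio = querier_bytes_per_hour / alert_upper_bound_per_hour

print(f"alert rules, worst case: {alert_upper_bound_per_hour / 1024**3:.1f} GiB/hr")
print(f"querier metric implies:  {querier_bytes_per_hour / 1024**4:.2f} TiB/hr")
print(f"gap:                     roughly {ratio:.0f}x")
```

If those assumptions hold, the metric implies a couple of hundred times more scanned bytes per hour than the three alert rules could produce even in the worst case, which is why the alert rules don’t look like the explanation. If the metric’s unit is something else, the size of the gap changes accordingly.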
I did look at the Loki Query Fair Usage dashboard, but nothing really jumped out. The top “bytes-by-source” result was “grafana-alerts”, but every source was showing in the tens of MiB, so that doesn’t seem like it could be the cause.
I’ve gone through everything I can think of on my end: I revoked all my sessions, checked my access-policy tokens (the only one is write-only, so it can’t query), and confirmed I have no service accounts and no dashboards or recording rules pointing at the logs. I’m honestly out of ideas. Any suggestions for how to keep triaging this would be very helpful!