Grafana browser query timeout

Hi,

Recently I upgraded Grafana from 9.3.XX (can’t remember) to 9.5.1. After doing this I’m seeing repeatable panel timeouts on our dashboard machine.

The dashboard display machine is an older device running Lubuntu 22.04.2 LTS, using chromium-browser v113.0.5672.126 to display 2 dashboards on 2 monitors, each refreshing every second. That's its sole purpose, and otherwise it seems to be running well.

Other, newer machines are also exhibiting the same issue, and it makes no difference whether the clients are wired or on WiFi.

The server is Ubuntu 22.04.2 LTS running Grafana 9.5.3 (updated to see if any fixes helped) with InfluxDB 2.7.1 data sources.

I see the following errors in the logs:

Although the dashboard refresh is very frequent, the requests seem to be completing quickly:

Jun 13 12:11:33 SERVERNAME grafana[31534]: logger=context userId=0 orgId=1 uname= t=2023-06-13T12:11:33.858580458Z level=info msg="Request Completed" method=POST path=/api/ds/query status=400 remote_addr=IPADDRESS time_ms=4 duration=4.769373ms size=432 referer="http://SERVERNAME:3000/d/WurYZcRgt/DASHBOARDNAME?kiosk=tv&orgId=1&refresh=1s" handler=/api/ds/query

Refreshing the web browser on the client, or restarting the grafana-server service, fixes everything for a while, but not for long.

Is there a client timeout setting that I can play with to see if it helps?
Is there a way to see if a specific panel on the dashboard is causing an issue?
Anything else I can try?
Is there a way to downgrade?

I would start with the query or queries.

Can you post them here?
Also do you have tags associated with your measurements?
Can you also post the schema of your buckets?

There are 10 panels on this specific dashboard; the sanitised queries are below:

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "GasDetection")
  |> filter(fn: (r) => r["Gas"] == "ProcessGas")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Thermocouple")
  |> filter(fn: (r) => r["Sensor"] == "01" or r["Sensor"] == "04" or r["Sensor"] == "07")
  |> filter(fn: (r) => r["_field"] == "value")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Thermocouple")
  |> filter(fn: (r) => r["Sensor"] == "03" or r["Sensor"] == "06")
  |> filter(fn: (r) => r["_field"] == "value")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Breather")
  |> filter(fn: (r) => r["_field"] == "value")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)

from(bucket: v.bucket)
  |> range(start: -3s, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "GasDetectionState")
  |> filter(fn: (r) => r["_field"] == "value")
  |> last()

from(bucket: v.bucket)
  |> range(start: -5s, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "Thermocouple")
  |> distinct(column: "Sensor")
  |> group()
  |> count()

from(bucket: v.bucket)
  |> range(start: -3s, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["Metric"] == "Alarm_info")
  |> last()

from(bucket: v.bucket)
  |> range(start: -3s, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "GasDetection")
  |> filter(fn: (r) => r["Gas"] == "ProcessGas")
  |> last()

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["Metric"] == "fMeasure")
  |> filter(fn: (r) => r["_field"] == "value")

from(bucket: v.bucket)
  |> range(start: -3s, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["Metric"] == "Temperature")
  |> last()

Those are extracted in order, working from top left to bottom left, then top right to bottom right, on this dashboard:

I don't know how to check the tags or list the schema yet, so I'll have to look into that and reply back.
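
A minimal Flux sketch for this, assuming the standard influxdata/influxdb/schema package (BUCKETNAME is a placeholder and Thermocouple is just one of the measurements from the queries above), looks something like the following, run one statement at a time:

import "influxdata/influxdb/schema"

// List all measurements in the bucket
schema.measurements(bucket: "BUCKETNAME")

// List the tag keys for one measurement
schema.measurementTagKeys(bucket: "BUCKETNAME", measurement: "Thermocouple")

// List the field keys for one measurement
schema.measurementFieldKeys(bucket: "BUCKETNAME", measurement: "Thermocouple")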

As far as tags go, this is a very important read.

Quoting from the article above:

InfluxDB lets you specify fields and tags, both being key/value pairs where the difference is that tags are automatically indexed.

Because fields are not being indexed at all, on every query where InfluxDB is asked to find a specified field, it needs to sequentially scan every value of the field column
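
As a concrete sketch against one of the queries above (assuming Sensor is a tag on the Thermocouple measurement and value is one of its fields), the difference looks roughly like this:

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Thermocouple")
  // Tag filter: "Sensor" is a tag, so this can be resolved from the index
  |> filter(fn: (r) => r["Sensor"] == "01")
  // Field-value filter: field values are not indexed, so this means
  // scanning every value that survived the filters above
  |> filter(fn: (r) => r["_value"] > 100.0)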

@yosiasz Slowly working through your questions… I don't know of a better way of doing this, but here are the buckets/measurements and tags in use:

Does that help?

A quick update: I've just upgraded to v10 and these problems are better, but still not resolved…

Hi @olliecampbell

I just checked the GitHub issue you referenced and saw that the developer mentioned the issue will be completely fixed in v10.0.1 (ETA 22 June).

I've installed the latest update and this is still an issue: panel timeouts that gradually spread across the dashboard.

I see.

In that case please reopen the bug.

Also, I'm not sure if you made a typo on the GitHub issue page, as it reads:

I can confirm v10.0.1 stops this error appearing.

If so, please edit your comment so that the developer is also aware of it.

Hi Usman,

The GitHub page is correct: I no longer see the errors in the log, but that is a separate issue that I originally thought might be linked.

This browser timeout is something different.

I will raise a bug report on GitHub.

I have recently updated the dashboard computer so that it's running Chromium v114.0.5735.198, and things look to have improved. I'll report back in a day or two.
