Hello,
we recently upgraded to Grafana 9.3 and have now moved to 9.5.2.
In both versions, 9.3.x and 9.5.2 we see the same behaviour:
- We repeatedly receive alerts containing [no value] and grafana_state_reason Error
- We provision all dashboards and alerts via file configuration. For all alerts we set executionErrorState: OK (see snippet below)
- We still regularly receive the [no value] alerts.
Alert snippet:
dashboardUid: 9jJFBRvmz
panelId: 86
noDataState: OK
executionErrorState: OK
annotations:
...
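For orientation, here is a minimal sketch of where these fields sit in a file-provisioned alert rule. Only dashboardUid, panelId, noDataState and executionErrorState are taken from our setup. Every other name and value (group name, folder, interval, rule uid, title, refId, datasource uid, for) is a placeholder made up for illustration, and the query model is left out:

```yaml
apiVersion: 1
groups:
  - orgId: 1
    name: example_group            # placeholder, not our real group name
    folder: example_folder         # placeholder
    interval: 60s                  # placeholder evaluation interval
    rules:
      - uid: example_rule_uid      # placeholder
        title: Example rule        # placeholder
        condition: A               # placeholder refId of the condition
        data:
          - refId: A
            datasourceUid: "PLACEHOLDER_PROMETHEUS_UID"  # placeholder
            model: {}              # query model omitted in this sketch
        dashboardUid: 9jJFBRvmz
        panelId: 86
        noDataState: OK            # state to use when the query returns no data
        executionErrorState: OK    # state to use when evaluation errors or times out
        for: 60s                   # placeholder
```

Both noDataState and executionErrorState are set per rule, so every rule in the file needs them.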
The data source is a hosted Prometheus, and for Grafana's own configuration we use a hosted PostgreSQL database.
We expected that with “executionErrorState: OK” set, these timeouts would be ignored.
Instead, we receive a flood of alerts containing the [no value] messages (see screenshot).
Since we had issues with SQLite on the Kubernetes Persistent Volume, we also switched Grafana to a hosted PostgreSQL database for its configuration.
It seems that it is not the query to the Prometheus database that fails, but the query that updates the alert state:
A little later we see "Detected stale state entry".
Even so, this should not trigger an alert, should it?
How can I configure Grafana so that a failed query to PostgreSQL does not trigger alerts?