Looking at your logs - it looks like you are running out of disk space:
t=2019-04-25T06:22:13+0000 lvl=eror msg="Alert Rule Result Error" logger=alerting.evalContext ruleId=48 name="Disk Space Alert () " error="Could not find datasource database is locked" changing state to=alerting
If you have enough disk space, then it looks like it may be related to your file system. (See here for an example of file system problems with SQLite.)
Yes, the systems which I am monitoring are running out of disk space, but not the Grafana host itself. It is still getting these "Could not find datasource database is locked" error messages. These messages are in some cases also visible in the Alert List visualization's recent state changes list. They are not visible in the current status list at all.
What file system are you running on? Sqlite is a file-based database so it does not work on all types of file systems (see my previous reply for an example).
If you have a lot of traffic and are doing a lot of writes to the database, then maybe you have reached the limits of SQLite and it is time to switch to MySQL or Postgres. But this is unlikely unless you have a very large number of alerts or users.
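For reference, the switch is configured in the [database] section of grafana.ini (or conf/custom.ini); the connection values below are placeholders you would replace with your own, assuming a Postgres server is already set up:

[database]
type = postgres
host = 127.0.0.1:5432
name = grafana
user = grafana
# wrap the password in triple quotes if it contains # or ;
password = """your-password"""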
Grafana runs on xfs and the database is located on ext4.
How do you define a lot of traffic?
Only two users and around 200 alerts, so to me it doesn't sound like too much.
No, that is not a lot of traffic and xfs and ext4 are standard file systems. Looking at the error messages I’m not sure if they are sqlite errors. Which datasource is returning the errors?
I noticed this on a server where Prometheus and Grafana were contending heavily for the same storage partition (it has definitely gotten worse in the 6.x series), so making sure Grafana's database is on a dedicated partition should help.
I have a Grafana 8.3.3 setup using the default SQLite config.
I run about 150 alert rules.
I’m getting a lot of those errors in the alerts panel:
could not find datasource: database is locked
Looking at the logs, I also found a lot of
msg="failed to fetch alert rule" err="database is locked"
even
msg="failed to save alert state" err="database is locked"
The system is definitely not busy. It runs in a VM on a single ext4 partition with 35 GB of free space, 7 GB of RAM with 65% free, two cores, almost idle right now.
The alert rules are scheduled with a daily interval, but I suspect they are all evaluated at the same time, so that could be 150 threads trying to access the DB at once.
Could that be the cause?
Does this mean I’m already out of scale and I should move to another DB?
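In case it makes a difference, one thing I'm planning to try before migrating is enabling SQLite's write-ahead log, which should let reads proceed while a write holds the lock. If I read the configuration docs correctly, the sqlite3 backend exposes a wal flag in the [database] section of grafana.ini; treat this as an untested sketch rather than a confirmed fix:

[database]
type = sqlite3
# Write-Ahead Logging: readers no longer block on the single writer,
# which should reduce "database is locked" errors under concurrent access
wal = true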