Error "database is locked" on alert instance

Hi,

I am using Grafana 8.0.3 (upgraded from 7.x last week), deployed from the official Docker image with an external data volume on an ext4 filesystem.

I created alerts (ngalert) for our system. Everything looked fine at the beginning, but then I started getting alerts from error reports, as below.

The number is the creation sequence of the rules; I started getting the “database locked” error from the 8th rule onward. After that, every new rule I create gets this error after running for a while.

The rules actually still work properly: I can still see the state changes and receive the alerts. If I set the state to OK on execution error or timeout, the rule keeps working.

t=2021-07-05T11:55:25+0800 lvl=eror msg="failed to fetch alert rule" logger=ngalert key="{orgID: 1, UID: 3YW4mjz7k}"
t=2021-07-05T11:55:26+0800 lvl=eror msg="failed to fetch alert rule" logger=ngalert key="{orgID: 1, UID: ESFGWCknk}"
t=2021-07-05T11:55:27+0800 lvl=info msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/api/live/ws status=400 remote_addr=1.163.110.115 time_ms=7065 size=12 referer=
t=2021-07-05T11:55:29+0800 lvl=eror msg="Anonymous access organization error: 'company': database is locked"
t=2021-07-05T11:55:29+0800 lvl=info msg="Request Completed" logger=context userId=0 orgId=0 uname= method=GET path=/api/live/ws status=401 remote_addr=220.133.186.239 time_ms=5005 size=26 referer=
t=2021-07-05T11:55:32+0800 lvl=eror msg="Failed to look up user based on cookie" logger=context error="database is locked"
t=2021-07-05T11:55:34+0800 lvl=eror msg="Anonymous access organization error: 'company': database is locked"
t=2021-07-05T11:55:34+0800 lvl=info msg="Request Completed" logger=context userId=0 orgId=0 uname= method=GET path=/api/live/ws status=401 remote_addr=220.133.186.239 time_ms=5005 size=26 referer=
t=2021-07-05T11:55:34+0800 lvl=info msg="Request Completed" logger=context userId=0 orgId=1 uname= method=GET path=/api/live/ws status=400 remote_addr=220.133.186.239 time_ms=830 size=12 referer=
t=2021-07-05T11:55:35+0800 lvl=info msg="Request Completed" logger=context userId=0 orgId=1 uname= method=GET path=/api/live/ws status=400 remote_addr=220.133.186.239 time_ms=0 size=12 referer=

The log also shows some “failed to fetch alert rule” errors. The Grafana UI showed errors occasionally too, but it normally worked fine after a reload.

It seems to be caused by heavy load on my HDD. In my original design I put the Grafana data volume and the database on the same HDD (the data drive). After I moved the Grafana data volume to another drive, the “database locked” error no longer happens.
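
For anyone wondering why a busy disk would produce this message, here is a small sketch of what I think is going on. It assumes Grafana is running on its default embedded SQLite database, which I have not stated above, and it does not use any Grafana code. The function name try_write_while_locked and the table t are made up for the demo. One connection holds the write lock, standing in for a write that is slow because the disk is busy, while a second connection with a short busy timeout tries to write.

```python
import os
import sqlite3
import tempfile


def try_write_while_locked(timeout_s: float) -> str:
    """Hold the write lock on one connection and try to write on another.

    Returns "ok" if the second write succeeds, otherwise the SQLite error text.
    Hypothetical helper for illustration only.
    """
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None: no implicit transactions, we issue BEGIN ourselves.
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("CREATE TABLE t (x INTEGER)")
    holder.execute("BEGIN IMMEDIATE")  # take the write lock and keep it
    holder.execute("INSERT INTO t VALUES (1)")

    # Second connection waits at most timeout_s seconds for the lock.
    other = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
    try:
        other.execute("INSERT INTO t VALUES (2)")
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        other.close()
        holder.execute("ROLLBACK")  # release the write lock
        holder.close()


if __name__ == "__main__":
    print(try_write_while_locked(0.1))
```

If this assumption is right, anything that makes write transactions take longer than the busy timeout (such as an overloaded HDD) would surface as this error, which fits with the error going away after moving the data volume to another drive.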
