Server not starting - could not load SSL certificate

Hi,

my Grafana installation was running fine - till today :frowning_face:

  • What Grafana version and what operating system are you using?
    Grafana version 11.0.0 / Ubuntu 22.04.4 LTS

Today I restarted the server after doing some updates. To be honest I’m not sure if Grafana received an update, too.
After the reboot I noticed that Grafana was not running.
If I try to start it manually with

sudo systemctl start grafana-server

nothing happens. The bash prompt is not coming back, it seems the command runs forever.

When I look into the logs (/var/log/grafana/grafana.log) the last entry is from some hours ago, about an issue with the TLS certificate. I guess (though I’m not 100% sure) that was when I restarted the server.

logger=server t=2024-06-04T13:40:19.93128066+02:00 level=error msg="Stopped background service" service=*api.HTTPServer reason="could not load SSL certificate: open /etc/grafana/grafana.crt: permission denied"

Not sure why that is. It worked before, I haven’t changed anything with the certificate files.
Anyhow, although I rebooted the server again and tried to start Grafana several times, there are no new entries in the log.
Same with the logs from journalctl.
Last entry at the same time as in the logs (probably no surprise), same message.

 Error: ✗ *api.HTTPServer run error: could not load SSL certificate: open /etc/grafana/grafana.crt: >
Jun 04 13:40:19 xxx systemd[1]: grafana-server.service: Main process exited, code=exited, status=1/FAILURE
Jun 04 13:40:19 xxx systemd[1]: grafana-server.service: Failed with result 'exit-code'.
Jun 04 13:40:20 xxx systemd[1]: grafana-server.service: Scheduled restart job, restart counter is at 5.
Jun 04 13:40:20 xxx systemd[1]: Stopped Grafana instance.
Jun 04 13:40:20 xxx systemd[1]: grafana-server.service: Start request repeated too quickly.
Jun 04 13:40:20 xxx.de systemd[1]: grafana-server.service: Failed with result 'exit-code'.
Jun 04 13:40:20 xxx systemd[1]: Failed to start Grafana instance.

Though I’m not sure about the certificate error I tried to work with that message.
But since I have no more entries in the logs, I’m not sure if it helps in any way.

I’m a bit lost what to do now. Is there a debug option that could help me to find the issue?
Thank you very much!
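
A possible way to get more out of systemd when the start command seems to hang, sketched as a dry run: the hypothetical helper `run` only prints each command unless `APPLY=1` is set. The unit name is taken from the commands above; everything else is an assumption about a typical systemd host. The `>` at the end of the journal line above is probably the pager cutting off long lines, which `--no-pager` avoids.

```shell
unit=grafana-server   # unit name as used in this thread

# Hypothetical helper: print the command by default, execute it only with APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Clear the failed state and the start rate limit ("Start request repeated too quickly").
run sudo systemctl reset-failed "$unit"
# Queue the start job and return to the prompt instead of waiting for it.
run sudo systemctl start --no-block "$unit"
# Show the current state with full, untruncated lines.
run systemctl status "$unit" --no-pager --full
# Show the last 50 journal entries of this boot for the unit, without the pager.
run journalctl -u "$unit" -b --no-pager -n 50
```

Running it as `APPLY=1 bash script.sh` would execute the commands for real; without `APPLY=1` it only lists them.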

do you use a cert for your grafana?

yes
I’m accessing it via https:// (at least when it is running …)

so what updates did you do on it? what version was it before 11?

Could it be a file permission issue?

To be honest - I don’t know :see_no_evil:.
I just did all updates that were available.

Is there maybe a kind of log to look that up?

which updates are you referring to? how did you do it

I meant all the updates available (something like 20 packages) for the server. Including updates not (directly) related to Grafana.
By the way - I have similar issues with InfluxDB, not sure if that information helps.

Then something in the (new?) version of Grafana must have changed. It was working with the certificates before, I haven’t changed anything in permissions for the folder or the files themselves.

The certificate files used in Grafana are actually “double links”:

/etc/grafana/grafana.crt --> /etc/letsencrypt/live/<server-name>.fullchain.pem --> /etc/letsencrypt/archive/<server-name>.fullchain1.pem

The actual folder containing the files has permissions 755 and owner root, group grafana; the files have 600, also root.grafana (I already tried grafana.grafana).

/etc/grafana/grafana.key is handled in the same way.

I’m using the same certificate files also for InfluxDB, just linked to /etc/influxdb/influxdb.crt and /etc/influxdb/influxdb.key.
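
With a link chain like this, reading the certificate needs read permission on the final file and execute (traverse) permission on every directory along the way, for the user that runs the service. A minimal sketch of how the chain could be inspected, using a throwaway directory with made-up file names so it can be run anywhere; on the real host the same `readlink`, `stat` and `namei` calls would be pointed at /etc/grafana/grafana.crt.

```shell
# Build a small stand-in for the double link in a temp dir (all paths are demo-only).
demo=$(mktemp -d)
mkdir -p "$demo/archive" "$demo/live" "$demo/grafana"
echo "dummy cert" > "$demo/archive/fullchain1.pem"
chmod 600 "$demo/archive/fullchain1.pem"
ln -s "$demo/archive/fullchain1.pem" "$demo/live/fullchain.pem"
ln -s "$demo/live/fullchain.pem" "$demo/grafana/grafana.crt"

# Resolve the whole chain down to the real file.
target=$(readlink -f "$demo/grafana/grafana.crt")

# Mode of the link itself versus mode of the file it finally points to.
link_mode=$(stat -c '%a' "$demo/grafana/grafana.crt")
target_mode=$(stat -L -c '%a' "$demo/grafana/grafana.crt")
echo "target=$target link=$link_mode file=$target_mode"

# namei (util-linux) lists owner, group and mode of every path component,
# following the links, so a directory without x for the service user stands out.
if command -v namei >/dev/null 2>&1; then
    namei -l "$demo/grafana/grafana.crt"
fi
```

If one of the directories in the output turns out to be readable by root only, the file mode alone would not be enough to explain or fix a "permission denied".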

at this point hard to tell where and how your grafana came to be in this wounded state.

copy out the data folder to another grafana instance maybe on your pc, maybe start version 9 and work your way up or down.

and document everything.

Oh, Ok, I hoped it is possible to find the root cause and fix it.
But OK, I guess I have to do it the hard way …

Thank you anyhow!

OK, so who is owner of that file and which user is running grafana?

Hi jangaraj,
thanks for your reply.
I tried to give that information in one of my previous answers.

Owner is root, group is grafana.
I guess the user grafana runs Grafana. Actually I can not check, since Grafana is currently not running :pensive:. Is there maybe a way to find that out, even if the software is not executed?
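
The service user can usually be read from the unit definition without the service running. On a systemd host, `systemctl show -p User -p Group grafana-server` or `systemctl cat grafana-server` should print it. The sketch below does the same by parsing a unit file directly; `unit_user` is a hypothetical helper, and the sample unit content is an assumption for the demo, not a copy of the packaged file.

```shell
# On the real host (works while the service is stopped):
#   systemctl show -p User -p Group grafana-server
#   systemctl cat grafana-server

# Hypothetical helper: print the last User= value found in a unit file.
unit_user() {
    sed -n 's/^User=//p' "$1" | tail -n 1
}

# Demo unit file with assumed content; replace with the path shown by "systemctl cat".
sample=$(mktemp)
printf '[Service]\nUser=grafana\nGroup=grafana\n' > "$sample"

svc_user=$(unit_user "$sample")
echo "service user: $svc_user"

# Show the user's groups if the account exists on this machine.
id "$svc_user" 2>/dev/null || echo "user $svc_user does not exist on this machine"
```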

But that is what I also had in mind. When only the owner (root) has rw rights, how can user grafana read it? Therefore I changed (at least for one of the 2 cert-files) the owner to user grafana. I had the impression that it doesn’t help.

And there is the fact that it worked with those exact rights and owner before. I checked that explicitly, because in the beginning I was concerned that the links to the cert-files grant all rights to everyone. But then I learned that links always show these rights and that what matters are the rights of the target file. Therefore I looked it up and noted it down. Hence I’m sure that nothing has changed regarding the owners and rights of the cert-files.

As a test I changed the rights of the cert-files I found in the logs to 666, but without success.

But anyhow, that is currently the only hint I have.
Since that error-message in the logs I can not find any grafana entry, though I tried a lot to start Grafana again.

What are your thoughts about that owner and rights?

Maybe another topic to look in (besides the permissions for the cert-files) is the fact that now the attempt to start Grafana (sudo systemctl start grafana-server.service) leads to nothing. The command seems to “hang”, the bash-prompt doesn’t come back.
And there are no traces from Grafana in the logs.

I guess I found a solution.
I changed the rights of both cert-files. In the end I stripped it down to ----r-----, only read rights for the group, nothing else. And it makes sense!
One only needs to read the cert-files; nobody should write to them.
Since the user grafana runs Grafana, it needs read rights.
Furthermore: since I also use the same cert-files for InfluxDB, I added during setup also the user influxdb to the group grafana. Now also InfluxDB starts again without issues.

The only question that remains for me: why did it work before with the rights I mentioned above? That was actually misleading me.
There must be “something” that was effective while the server was running but got lost, when I restarted it.

In the end it was easy: just read the error message in the log, take it literally and fix it!

Any final thoughts about the very stripped down rights for the cert-files?

Thank you very much to the three of you for taking the time to help me out!
I really appreciate that!

Are you sure? I think your LE renewal process needs write access.

That’s a very good point!
Maybe I’m falling (again) in the “works now, doesn’t work tomorrow”-trap!
So probably rw-rights for root should be better? Resulting in 640 rights?

Reminds me that I have to look into the renewal process before the certificate expires …
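
A small sketch of what 640 with owner root and group grafana would look like, and how the result could be checked. It works on a temporary stand-in file so it is safe to run; the ownership change is only attempted when run as root on a machine where the group grafana exists. The real paths in the comments are the ones from this thread.

```shell
cert=$(mktemp)   # stand-in for the real certificate file

# rw for the owner, read-only for the group, nothing for others.
chmod 640 "$cert"

# Changing owner and group needs root and an existing grafana group.
if [ "$(id -u)" -eq 0 ] && getent group grafana >/dev/null 2>&1; then
    chown root:grafana "$cert"
fi

cert_mode=$(stat -c '%a' "$cert")
echo "mode: $cert_mode, owner: $(stat -c '%U:%G' "$cert")"

# On the real host, test as the service user rather than as root:
#   sudo -u grafana test -r /etc/grafana/grafana.crt && echo readable
#   sudo -u grafana test -r /etc/grafana/grafana.key && echo readable
```

Because `chmod` on a link path changes the file the link points to, it is worth re-checking the mode after the next certificate renewal in case new files are created.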

I had the same issue when following

https://grafana.com/docs/grafana/latest/setup-grafana/set-up-https/
on Ubuntu 22.04.4 LTS

Changing the permissions to 440 (the second 4 gives read rights to the group, since grafana is the group) was enough

sudo chmod 440 /etc/grafana/grafana.crt /etc/grafana/grafana.key