Prometheus and Grafana Best Practice

We have a number of different services in Kubernetes that we would like to set up monitoring for. I’ve found that a lot of Helm charts have options to install Prometheus and Grafana pre-configured for that service, which makes setting up monitoring easier. However, this approach also means you end up with multiple copies of Prometheus and Grafana running.

My question is: within a Kubernetes cluster, what is the best practice for installing Prometheus and Grafana?

  1. Should there be a single installation of Prometheus and Grafana that all other services register with, or

  2. Should each service be allowed to install its own instance of Prometheus and Grafana?

My initial thought is to use a single installation that all other services register their metrics endpoints with, but I’m not sure if this is correct. Are there limitations of a single installation that I need to be aware of?
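For the "services register with a central Prometheus" model, one common mechanism is the Prometheus Operator's `ServiceMonitor` resource: each service ships a small CRD that tells the one cluster-wide Prometheus what to scrape. A minimal sketch, assuming the Operator (e.g. via kube-prometheus-stack) is installed; all names and labels below are illustrative:

```yaml
# Hypothetical ServiceMonitor: points a single, cluster-wide Prometheus
# (managed by the Prometheus Operator) at a service's metrics endpoint.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-service
  namespace: monitoring
  labels:
    release: prometheus     # must match the operator's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-service       # matches the Service's labels (not the Pods')
  namespaceSelector:
    matchNames:
      - default
  endpoints:
    - port: http-metrics    # a named port on the Service
      path: /metrics
      interval: 30s
```

Each team can ship a `ServiceMonitor` alongside their chart while the cluster still runs only one Prometheus.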

  1. For a small system, a single installation is enough. If you need redundancy, check out Thanos for the Prometheus installation, and likewise for Grafana. In my experience, a single Grafana installation is enough for more than 500 physical machines in my cluster; I also take a daily dump of the Grafana MySQL database as a backup.

  2. Please see #1:grinning:

The main limitation is on the Prometheus side: you need to keep an eye on disk usage, and you can build a dashboard to monitor it. Check the default storage location, /var/lib/prometheus, or whatever location is set in the config.
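For such a disk-usage dashboard, a couple of queries along these lines could work (the mountpoint is an assumption; adjust it to wherever your Prometheus data directory actually lives):

```promql
# Bytes currently used by the Prometheus TSDB (Prometheus's own metric)
prometheus_tsdb_storage_blocks_bytes

# Fraction of the backing volume in use, via node_exporter;
# the mountpoint label is illustrative
1 - node_filesystem_avail_bytes{mountpoint="/var/lib/prometheus"}
      / node_filesystem_size_bytes{mountpoint="/var/lib/prometheus"}
```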
For Grafana, if fewer than 100 concurrent users connect to it, a single instance with adequate memory and CPU is all you need. I suggest using MySQL as the Grafana backend.
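Switching Grafana from its default embedded SQLite to MySQL is a small change in the `[database]` section of grafana.ini (or the equivalent environment variables). A sketch, with placeholder host and credentials:

```ini
; grafana.ini — point Grafana at an external MySQL database instead of
; the default embedded SQLite. Host, name, user, and password are placeholders.
[database]
type = mysql
host = mysql.monitoring.svc:3306
name = grafana
user = grafana
password = changeme
```

With the dashboard state in MySQL, the daily backup mentioned above is just a regular `mysqldump` of that database.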

Regards,
Fadjar Tandabawana

Thanks Fadjar for the feedback.

Our K8s cluster is locally hosted and I can’t see us having more than 10 worker nodes in the near future, so it sounds like a single instance of Prometheus and Grafana will be adequate.

I’ll look into the Prometheus storage - I assume the disk space you need is a function of how many metrics you plan to scrape?

Indeed…
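For a rough estimate, the Prometheus storage documentation gives a sizing formula: needed disk ≈ retention time × ingested samples per second × bytes per sample, with Prometheus averaging roughly one to two bytes per sample on disk. A small sketch with illustrative numbers (the per-node series count and scrape interval below are assumptions, not measurements):

```python
# Rough Prometheus TSDB sizing, following the formula from the
# Prometheus storage docs:
#   needed_disk = retention_seconds * ingested_samples_per_second * bytes_per_sample

def estimate_disk_bytes(retention_days: int,
                        samples_per_second: float,
                        bytes_per_sample: float = 2.0) -> float:
    """Estimate disk needed; Prometheus averages ~1-2 bytes per sample."""
    return retention_days * 24 * 3600 * samples_per_second * bytes_per_sample

# Example: ~10 nodes exposing ~1,000 series each, scraped every 15s
samples_per_second = 10 * 1000 / 15      # ~667 samples/s
needed = estimate_disk_bytes(90, samples_per_second)
print(f"~{needed / 1e9:.0f} GB for 90 days of retention")
```

So a small cluster like this lands in the tens of gigabytes for 90 days, which matches the dashboard below.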

Below is my Prometheus dashboard for 90 days in the small cluster.

Thanks Fadjar - that gives me a good place to start with the initial storage size :+1: