Grafana Dashboard results doubled due to 2 Prometheus replicas

  • What Grafana version and what operating system are you using?
    Grafana v10.0.3 on a Kubernetes cluster

  • What are you trying to achieve?
    Build a dashboard with some basic information about the cluster, like node count, pod count…

  • How are you trying to achieve it?
    sum(cluster:master_nodes)

  • What happened?
    The query showed 6 master nodes.

  • What did you expect to happen?
    I expected it to show 3 master nodes.

  • Did you receive any errors in the Grafana UI or in related logs? If so, please tell us exactly what they were.
    I didn't receive any errors.

I looked into the query responses and saw that I get the same data from two Prometheus replicas. Is there an option to consolidate the data if it is the same but comes from two replicas? I did some googling and found how to do it if you have two separate data sources, e.g. Prometheus and InfluxDB, but in my case it counts as one data source in Grafana, since it is a single Prometheus that has two replicas for HA.
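
One thing I am considering: if the two copies of each series differ only in a label that identifies the replica (kube-prometheus-style HA setups usually add an external label such as prometheus_replica; the label name is an assumption on my side, so the labels on the duplicated series would need to be checked first), the duplicates could be collapsed in the query before summing, roughly like this:

    # Sketch: collapse the assumed replica label first, then sum.
    # "prometheus_replica" is an assumed label name; replace it with whatever
    # label actually distinguishes the two replicas.
    sum(max without (prometheus_replica) (cluster:master_nodes))

Is that the right direction, or is there a cleaner way to deduplicate this on the datasource side (for example, a deduplicating query layer such as Thanos Query in front of the two replicas)?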

Cheers
Marcel

Hi everyone,

I’m also encountering this issue after increasing the Prometheus replicas to 2 for high availability (HA). This is causing the following metrics to display incorrect values, effectively doubling them:

  • cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits
  • cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests
  • kube_node_status_capacity
  • kube_node_status_allocatable
  • Any metric with the instance label

Problem:

The main issue is that each metric appears to be logged twice, causing doubled results when summed. For example, metrics like node capacity and allocation display duplicate values for the same nodes.

Example:
The table format shows that there are multiple entries for the same node, each with identical capacity values. See the tables below:

    node                  instance              Capacity
    node-01.example.com   [1234:abcd::1]:9090   128 GiB
    node-01.example.com   [1234:abcd::2]:9090   128 GiB

Here’s an example confirming that different instance values actually correspond to the same node UUID:

    node                  instance              system_uuid
    node-01.example.com   [1234:abcd::1]:9090   uuid-1234-abcd-5678-efgh
    node-01.example.com   [1234:abcd::2]:9090   uuid-1234-abcd-5678-efgh
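
A quick way to double-check that the duplication really is per instance is a query along these lines, which counts how many distinct instance values each node exposes for one of the affected metrics (a sketch; swap in whichever metric you are inspecting):

    # Sketch: count the (node, instance) combinations per node.
    # A result of 2 for every node means each node is reported twice.
    count by (node) (count by (node, instance) (kube_node_status_capacity))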

Cause:

From my investigation, it seems that each node in the cluster reports the same two instance values as every other node. These instance values change over time but remain identical across the cluster at any given time. The metrics are correct when only one instance is present, so this duplication from both Prometheus replicas seems to be at the root of the issue.
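
Until the root cause is clear, one workaround I am considering is collapsing the instance dimension inside each query, so every node only contributes one series (a sketch of the general pattern, not a confirmed fix for this setup):

    # Sketch: keep every label except "instance", so the two duplicate
    # series per node collapse into one before any sum() is applied on top.
    max without (instance) (kube_node_status_capacity)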

Questions:

If anyone has dealt with this before, I would appreciate any guidance on:

  • Understanding the origin of the instance label in these duplicated metrics
  • Identifying why each node is reported with multiple instance values

Thanks in advance for any help!

I have the same problem. Have you found any solution for this issue?