Hi bros,
I’m following the doc "Enable alerting high availability" (Grafana Cloud documentation).
I set everything up successfully in Kubernetes, but the 2 Grafana instances still send 2 alerts in total for one incident. According to the describe command, the headless service is connected to both pods (endpoints).
Using port forwarding, I can curl port 9094 on a single pod and it works, but through the headless service it does not. Inside the pod, port 9094 is listening. In the doc, the setting is:
ha_peers = "grafana-alerting.grafana:9094" # svc name and namespace?
mine: ha_peers = "grafana-alert:9094" # only the svc name; I don't think it matters.
I’ve been searching for 4 days but still have no clue what’s going wrong.
Here is some of the config.
service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: {{ include "roma-grafana.fullname" . }}
  labels:
    {{- include "roma-grafana.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "roma-grafana.selectorLabels" . | nindent 4 }}
---
apiVersion: v1
kind: Service
metadata:
  name: grafana-alert
  namespace: wap-roma-dev
  labels:
    app.kubernetes.io/name: grafana-alert
    app.kubernetes.io/part-of: grafana
spec:
  type: ClusterIP
  {{/* clusterIP: 'None'*/}}
  ports:
    - port: 9094
      targetPort: 9094
      protocol: TCP
      name: grafana-alert
  selector:
    {{- include "roma-grafana.selectorLabels" . | nindent 4 }}
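For comparison, a headless variant of the grafana-alert Service might look like the sketch below. In the manifest as posted, the `clusterIP: 'None'` line is wrapped in a template comment, and the describe output shows a cluster IP (172.30.8.64), so the Service as deployed looks like a regular ClusterIP Service rather than a headless one. The port names `gossip-tcp` and `gossip-udp` are hypothetical, and the UDP entry is an assumption based on the netstat output, which shows the pod bound to 9094 on both TCP and UDP.

```yaml
# Sketch only: headless variant of the grafana-alert Service.
apiVersion: v1
kind: Service
metadata:
  name: grafana-alert
  namespace: wap-roma-dev
  labels:
    app.kubernetes.io/name: grafana-alert
    app.kubernetes.io/part-of: grafana
spec:
  type: ClusterIP
  clusterIP: None        # headless: DNS returns the pod IPs instead of one virtual IP
  ports:
    - name: gossip-tcp   # hypothetical name
      port: 9094
      targetPort: 9094
      protocol: TCP
    - name: gossip-udp   # hypothetical name; assumption based on the UDP listener in netstat
      port: 9094
      targetPort: 9094
      protocol: UDP
  selector:
    {{- include "roma-grafana.selectorLabels" . | nindent 4 }}
```

With a headless Service, a DNS lookup of grafana-alert from inside the namespace should return the two pod IPs listed under Endpoints rather than 172.30.8.64. Since clusterIP cannot be changed on an existing Service, switching would likely mean deleting and recreating it.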
part of deployment.yml:
containers:
  - name: {{ .Chart.Name }}
    image: "{{ .Values.global.dockerRegistry }}/{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
    env:
      - name: POD_IP
        valueFrom:
          fieldRef:
            fieldPath: status.podIP
    command:
      - /bin/sh
      - -exc
      - |
        echo "Initializing grafana.ini"
        echo "Pod ip: ${POD_IP}"
        cp /etc/grafana/grafana.initemp /etc/grafana/grafana.ini
        sed -i -e "s/POD_IP_PARAM/${POD_IP}/g" /etc/grafana/grafana.ini
        sh /run.sh
    imagePullPolicy: {{ .Values.image.pullPolicy }}
    ports:
      - name: http
        containerPort: {{ .Values.service.port }}
        protocol: TCP
      - name: grafana-alert
        containerPort: 9094
        protocol: TCP
part of configmap:
[unified_alerting]
disabled_orgs =
admin_config_poll_interval = 60s
alertmanager_config_poll_interval = 60s
enabled = true
ha_listen_address = "POD_IP_PARAM:9094"
ha_peers = "grafana-alert:9094"
ha_advertise_address = "POD_IP_PARAM:9094"
describe svc
Name:              grafana-alert
Namespace:         wap-roma-dev
Labels:            app.kubernetes.io/managed-by=Helm
                   app.kubernetes.io/name=grafana-alert
                   app.kubernetes.io/part-of=grafana
Annotations:       meta.helm.sh/release-name: roma-grafana
                   meta.helm.sh/release-namespace: wap-roma-dev
Selector:          app.kubernetes.io/instance=roma-grafana,app.kubernetes.io/name=roma-grafana
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                172.30.8.64
IPs:               172.30.8.64
Port:              grafana-alert  9094/TCP
TargetPort:        9094/TCP
Endpoints:         172.18.66.176:9094,172.18.67.253:9094
Session Affinity:  None
Events:            <none>
get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE     IP              NODE                                        NOMINATED NODE   READINESS GATES
roma-grafana-58cd7d7456-whdc4   1/1     Running   0          4m18s   172.18.67.253   k8snode3.wsjcint-gen-a.int.infra.xxxx.com   <none>           <none>
roma-grafana-58cd7d7456-z4nv9   1/1     Running   0          4m18s   172.18.66.176   k8snode7.wsjcint-gen-a.int.infra.xxxx.com   <none>           <none>
netstat -tuln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 172.18.67.253:9094      0.0.0.0:*               LISTEN
tcp        0      0 :::3000                 :::*                    LISTEN
udp        0      0 172.18.67.253:9094      0.0.0.0:*