Trying to set up a Grafana deployment in a Kubernetes cluster

I’ve been trying to set up a blank Grafana instance in my local Kubernetes cluster. I started with my own approach using standard Kubernetes yaml files. I had some issues with read access, which I solved, but I could not get ingress working or access the pod in any way.
I then found the official documentation from Grafana Labs on setting up a Grafana instance in Kubernetes. Except for some minor additions and adjustments, like putting it in a namespace and using NFS mounts instead of PersistentVolumeClaims, I pretty much have it straight from the documentation.

Deployment yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:9.1.0
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: nfs-volume-grafana
      volumes:
        - name: nfs-volume-grafana
          nfs:
            server: 192.168.1.198
            path: /mnt/storage/k8s/grafana/data

Service yaml:

apiVersion: v1
kind: Service
metadata:
  name: svc-grafana
  namespace: monitoring
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
  selector:
    app: grafana
  sessionAffinity: None
  type: LoadBalancer

When checking the pod I get:

kubectl describe pod grafana-785b58dc67-t29ql -n monitoring
Name:             grafana-785b58dc67-t29ql
Namespace:        monitoring
Priority:         0
Service Account:  default
Node:             kube-node-02/192.168.1.202
Start Time:       Sun, 23 Apr 2023 12:28:28 +0200
Labels:           app=grafana
                  environment=prod
                  pod-template-hash=785b58dc67
Annotations:      cni.projectcalico.org/containerID: f1e2725f643f14a90ca61f76ae9cdd5725489b39d006b56cfd5327a3428e0b47
                  cni.projectcalico.org/podIP: 10.105.241.25/32
                  cni.projectcalico.org/podIPs: 10.105.241.25/32
                  kubectl.kubernetes.io/restartedAt: 2023-04-23T10:28:28Z
Status:           Running
IP:               10.105.241.25
IPs:
  IP:           10.105.241.25
Controlled By:  ReplicaSet/grafana-785b58dc67
Containers:
  grafana:
    Container ID:   containerd://4721a238d1c1aaed46c0170d6fd344ec2ae40305cca1144bb948377cb5ff0fd1
    Image:          grafana/grafana:9.1.0
    Image ID:       docker.io/grafana/grafana@sha256:3755790fae9130975b0a778ea7c61e54627550541cf90f0aa5f11fa8936468c9
    Port:           3000/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 23 Apr 2023 12:30:21 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Sun, 23 Apr 2023 12:29:20 +0200
      Finished:     Sun, 23 Apr 2023 12:30:21 +0200
    Ready:          False
    Restart Count:  2
    Requests:
      cpu:        250m
      memory:     750Mi
    Liveness:     tcp-socket :3000 delay=30s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get http://:3000/robots.txt delay=10s timeout=2s period=30s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /var/lib/grafana from nfs-volume-grafana (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-bgwqx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  nfs-volume-grafana:
    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    Server:    192.168.1.198
    Path:      /mnt/storage/k8s/grafana/data
    ReadOnly:  false
  kube-api-access-bgwqx:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  2m26s                default-scheduler  Successfully assigned monitoring/grafana-785b58dc67-t29ql to kube-node-02
  Warning  Unhealthy  36s (x6 over 116s)   kubelet            Liveness probe failed: dial tcp 10.105.241.25:3000: connect: connection refused
  Warning  Unhealthy  36s (x4 over 116s)   kubelet            Readiness probe failed: Get "http://10.105.241.25:3000/robots.txt": dial tcp 10.105.241.25:3000: connect: connection refused
  Normal   Killing    36s (x2 over 96s)    kubelet            Container grafana failed liveness probe, will be restarted
  Normal   Pulled     34s (x3 over 2m26s)  kubelet            Container image "grafana/grafana:9.1.0" already present on machine
  Normal   Created    34s (x3 over 2m26s)  kubelet            Created container grafana
  Normal   Started    34s (x3 over 2m26s)  kubelet            Started container grafana

As can be seen here, I get issues with the liveness and readiness probes, where the connection is refused on port 3000.

Warning  Unhealthy  36s (x6 over 116s)   kubelet            Liveness probe failed: dial tcp 10.105.241.25:3000: connect: connection refused
  Warning  Unhealthy  36s (x4 over 116s)   kubelet            Readiness probe failed: Get "http://10.105.241.25:3000/robots.txt": dial tcp 10.105.241.25:3000: connect: connection refused
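
For reference, the restart timing in the describe output above lines up with the liveness settings: the previous container started at 12:29:20 and finished at 12:30:21, about a minute later. A rough sketch of the arithmetic, assuming the kubelet makes its first check within one period after initialDelaySeconds and restarts the container after failureThreshold consecutive failures (the function name liveness_window is made up for illustration):

```shell
#!/bin/sh
# Approximate window, in seconds after container start, in which the kubelet
# gives up on a container whose port never opens.
# $1 = initialDelaySeconds, $2 = periodSeconds, $3 = failureThreshold
liveness_window() {
  earliest=$(( $1 + ($3 - 1) * $2 ))   # first check right at the initial delay
  latest=$(( $1 + $3 * $2 ))           # first check up to one period later
  echo "${earliest}-${latest}"
}

# Values from the deployment above: delay=30s, period=10s, #failure=3
liveness_window 30 10 3   # prints 50-60
```

So with these settings, a Grafana process that needs longer than roughly 50 to 60 seconds to open port 3000 would be restarted before it ever answers a probe.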

Also, here are the logs coming out of the pod, but there are no errors or issues there:


logger=settings t=2023-04-23T10:35:57.167950388Z level=info msg="Starting Grafana" version=9.1.0 commit=82e32447b4 branch=HEAD compiled=2022-08-16T09:30:13Z
logger=settings t=2023-04-23T10:35:57.168271631Z level=info msg="Config loaded from" file=/usr/share/grafana/conf/defaults.ini
logger=settings t=2023-04-23T10:35:57.168310148Z level=info msg="Config loaded from" file=/etc/grafana/grafana.ini
logger=settings t=2023-04-23T10:35:57.168324559Z level=info msg="Config overridden from command line" arg="default.paths.data=/var/lib/grafana"
logger=settings t=2023-04-23T10:35:57.168337404Z level=info msg="Config overridden from command line" arg="default.paths.logs=/var/log/grafana"
logger=settings t=2023-04-23T10:35:57.168349565Z level=info msg="Config overridden from command line" arg="default.paths.plugins=/var/lib/grafana/plugins"
logger=settings t=2023-04-23T10:35:57.168361666Z level=info msg="Config overridden from command line" arg="default.paths.provisioning=/etc/grafana/provisioning"
logger=settings t=2023-04-23T10:35:57.168373804Z level=info msg="Config overridden from command line" arg="default.log.mode=console"
logger=settings t=2023-04-23T10:35:57.168386169Z level=info msg="Config overridden from Environment variable" var="GF_PATHS_DATA=/var/lib/grafana"
logger=settings t=2023-04-23T10:35:57.168398469Z level=info msg="Config overridden from Environment variable" var="GF_PATHS_LOGS=/var/log/grafana"
logger=settings t=2023-04-23T10:35:57.168410747Z level=info msg="Config overridden from Environment variable" var="GF_PATHS_PLUGINS=/var/lib/grafana/plugins"
logger=settings t=2023-04-23T10:35:57.168422761Z level=info msg="Config overridden from Environment variable" var="GF_PATHS_PROVISIONING=/etc/grafana/provisioning"
logger=settings t=2023-04-23T10:35:57.168437504Z level=info msg="Path Home" path=/usr/share/grafana
logger=settings t=2023-04-23T10:35:57.168449844Z level=info msg="Path Data" path=/var/lib/grafana
logger=settings t=2023-04-23T10:35:57.168461535Z level=info msg="Path Logs" path=/var/log/grafana
logger=settings t=2023-04-23T10:35:57.168473125Z level=info msg="Path Plugins" path=/var/lib/grafana/plugins
logger=settings t=2023-04-23T10:35:57.16848504Z level=info msg="Path Provisioning" path=/etc/grafana/provisioning
logger=settings t=2023-04-23T10:35:57.168496923Z level=info msg="App mode production"
logger=sqlstore t=2023-04-23T10:35:57.168674711Z level=info msg="Connecting to DB" dbtype=sqlite3
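
The log stops at "Connecting to DB" with no error, and the container is killed about a minute after it starts, so one thing worth testing is whether Grafana is just slow to initialise its sqlite3 database on the NFS mount. This is only a guess, not a confirmed cause. A minimal sketch that raises the liveness delay with a JSON patch, assuming the deployment name and namespace from the yaml above; the function name relax_liveness and the 300 second value are arbitrary choices for the experiment:

```shell
#!/bin/sh
# Raise livenessProbe.initialDelaySeconds on the first container of a deployment.
# $1 = deployment name, $2 = namespace, $3 = new delay in seconds
# Uses a JSON patch "replace", so the field must already be set (it is above).
relax_liveness() {
  kubectl patch deployment "$1" -n "$2" --type=json \
    -p "[{\"op\":\"replace\",\"path\":\"/spec/template/spec/containers/0/livenessProbe/initialDelaySeconds\",\"value\":$3}]"
}

# Usage (needs cluster access):
#   relax_liveness grafana monitoring 300
#   kubectl logs -f deployment/grafana -n monitoring
```

If the log then continues past "Connecting to DB" after a while, the probes were too strict for this storage; if it stays stuck, the problem is more likely the database file on the mount itself.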

When checking the service I get:

kubectl describe service svc-grafana -n monitoring
Name:                     svc-grafana
Namespace:                monitoring
Labels:                   app.kubernetes.io/instance=grafana
Annotations:              <none>
Selector:                 app=grafana
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.107.217.240
IPs:                      10.107.217.240
LoadBalancer Ingress:     192.168.1.228
Port:                     <unset>  3000/TCP
TargetPort:               http-grafana/TCP
NodePort:                 <unset>  30036/TCP
Endpoints:
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

I have tried to connect with http://192.168.1.228:30036/ on the node IP, but that won't work. Port 3000 won't work either, but I don't expect that since it is internal to the Kubernetes cluster.
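
Two details in the service output above may be relevant here. The Endpoints field is empty, which is expected while the pod is not Ready, so the service has nothing to route to on any port. Also, for a LoadBalancer service the LoadBalancer Ingress IP (192.168.1.228) normally answers on the service port 3000, while the NodePort 30036 is opened on the node IPs (for example 192.168.1.202 for kube-node-02). A small sketch for checking this, assuming the names from the yaml above; service_endpoints is a hypothetical helper name:

```shell
#!/bin/sh
# Print the pod IPs currently behind a service.
# $1 = service name, $2 = namespace
# Empty output means no pod is Ready, so no external address can work yet.
service_endpoints() {
  kubectl get endpoints "$1" -n "$2" -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Usage (needs cluster access):
#   service_endpoints svc-grafana monitoring
# Once an IP shows up, both of these should be worth trying:
#   curl -I http://192.168.1.228:3000/     # LoadBalancer IP + service port
#   curl -I http://192.168.1.202:30036/    # node IP + NodePort
```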

I have also tried using kubectl port-forward service/svc-grafana 3000:3000 -n monitoring from my control plane and then connecting to http://{control plane IP}:3000/, but with the same result: I am not getting through. I assume it all comes back to the liveness probe issue. Since the probe is not even getting through, it’s no wonder I cannot access the instance.
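
One detail that may matter for the port-forward attempt: by default kubectl port-forward listens only on localhost of the machine where it runs, so a connection to the control plane IP from another machine is refused unless --address is given. Forwarding to a service also needs a ready pod behind it, so this can only work once the probes pass. A sketch, assuming the service and namespace from above; forward_grafana is a hypothetical helper name:

```shell
#!/bin/sh
# Forward a local port to the same port of a service, listening on all interfaces
# so that other machines on the network can reach it.
# $1 = service name, $2 = namespace, $3 = port
forward_grafana() {
  kubectl port-forward "service/$1" "$3:$3" -n "$2" --address 0.0.0.0
}

# Usage (needs cluster access, runs until interrupted):
#   forward_grafana svc-grafana monitoring 3000
```

Listening on 0.0.0.0 exposes the port to the whole network, so this is meant for a quick test rather than permanent access.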

Hi @azazel666,

Thanks for opening this post.

Are you using Minikube, KinD, or any other flavour?

I'm using a 3-node setup, all virtual servers running in my on-premise server rack. All nodes run Ubuntu Server 22.04.

Thanks, let me try this on my local machine and I will get back to you.

OK, I tested this on my local Minikube using the complete grafana.yaml from the documentation page (without any modification) and it works fine.

The only thing I did was create a new namespace “monitoring”:

[developer@k8 ~]kubectl create namespace monitoring

and apply grafana.yaml to it using the command:

[developer@k8 ~]kubectl apply -f grafana.yaml --namespace=monitoring

After that I exposed port 3000 using port forwarding:

[developer@k8 ~]kubectl port-forward grafana-58445b6986-9fkhk --address 0.0.0.0 3000:3000 --namespace=monitoring

Then I found my Minikube IP address using the command minikube ip, and Grafana was reachable from my guest machine at IP:3000.

Hi @azazel666,
Do you remember how you solved that?
I have the same error as you.
The log ends with

level=info msg="Connecting to DB" dbtype=sqlite3

and I have the

Liveness probe failed
Readiness probe failed

but I use Helm.

Thank you

I never solved it. I never got around to re-testing things, and the cluster I experimented on no longer exists. I have a new cluster running but have yet to try setting up a new Grafana in it.

Thank you for your reply. I bypassed it with this