K8s monitoring - how to get metrics of pod / service?

Hi there :wave:t2:

I have a k8s cluster with some services / applications in it. The base Kubernetes metrics are working fine, the dashboards are populated with data, and I can see the pods in the default Kubernetes dashboards.

I am now struggling to create SLOs based on the metrics of some applications and services.
For example: we have a soketi server running which exposes metrics on port 9601. I can fetch the metrics from another pod with curl http://soketi.soketi.svc.cluster.local:9601/metrics.

But how do I get the Grafana Agent / Alloy configured to push these metrics into Grafana Cloud?
If I search for any of these metrics in my stack, I cannot find them. I searched the documentation and came across this article: Scrape and forward application metrics | Grafana Cloud documentation

I tried it and added the pod for soketi, but there are still no metrics after a few days. One note: I come from Datadog, where we configured the agent to scrape all pods that have a specific annotation.

I inspected the config.alloy file and, as far as I understand it, metrics of pods should automatically be sent to Grafana Cloud.

Are there any best practices for this scenario? It seems a little bit odd to directly change the configuration of Grafana if I just want to add metrics of a specific pod :face_with_raised_eyebrow: I am sure I am missing some important step / configuration, but I am not able to find it. Any help is appreciated :pray:t2:

Best
Martin

P.S.: I installed the grafana agent with the helm chart as instructed in grafana cloud itself

There is the same option here:

Thank you! That looks exactly like the thing I am looking for :grinning: Do I understand correctly that I have to replace my current helm installation with the helm chart of your link in order to enable this feature?

Best
Martin

I don’t know which Helm chart you installed.

I would recommend version 2 of that Helm chart. It is still a release candidate, but from what I have heard it should be released soon.

I installed it with the tutorial in grafana cloud directly. If you navigate to https://<your-stack>.grafana.net/a/grafana-k8s-app/configuration?from=now-1h&to=now there is a page with steps to integrate k8s into grafana cloud with a helm installation command at the end.

I think it is version 1, judging by these lines:

helm repo add grafana https://grafana.github.io/helm-charts &&
  helm repo update &&
  helm upgrade --install --version ^1 --atomic --timeout 300s grafana-k8s-monitoring grafana/k8s-monitoring ....

But I checked the config.alloy file on the Grafana pod and found this:

// Annotation Autodiscovery
discovery.relabel "annotation_autodiscovery_pods" {
  targets = discovery.kubernetes.pods.targets
  rule {
    source_labels = ["__meta_kubernetes_pod_annotation_k8s_grafana_com_scrape"]
    regex = "true"
    action = "keep"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_annotation_k8s_grafana_com_job"]
    action = "replace"
    target_label = "job"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_annotation_k8s_grafana_com_instance"]
    action = "replace"
    target_label = "instance"
  }
  rule {
    source_labels = ["__meta_kubernetes_pod_annotation_k8s_grafana_com_metrics_path"]
    action = "replace"
    target_label = "__metrics_path__"
  }

So I think annotation autodiscovery is already enabled. I added the annotations to the pod and am now waiting for the first metrics to arrive. As far as I understand, this should work now.
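
For anyone reading along, here is a minimal sketch of what adding the annotations can look like on a Deployment's pod template. The annotation prefix k8s.grafana.com/ and the scrape / job keys follow from the relabel rules above; the metrics.path and metrics.portNumber keys are assumptions based on how the chart names its annotations, so check them against the k8s-monitoring chart documentation. The Deployment name, namespace, labels and image are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: soketi            # hypothetical name
  namespace: soketi       # hypothetical namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: soketi
  template:
    metadata:
      labels:
        app: soketi
      annotations:
        # Annotation values must be strings, so "true" is quoted.
        k8s.grafana.com/scrape: "true"
        # Optional: overrides the job label (see the "job" relabel rule).
        k8s.grafana.com/job: "soketi"
        # Assumed key names; verify against the chart documentation.
        k8s.grafana.com/metrics.path: "/metrics"
        k8s.grafana.com/metrics.portNumber: "9601"
    spec:
      containers:
        - name: soketi
          image: example/soketi:latest   # hypothetical image reference
          ports:
            - name: metrics
              containerPort: 9601
```

Putting the annotations in the pod template rather than on a running pod means they survive restarts and rollouts.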

Hello! I’m the primary author of the k8s-monitoring Helm chart!

Yeah, both v1 and v2 have the annotation-based autodiscovery feature. If you add k8s.grafana.com/scrape: true on your pods or services, it should detect them. In v1, it’s enabled by default. In v2, you enable it by setting:

annotationAutodiscovery:
  enabled: true
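
Since the annotation can also go on a Service, a rough sketch of that variant is below. The name and namespace are inferred from the curl URL earlier in the thread; the selector label is hypothetical, and the metrics.portNumber key is an assumption to verify against the chart documentation. It is probably best to annotate either the pods or the Service, not both, to avoid scraping the same endpoint twice.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: soketi
  namespace: soketi
  annotations:
    # Quoted because annotation values must be strings.
    k8s.grafana.com/scrape: "true"
    # Assumed key name; verify against the chart documentation.
    k8s.grafana.com/metrics.portNumber: "9601"
spec:
  selector:
    app: soketi           # hypothetical pod label
  ports:
    - name: metrics
      port: 9601
      targetPort: 9601
```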

Hi Pete,

awesome, thank you! It works with the annotations, I successfully got all the metrics of the applications and nginx :star_struck:

I also found out that we are still using the v1 Helm chart (which is, by the way, still the example shown in Grafana Cloud, maybe they can update this?) and created a ticket for ourselves to replace it with the v2 version.

Thanks to all of you! Great help :clap:

Best
Martin

Thank you for sharing your experience. I think this is the point I was looking for.