Zone aware metrics scraping?

In a highly available Kubernetes cluster, metrics scraping can transfer a lot of data between zones, which costs money in cloud environments like AWS.

Right now I have two central “metrics” Alloy pods which use clustering to split up the various targets I scrape. I mainly configure monitoring through ServiceMonitor and PodMonitor CRDs.

I’m wondering if there’s a way to tell each metrics pod to only scrape targets within its own AZ. For example, I could have one metrics pod in us-east-1a, which only scrapes pods in us-east-1a, another in us-east-1b, etc. All the pods in my cluster have the topology.kubernetes.io/zone label, so I imagine I can do something with that?

I do a lot of metric relabeling to reduce how many series ultimately get sent to Mimir, but that doesn’t affect the initial network transfer from the scraped pod to Alloy.

Just curious what folks are doing about this. I suppose I could switch to a DaemonSet deployment and have each pod only scrape metrics on its own node, but I prefer the centralized version since the load is more predictable.

I managed to mostly solve my own problem here. For reference, every pod in my cluster has a label topology.kubernetes.io/zone=...; I’m not sure whether this is standard.

What I did is deploy one Alloy “metrics” pod per AZ (for me this just means 3 replicas with topologySpreadConstraints).
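
A rough sketch of what that could look like in Helm values, assuming the grafana/alloy chart and its controller.* keys. The label selector is an assumption and has to match whatever labels your release actually puts on the Alloy pods:

    controller:
      type: deployment
      replicas: 3
      topologySpreadConstraints:
        # Spread the replicas evenly across zones (one per AZ with 3 zones)
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              # Assumed pod label; check your release's actual labels
              app.kubernetes.io/name: alloy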

Next, I exposed each Alloy pod’s own zone to it as an environment variable. With the Helm chart this is:

    alloy:
      extraEnv:
        - name: CUSTOM_AWS_AZ
          valueFrom:
            fieldRef:
              fieldPath: metadata.labels['topology.kubernetes.io/zone']

This pulls in the AZ from the pod’s zone metadata label.

Then, for my Alloy config, where I pull in ServiceMonitor and PodMonitor, I added a rule:

    prometheus.operator.servicemonitors "cluster" {
      forward_to = [prometheus.remote_write.mimirprod.receiver]

      // Disable clustering since we are zone aware
      clustering {
        enabled = false
      }

      // Only scrape targets in the same AZ
      rule {
        source_labels = ["__meta_kubernetes_pod_label_topology_kubernetes_io_zone"]
        regex         = sys.env("CUSTOM_AWS_AZ")
        action        = "keep"
      }

      scrape {
        default_scrape_interval = "30s"
      }
    }

    prometheus.operator.podmonitors "cluster" {
      forward_to = [prometheus.remote_write.mimirprod.receiver]

      // Disable clustering since we are zone aware
      clustering {
        enabled = false
      }

      // Only scrape targets in the same AZ
      rule {
        source_labels = ["__meta_kubernetes_pod_label_topology_kubernetes_io_zone"]
        regex         = sys.env("CUSTOM_AWS_AZ")
        action        = "keep"
      }

      scrape {
        default_scrape_interval = "30s"
      }
    }

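It’s very important to disable clustering here. Clustering is meant for multiple Alloy instances that all discover the same targets and split them among themselves, but in this setup each instance already has its own unique set of pods. As far as I can tell, leaving it enabled could assign a target to an instance in another zone that has already filtered it out, so that target would not get scraped at all.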

This pattern can probably be used for other discovery components too. I mainly cared about using it for my PodMonitor and ServiceMonitor objects.
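
For example, here is an untested sketch of the same zone filter applied to plain pod discovery, using discovery.kubernetes, discovery.relabel and prometheus.scrape. The component labels "pods" and "same_zone" are made up for illustration, and it assumes the same CUSTOM_AWS_AZ environment variable and pod zone label as above:

    discovery.kubernetes "pods" {
      role = "pod"
    }

    // Keep only targets whose pod zone label matches this Alloy pod's AZ
    discovery.relabel "same_zone" {
      targets = discovery.kubernetes.pods.targets

      rule {
        source_labels = ["__meta_kubernetes_pod_label_topology_kubernetes_io_zone"]
        regex         = sys.env("CUSTOM_AWS_AZ")
        action        = "keep"
      }
    }

    // Clustering is off by default in prometheus.scrape, which is what we want here
    prometheus.scrape "same_zone" {
      targets    = discovery.relabel.same_zone.output
      forward_to = [prometheus.remote_write.mimirprod.receiver]
    }

Filtering in discovery.relabel drops out-of-zone targets before any scrape happens, so no cross-zone scrape traffic should be generated for them.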