Alloy unable to read Kubernetes cluster events

Hello!

I recently deployed Alloy to scrape K8s events using this Helm chart. My values.yaml is included below.

alloy:
  configMap:
    create: true
    content: |-
      logging {
        level = "info"
        format = "logfmt"
      }
      loki.write "default" {
        endpoint {
          url = "http://loki.loki.svc.cluster.local:3100/loki/api/v1/push"
        }
      }
      loki.source.kubernetes_events "cluster_events" {
        job_name   = "integrations/kubernetes/eventhandler"
        log_format = "logfmt"
        forward_to = [
          loki.process.cluster_events.receiver,
        ]
      }
      loki.process "cluster_events" {
        forward_to = [loki.write.default.receiver]
        stage.labels {
          values = {
            kubernetes_cluster_events = "job",
          }
        }
      }
  mounts:
    varlog: false
    dockercontainers: false
controller:
  type: deployment
  replicas: 1

The controller pod currently has the following errors. Any ideas what I'm missing?

ts=2025-06-09T10:16:49.689635377Z level=error msg="event watcher exited with error" component_path=/ component_id=loki.source.kubernetes_events.cluster_events err="failed to configure informers: failed to get server groups: Get \"https://172.20.0.1:443/api\": dial tcp 172.20.0.1:443: connect: connection refused"
ts=2025-06-09T10:16:49.689737301Z level=info msg="stopping watcher for events" component_path=/ component_id=loki.source.kubernetes_events.cluster_events namespace=""

To me, this appears to show an issue communicating with the Kubernetes API server from inside the Alloy pod.

loki.source.kubernetes_events asks the API server to stream the cluster's events.
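The Helm chart normally creates the RBAC for this automatically; a "connection refused" error points at connectivity rather than permissions (which would show up as 403s). Still, for reference, the component's service account needs roughly a ClusterRole like the following sketch (names are illustrative, not the chart's exact manifest):

```yaml
# Sketch: minimal RBAC for watching cluster events.
# The grafana/alloy Helm chart creates equivalent rules by default;
# the name below is a placeholder.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: alloy-events-reader
rules:
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch"]
```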

What kind of K8s cluster are you running? Any special requirements?

Thanks for the nudge!

Cluster is nothing special really, just a regular EKS cluster with Istio.

I exec’d into the Alloy pod, installed curl, and tried to reach kube-apiserver. Outputs are below. Since I get a 401 Unauthorized response (I didn’t provide any auth), I think the connectivity/route is fine?


# printenv | grep KUBERNETES_
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT_443_TCP=tcp://172.20.0.1:443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_ADDR=172.20.0.1
KUBERNETES_SERVICE_HOST=172.20.0.1
KUBERNETES_PORT=tcp://172.20.0.1:443
KUBERNETES_PORT_443_TCP_PORT=443

# curl -v https://172.20.0.1:443 -k
*   Trying 172.20.0.1:443...
* Connected to 172.20.0.1 (172.20.0.1) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / X25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=kube-apiserver
*  start date: Jun 09 03:00:39 2025 GMT
*  expire date: Jun 09 03:05:39 2026 GMT
*  issuer: CN=kubernetes
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://172.20.0.1:443/
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: 172.20.0.1]
* [HTTP/2] [1] [:path: /]
* [HTTP/2] [1] [user-agent: curl/8.5.0]
* [HTTP/2] [1] [accept: */*]
> GET / HTTP/2
> Host: 172.20.0.1
> User-Agent: curl/8.5.0
> Accept: */*
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* received GOAWAY, error=0, last_stream=1
< HTTP/2 401 
< audit-id: a3e4412b-0c1f-4a36-b7d7-4f981b136e83
< cache-control: no-cache, private
< content-type: application/json
< content-length: 157
< date: Tue, 10 Jun 2025 21:46:11 GMT
< 
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "Unauthorized",
  "reason": "Unauthorized",
  "code": 401
}
* Closing connection
* TLSv1.3 (OUT), TLS alert, close notify (256):

Edit: using the appropriate CA cert and token, I get a 200 OK response.

# curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt --header "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" -X GET https://172.20.0.1:443/api
{
  "kind": "APIVersions",
  "versions": [
    "v1"
  ],
  "serverAddressByClientCIDRs": [
    {
      "clientCIDR": "0.0.0.0/0",
      "serverAddress": "ip-<ip>.<region>.compute.internal:443"
    }
  ]
}

It turned out this wasn’t an Alloy issue.

After the manual troubleshooting described in the post above, I relaunched Alloy with the Istio sidecar disabled. That worked partially: Alloy was able to connect to the Kubernetes API server, but it wasn’t able to write to Loki, which was inside the mesh.

This made me suspect Istio mesh, and the proxy sidecar in particular.

Further troubleshooting revealed that the Alloy container was starting up well before the Istio proxy sidecar was ready. Istio’s init container redirects all outbound traffic through the sidecar, so until Envoy is up, outbound connections are refused.

I added this Istio-specific annotation to the Alloy pod, which delays the Alloy container’s startup until the proxy is ready, and that fixed the issue.

proxy.istio.io/config: '{"holdApplicationUntilProxyStarts": true}'

This setting can also be applied mesh-wide; more details can be found here - Istio / Global Mesh Options
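For anyone using the same chart: the annotation can be set through the chart's values, assuming the `controller.podAnnotations` field exposed by the grafana/alloy Helm chart. A sketch building on the values.yaml from the original post:

```yaml
# Sketch: applying the Istio hold-until-proxy-ready annotation via
# the grafana/alloy chart's controller.podAnnotations field.
controller:
  type: deployment
  replicas: 1
  podAnnotations:
    proxy.istio.io/config: '{"holdApplicationUntilProxyStarts": true}'
```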