Grafana Cloud forbids pushing logs to Loki from two instances

I have a small Kubernetes cluster with Vector configured as a StatefulSet, pushing logs to Loki hosted in Grafana Cloud (Free tier, as I’m just getting started). For testing purposes, my credentials are hardcoded in the configuration file.

When I start the StatefulSet, all logs are sent to Loki and I can see them in Grafana. However, if the pod is restarted and scheduled on another Kubernetes node (a different server), Loki returns 403.

Does Grafana’s Free tier allow only one host to push logs to Loki?

Hello Artuross!

Thanks for reaching out to the community forums.

There is no limitation on a Grafana Cloud free account based on hosts.

A free account is limited by the total volume of logs ingested over the course of a month.
For a free account this is 50GB of logs.
See more information on the pricing page below.

You may want to look at the resource definition of the Vector StatefulSet.
From the Helm chart, it looks like this is deployed as an aggregator.

It is also worth reviewing the Loki sink configuration; there may be something wrong there as well.

Hey @peterolivo. Thanks for looking at my problem. You’re right that I’m running an Aggregator. I was actually planning to use the Agent (DaemonSet), but early on I realized that only one of my pods could actually push to Loki. I have since switched to the Aggregator (hoping that my config was the problem), but unfortunately the issue persists.

sources:
  vector:
    type: vector
    version: '2'
    address: 0.0.0.0:6000

sinks:
  # send logs to Grafana Cloud Loki
  grafana_cloud_logs:
    type: loki
    inputs:
      - vector
    auth:
      strategy: basic
      user: '4....8'
      password: 'glc_...'
    endpoint: https://logs-prod-eu-west-0.grafana.net
    healthcheck:
      enabled: false
    encoding:
      codec: json
    batch:
      max_bytes: 400000
    out_of_order_action: accept
    remove_label_fields: true
    remove_timestamp: true
    labels:
      source: '{{ source }}'
      hostname: '{{ hostname }}'

As you can see, my config is relatively basic, and depending only on where the pod is scheduled, it either works or it doesn’t.

I’ve collected some logs to help debug this problem. This is from the working node:

vector-aggregator-0 vector 2023-08-25T20:39:45.165880Z DEBUG sink{component_kind="sink" component_id=grafana_cloud_logs component_type=loki component_name=grafana_cloud_logs}:request{request_id=1}:http: vector::internal_events::http_client: Sending HTTP request. uri=https://logs-prod-eu-west-0.grafana.net/loki/api/v1/push method=POST version=HTTP/1.1 headers={"content-type": "application/x-protobuf", "content-encoding": "snappy", "authorization": Sensitive, "user-agent": "Vector/0.32.1 (aarch64-unknown-linux-gnu 9965884 2023-08-21 14:52:38.330227446)", "accept-encoding": "identity"} body=[8405 bytes]
vector-aggregator-0 vector 2023-08-25T20:39:45.167184Z DEBUG hyper::client::connect::dns: resolving host="logs-prod-eu-west-0.grafana.net"
vector-aggregator-0 vector 2023-08-25T20:39:45.170980Z DEBUG sink{component_kind="sink" component_id=grafana_cloud_logs component_type=loki component_name=grafana_cloud_logs}:request{request_id=1}:http: hyper::client::connect::http: connecting to 34.120.232.185:443
vector-aggregator-0 vector 2023-08-25T20:39:45.177172Z DEBUG sink{component_kind="sink" component_id=grafana_cloud_logs component_type=loki component_name=grafana_cloud_logs}:request{request_id=1}:http: hyper::client::connect::http: connected to 34.120.232.185:443
vector-aggregator-0 vector 2023-08-25T20:39:45.192046Z DEBUG hyper::proto::h1::io: flushed 8880 bytes
vector-aggregator-0 vector 2023-08-25T20:39:45.295955Z DEBUG hyper::proto::h1::io: parsed 3 headers
vector-aggregator-0 vector 2023-08-25T20:39:45.295986Z DEBUG hyper::proto::h1::conn: incoming body is empty
vector-aggregator-0 vector 2023-08-25T20:39:45.296064Z DEBUG sink{component_kind="sink" component_id=grafana_cloud_logs component_type=loki component_name=grafana_cloud_logs}:request{request_id=1}:http: hyper::client::pool: pooling idle connection for ("https", logs-prod-eu-west-0.grafana.net)
vector-aggregator-0 vector 2023-08-25T20:39:45.296184Z DEBUG sink{component_kind="sink" component_id=grafana_cloud_logs component_type=loki component_name=grafana_cloud_logs}:request{request_id=1}:http: vector::internal_events::http_client: HTTP response. status=204 No Content version=HTTP/1.1 headers={"date": "Fri, 25 Aug 2023 20:39:45 GMT", "via": "1.1 google", "alt-svc": "h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000"} body=[empty]

Note that the first line in this fragment is the request and the last one is the response. This is taken from Vector as is; I didn’t remove any lines between these two boundaries.

Now, from the non-working pod:

vector-aggregator-0 vector 2023-08-25T20:48:18.321636Z DEBUG sink{component_kind="sink" component_id=grafana_cloud_logs component_type=loki component_name=grafana_cloud_logs}:request{request_id=1}:http: vector::internal_events::http_client: Sending HTTP request. uri=https://logs-prod-eu-west-0.grafana.net/loki/api/v1/push method=POST version=HTTP/1.1 headers={"content-type": "application/x-protobuf", "content-encoding": "snappy", "authorization": Sensitive, "user-agent": "Vector/0.32.1 (aarch64-unknown-linux-gnu 9965884 2023-08-21 14:52:38.330227446)", "accept-encoding": "identity"} body=[4555 bytes]
vector-aggregator-0 vector 2023-08-25T20:48:18.322131Z DEBUG hyper::client::connect::dns: resolving host="logs-prod-eu-west-0.grafana.net"
vector-aggregator-0 vector 2023-08-25T20:48:18.326790Z DEBUG sink{component_kind="sink" component_id=grafana_cloud_logs component_type=loki component_name=grafana_cloud_logs}:request{request_id=1}:http: hyper::client::connect::http: connecting to 34.120.232.185:443
vector-aggregator-0 vector 2023-08-25T20:48:18.333232Z DEBUG sink{component_kind="sink" component_id=grafana_cloud_logs component_type=loki component_name=grafana_cloud_logs}:request{request_id=1}:http: hyper::client::connect::http: connected to 34.120.232.185:443
vector-aggregator-0 vector 2023-08-25T20:48:18.344621Z DEBUG hyper::proto::h1::io: flushed 5030 bytes
vector-aggregator-0 vector 2023-08-25T20:48:18.456609Z DEBUG hyper::proto::h1::io: parsed 5 headers
vector-aggregator-0 vector 2023-08-25T20:48:18.456631Z DEBUG hyper::proto::h1::conn: incoming body is content-length (311 bytes)
vector-aggregator-0 vector 2023-08-25T20:48:18.456649Z DEBUG hyper::proto::h1::conn: incoming body completed
vector-aggregator-0 vector 2023-08-25T20:48:18.456846Z DEBUG sink{component_kind="sink" component_id=grafana_cloud_logs component_type=loki component_name=grafana_cloud_logs}:request{request_id=1}:http: vector::internal_events::http_client: HTTP response. status=403 Forbidden version=HTTP/1.1 headers={"content-type": "text/html; charset=UTF-8", "referrer-policy": "no-referrer", "content-length": "311", "alt-svc": "h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000", "connection": "close"} body=[311 bytes]

You can see the diff here: Vector Grafana Cloud - Diff Checker. The request is identical, except for a small difference in body size. I believe there’s something on the Grafana Cloud side that’s blocking it.
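The requests match, but the two responses do not. Here is a minimal sketch that diffs the response headers copied from the two log excerpts above; the helper name `diff_headers` is only illustrative.

```python
# Response headers copied from the two Vector debug log lines above.
ok_headers = {
    "date": "Fri, 25 Aug 2023 20:39:45 GMT",
    "via": "1.1 google",
    "alt-svc": 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000',
}
forbidden_headers = {
    "content-type": "text/html; charset=UTF-8",
    "referrer-policy": "no-referrer",
    "content-length": "311",
    "alt-svc": 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000',
    "connection": "close",
}


def diff_headers(a, b):
    """Return (only_in_a, only_in_b, changed) for two header dicts."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    # Headers present in both responses but with different values.
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, changed


only_ok, only_403, changed = diff_headers(ok_headers, forbidden_headers)
print("only in 204:", only_ok)
print("only in 403:", only_403)
print("changed:", changed)
```

The 403 comes back as a 311-byte HTML page and lacks the `via` header that the 204 carries. That may mean the rejection happens somewhere in front of Loki rather than in Loki itself, but this is only a guess until the body is visible.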

The 403 response has an error message in the body. Could you print it?
Check your Cloud usage dashboard: there is a panel for log ingestion errors with their reasons.

Unfortunately, even with the trace log level, the response body is not logged. I don’t see any ingest errors in my Cloud usage dashboard either, perhaps because the request was forbidden :confused:
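Since Vector does not surface the body, one way to capture it would be to replay a push by hand from the failing node. This is a rough sketch, not what Vector does internally: it sends JSON to Loki's push API instead of snappy-compressed protobuf, and it uses Python's default user agent. The function names, label, and test line are made up for illustration, and the credentials are placeholders to fill in.

```python
import base64
import json
import time
import urllib.error
import urllib.request

# Endpoint taken from the sink config above, plus the push path seen in the logs.
PUSH_URL = "https://logs-prod-eu-west-0.grafana.net/loki/api/v1/push"


def build_push_request(url, user, password, line, labels):
    """Build a JSON push request with HTTP basic auth (nothing is sent yet)."""
    payload = {
        "streams": [
            {
                "stream": labels,
                # Each value is [<unix time in nanoseconds, as a string>, <log line>].
                "values": [[str(time.time_ns()), line]],
            }
        ]
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def send_and_show(request):
    """Send the request and return (status, body), also for 4xx/5xx replies."""
    try:
        with urllib.request.urlopen(request, timeout=10) as resp:
            return resp.status, resp.read().decode(errors="replace")
    except urllib.error.HTTPError as err:
        # urllib raises on a 403, but the error object still carries the body.
        return err.code, err.read().decode(errors="replace")
```

Running `print(send_and_show(build_push_request(PUSH_URL, "<user>", "<password>", "test line", {"source": "debug"})))` on both nodes should show whether the failing one still gets a 403, and what the 311-byte body actually says.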

:person_shrugging: I would try another tool for log ingestion on that host, maybe Grafana Agent, OpenTelemetry Collector,…