All of the Loki write pods on the cluster are repeatedly throwing the error below. In Argo CD they all show as Progressing, and the pods are Unhealthy with the message: Readiness probe failed: HTTP probe failed with statuscode: 503.
level=error ts=2023-08-31T15:06:02.262372375Z caller=flush.go:144 org_id=1 msg="failed to flush" err="failed to flush chunks: store put chunk: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors, num_chunks: 1, labels: {app=\"metrics-server\", container=\"metrics-server-vpa\", filename=\"/var/log/pods/kube-system_metrics-server-5955767688-jxx2w_461cedbf-3f06-4e90-8b83-ff1808994420/metrics-server-vpa/0.log\", job=\"kube-system/metrics-server\", namespace=\"kube-system\", node_name=\"aks-statsgrdev-27419674-vmss000003\", pod=\"metrics-server-5955767688-jxx2w\", stream=\"stderr\"}"
level=info ts=2023-08-31T15:06:02.262398775Z caller=flush.go:168 msg="flushing stream" user=1 fp=73c7a2b302997943 immediate=true num_chunks=19 labels="{app=\"loki\", component=\"backend\", container=\"loki\", filename=\"/var/log/pods/statsgrafana_loki-backend-2_acaf0b8f-ddee-4a9c-84bc-d1ddd8577f65/loki/0.log\", instance=\"statsgrafana-dev\", job=\"statsgrafana/loki\", namespace=\"statsgrafana\", node_name=\"aks-statsgrdev-27419674-vmss000005\", pod=\"loki-backend-2\", stream=\"stderr\"}"
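The error is coming from the AWS credential chain even though chunks should be going to Azure, so it's worth confirming which object store the write pods actually loaded. A rough way to check (the pod name, label selector, and ConfigMap name are assumptions based on the chart's defaults and may differ in this setup):

# Pod status and probe events for one of the write pods
kubectl -n statsgrafana get pods -l app.kubernetes.io/component=write
kubectl -n statsgrafana describe pod loki-write-0

# Hit the readiness endpoint the probe is failing on (Loki listens on 3100 by default)
kubectl -n statsgrafana port-forward loki-write-0 3100:3100 &
curl -s http://localhost:3100/ready

# Inspect the rendered Loki config for the storage settings
# (the chart normally puts it in a ConfigMap named "loki" under the key config.yaml)
kubectl -n statsgrafana get configmap loki -o jsonpath='{.data.config\.yaml}' \
  | grep -E -n "object_store|storage_config|azure|s3"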
Here’s the values file being passed to the Loki Helm chart:
image:
  pullPolicy: Always
  tags: "dev-latest"
schema_config:
  configs:
    - from: "2022-01-11"
      index:
        period: 24h
        prefix: index_
      object_store: azure
      schema: v12
      store: boltdb-shipper
azure:
  # Your Azure storage account name
  account_name: <account name>
  # For the account-key, see docs: https://docs.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage?tabs=azure-portal
  account_key: <account key>
  # See https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers
  container_name: <container name>
  use_managed_identity: false
  # Providing a user assigned ID will override use_managed_identity
  # user_assigned_id: <user-assigned-identity-id>
  request_timeout: 0
  # Configure this if you are using private azure cloud like azure stack hub and will use this endpoint suffix to compose container & blob storage URL. Ex: https://account_name.endpoint_suffix/container_name/blob_name
  endpoint_suffix: blob.core.windows.net
boltdb_shipper:
  active_index_directory: /data/loki/boltdb-shipper-active
  cache_location: /data/loki/boltdb-shipper-cache
  cache_ttl: 24h
  shared_store: azure
filesystem:
  directory: /data/loki/chunks
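For reference, my understanding is that the grafana/loki chart expects storage settings under its top-level loki: key rather than as raw Loki config keys. A rough sketch of the same settings in that shape (key names are from memory and should be checked against the chart's values.yaml for 5.8.9):

loki:
  schemaConfig:
    configs:
      - from: "2022-01-11"
        store: boltdb-shipper
        object_store: azure
        schema: v12
        index:
          prefix: index_
          period: 24h
  storage:
    type: azure
    azure:
      accountName: <account name>
      accountKey: <account key>
      useManagedIdentity: false
      requestTimeout: 0
    bucketNames:
      chunks: <container name>
      ruler: <container name>
      admin: <container name>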
The ApplicationSet YAML from Argo CD is here:
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: statsgrafana-app
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          - name: statsgrafana-dev
            cluster: statsgr-aks-dev
            valueFiles: values/dev.yaml
            grafanaVer: '6.58.4'
            lokiVer: '5.8.9'
            promtailVer: '6.11.6'
  template:
    metadata:
      name: '{{name}}'
    spec:
      project: <my argo project name> # The Argo project this will be defined under.
      # syncPolicy:
      #   automated: {}
      sources:
        - repoURL: "https://grafana.github.io/helm-charts"
          targetRevision: 6.58.4
          chart: grafana
          syncOptions:
            # Need server-side apply because the resource is too big to fit in the 262144-byte annotation size limit. May be fixed by a future version of the chart.
            - ServerSideApply=true
          helm:
            valueFiles:
              - $grafanavalues/helm-charts/grafana/{{valueFiles}}
        - repoURL: <my azure repo address>
          targetRevision: 'main'
          ref: grafanavalues
        - repoURL: "https://grafana.github.io/helm-charts"
          targetRevision: 5.8.9
          chart: loki
          helm:
            valueFiles:
              - $lokivalues/helm-charts/loki/{{valueFiles}}
        - repoURL: <my azure repo address>
          targetRevision: 'main'
          ref: lokivalues
        - repoURL: "https://grafana.github.io/helm-charts"
          targetRevision: 6.11.6
          chart: promtail
          helm:
            valueFiles:
              - $promtailvalues/helm-charts/promtail/{{valueFiles}}
        - repoURL: <my azure repo address>
          targetRevision: 'main'
          ref: promtailvalues
      destination:
        name: '{{cluster}}'
        # server: https://kubernetes.default.svc
        namespace: 'statsgrafana'
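To see what Loki configuration actually comes out of the chart with these values, the Loki source can be rendered locally the same way Argo CD does (run from the root of the values repo, so the path below mirrors the $lokivalues/helm-charts/loki/{{valueFiles}} reference; the release name is only for local rendering):

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Render the chart with the same values Argo passes and pull out the storage-related lines
helm template loki grafana/loki --version 5.8.9 \
  -f helm-charts/loki/values/dev.yaml \
  | grep -E -n -A 5 "object_store|storage_config|azure|s3"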