Is there a way to prevent an issue we keep hitting with the Loki distributor? The distributor fails to push logs to the ingesters because of unhealthy ingesters in the ring. These unhealthy members are not visible through kubectl; the only way to see them is to port-forward a distributor pod and check its /ring endpoint. We are running into this problem quite frequently now.
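For reference, the inspection step described above looks roughly like this. This is a sketch: the namespace and service name (`loki`, `loki-distributor`) are assumptions and depend on your Helm release.

```shell
# Forward the distributor's HTTP port (3100 by default) to localhost.
# The service name depends on your Helm release; "loki-distributor" is assumed here.
kubectl -n loki port-forward svc/loki-distributor 3100:3100 &

# The ring status page lists each ingester's state (ACTIVE, LEAVING,
# UNHEALTHY, ...) and its last heartbeat timestamp.
curl -s http://localhost:3100/ring
```

Any member shown as UNHEALTHY here, but absent from `kubectl get pods`, is a stale ring entry.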
We are using Helm chart version 6.12.
Same issue here. We're using the Helm chart loki-6.28.0 (app version 3.4.2), with 3 distributor replicas (maxUnavailable=2) and 3 ingester replicas without zoneAwareReplication; our backend is S3.
There are no restarts or errors in the ingesters that the distributors supposedly consider unhealthy. What could be the cause here?
In the thread "Loki Distributor outbound traffic to ingester drop" on the Grafana Labs Community Forums, @tonyswumac's questions suggest it could be a resource-sizing issue?
Are you saying that your ring membership contains members that are no longer healthy? If so, you can try setting autoforget_unhealthy.
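A minimal sketch of that setting, assuming the Helm chart passes the `loki.ingester` values block through into the generated Loki config (check your chart version's values reference):

```yaml
# values.yaml (sketch, not a complete values file)
loki:
  ingester:
    # Automatically forget ring members that remain unhealthy,
    # instead of leaving stale entries in the ring.
    autoforget_unhealthy: true
```

Note that this clears the symptom (stale unhealthy entries blocking pushes) rather than explaining why the ingesters were marked unhealthy in the first place.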