Hi guys. We have been running Alloy on our fleet for a few months now, but today a few of the devices started logging “non-recoverable error” due to “invalid authentication credentials”, which halted all Prometheus ingestion for those devices.
- There have been no changes to the access policy.
- There have been no changes to remotecfg, where we manage our Alloy fleet.
- Only a few of the devices are having this issue.
- A systemctl restart alloy fixes the issue.
My question is: what could be causing this? If it's a settings issue, what can I check to prevent it in the future?
Details:
ts=2026-01-14T19:26:19.34068128Z level=error msg="non-recoverable error" component_path=/remotecfg/TC.default component_id=prometheus.remote_write.metrics_service subcomponent=rw remote_name=9ce832 url=https://prometheus-prod-56-prod-us-east-2.grafana.net/api/prom/push failedSampleCount=412 failedHistogramCount=0 failedExemplarCount=0 err="server returned HTTP status 401 Unauthorized: {\"status\":\"error\",\"error\":\"authentication error: invalid authentication credentials\"}"
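
In case it helps anyone check other devices for the same failure, here is a minimal sketch of picking these entries out of the logs. It assumes the log lines are plain logfmt like the one above. The helper names `parse_logfmt` and `is_auth_failure` are hypothetical, not part of Alloy, and the quoting rules of `shlex` are only an approximation of logfmt.

```python
import shlex

# Sample taken from the log line above (raw string keeps the \" escapes as-is).
SAMPLE = r'ts=2026-01-14T19:26:19.34068128Z level=error msg="non-recoverable error" component_path=/remotecfg/TC.default component_id=prometheus.remote_write.metrics_service subcomponent=rw remote_name=9ce832 url=https://prometheus-prod-56-prod-us-east-2.grafana.net/api/prom/push failedSampleCount=412 failedHistogramCount=0 failedExemplarCount=0 err="server returned HTTP status 401 Unauthorized: {\"status\":\"error\",\"error\":\"authentication error: invalid authentication credentials\"}"'


def parse_logfmt(line: str) -> dict[str, str]:
    """Split a logfmt-style line into key/value pairs (hypothetical helper).

    shlex handles double-quoted values and \" escapes; lines it cannot
    tokenize (e.g. an unbalanced quote) yield an empty dict.
    """
    try:
        tokens = shlex.split(line)
    except ValueError:
        return {}
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value
            fields[key] = value
    return fields


def is_auth_failure(fields: dict[str, str]) -> bool:
    """True for a remote_write entry rejected with HTTP 401 (hypothetical helper)."""
    return (
        fields.get("msg") == "non-recoverable error"
        and "401 Unauthorized" in fields.get("err", "")
    )


if __name__ == "__main__":
    parsed = parse_logfmt(SAMPLE)
    if is_auth_failure(parsed):
        print(parsed["ts"], parsed["component_id"], parsed["failedSampleCount"])
```

Something like this could be fed line by line from `journalctl -u alloy -o cat`, assuming Alloy runs as the `alloy` systemd unit as in the restart command above.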