I am working with the Loki stack (Promtail, Loki, Prometheus Alertmanager, Grafana) in a distributed setup, collecting logs and sending alerts through Alertmanager. Here is the complete scenario:
- I have multiple clusters, each running different applications and a Promtail agent.
- A common cluster runs Loki, Grafana, and Prometheus Alertmanager.
Currently I am getting logs from each cluster's Promtail into Loki and sending alerts to Slack using Alertmanager.
My problem is: how will I know if any cluster's Promtail agent fails to scrape logs?
How can I regularly check that Promtail is running fine in each cluster and that we are actually receiving logs from it?
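To make the second question concrete, the kind of check I have in mind could look roughly like the sketch below. It is only an illustration under assumptions, not something I have running: it assumes every log stream carries a `cluster` label, that Loki's instant-query endpoint `/loki/api/v1/query` is reachable at a hypothetical `LOKI_URL`, and that the list of expected clusters is maintained by hand. The list is needed because a cluster that sends nothing is simply absent from the query result.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values for illustration; none of these come from my real setup.
LOKI_URL = "http://loki.example.internal:3100"
EXPECTED_CLUSTERS = {"cluster-a", "cluster-b", "cluster-c"}
# Assumes Promtail attaches a `cluster` label to every stream.
QUERY = 'sum by (cluster) (count_over_time({cluster=~".+"}[5m]))'


def parse_counts(response_body):
    """Extract {cluster: line_count} from a Loki instant-query JSON response."""
    data = json.loads(response_body)
    counts = {}
    for sample in data["data"]["result"]:
        cluster = sample["metric"].get("cluster")
        if cluster is not None:
            # Each vector sample is [timestamp, "value-as-string"].
            counts[cluster] = float(sample["value"][1])
    return counts


def silent_clusters(counts, expected):
    """Return the expected clusters that produced no log lines in the window."""
    return sorted(c for c in expected if counts.get(c, 0) == 0)


def fetch_counts(base_url=LOKI_URL, query=QUERY):
    """Run the instant query against Loki and return per-cluster line counts."""
    params = urllib.parse.urlencode({"query": query})
    url = base_url + "/loki/api/v1/query?" + params
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_counts(resp.read())
```

Calling `silent_clusters(fetch_counts(), EXPECTED_CLUSTERS)` on a schedule would list the clusters that went quiet in the last five minutes. It would not tell me whether Promtail itself is down or the applications just stopped logging, which is part of what I am asking.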
Any idea or method is appreciated. Thanks.