We rely heavily on Grafana to monitor our entire infrastructure. But if something happens to our Grafana host, how can we recover from that situation?
- Suppose data stops arriving from a particular data source (we use multiple data sources).
- The Grafana host goes down. We don’t receive any alerts in this case.
- And many more…??
Do you have a detailed written disaster recovery plan?
No, we don’t have any such plan.
All our machines/servers are monitored using Grafana. We receive alerts whenever a failure occurs.
What I am unsure about is this scenario: if something happens to the Grafana host itself, what should we do?
If Grafana is the police, who will police the police?
Suppose Grafana goes down for some reason and can’t send any alerts. Our monitoring team will carry on as normal, because they are supposed to respond only when an alert arrives, and in this case no alert will ever arrive.
So how do we overcome this situation? How do we police the police?
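One possible way to "police the police" is a small watchdog that runs on a different host from Grafana, polls Grafana's health endpoint, and raises an alert through a channel that does not depend on Grafana. The sketch below is only an illustration under assumptions: it relies on Grafana's `/api/health` endpoint returning JSON with `"database": "ok"` when healthy, and the function names, the threshold, and the `notify` callback are hypothetical placeholders, not part of any Grafana tooling.

```python
import http.client
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_health(url: str, timeout: float = 5.0) -> Optional[dict]:
    """Return the parsed JSON body of the health endpoint, or None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                return None
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Covers connection errors, timeouts, HTTP errors, and malformed JSON.
        return None


def is_healthy(body: Optional[dict]) -> bool:
    """Assumption: a healthy Grafana reports "database": "ok" in /api/health."""
    return isinstance(body, dict) and body.get("database") == "ok"


def watch(
    fetch: Callable[[], Optional[dict]],
    notify: Callable[[str], None],
    checks: int,
    threshold: int = 3,
    interval: float = 0.0,
) -> bool:
    """Probe `checks` times; notify once per outage after `threshold` consecutive failures.

    Returns True if an outage alert is still active when the loop ends.
    """
    failures = 0
    alerted = False
    for _ in range(checks):
        if is_healthy(fetch()):
            # Recovery resets the counter so a later outage alerts again.
            failures = 0
            alerted = False
        else:
            failures += 1
            if failures >= threshold and not alerted:
                notify(f"Grafana health check failed {failures} times in a row")
                alerted = True
        time.sleep(interval)
    return alerted
```

To use it, `fetch` could be something like `lambda: fetch_health("http://grafana.example.internal:3000/api/health")` (a placeholder URL), and `notify` should send through something independent of Grafana, such as a direct email or SMS gateway. Requiring several consecutive failures before alerting is a design choice to avoid paging the team on a single dropped request.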