Mimir: Handling Data Failures / Migrations / Recovery

I didn’t know which tags to use for this post. It looks like Mimir hasn’t been added to this community yet, but I do think this is the right place for it? I hope?

My questions are primarily about data storage in a Kubernetes environment. If one block storage instance goes down, or its storage gets corrupted, does the storage layer auto-heal by syncing data around to create more replicas of the affected data? And how is the status of cluster storage monitored?

Does Mimir support dynamic scaling of the storage backend via ConfigMap updates? It seems like it does.

Overall I’m really interested in Mimir, having hit a vertical scaling tipping point with Prometheus.

  • What are you trying to achieve?
    Horizontally scalable metric storage that can withstand periodic node failures.

Welcome to the :grafana: community, @elliott!

The Mimir team has set up a discussion page on GitHub to offer help and support: