The issue I’m trying to resolve: our Grafana server is accessible to hundreds of staff. Sometimes they create large dashboards (20+ panels) with complicated queries and set the auto-refresh interval to 10 seconds.
Shortly afterwards, Prometheus or Graphite becomes overloaded and dies.
I’m looking for a solution to this. Two options I’ve considered:
1. Use nginx rate limiting in front of both Prometheus and Graphite, limiting requests per IP. However, all Grafana requests currently come from the same IP (the Grafana server), and ideally we would limit per user. Is such a thing possible?
2. Globally enforce a minimum auto-refresh interval. For example, no dashboard could have an auto-refresh interval shorter than 1 minute.
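To make the per-user idea concrete, here is a rough sketch of the behaviour I have in mind: a token bucket kept per username rather than per IP. This is only an illustration in Python, not something nginx or Grafana provides as written. The class name `PerUserRateLimiter` and its parameters are hypothetical, and it assumes the proxy in front of Prometheus/Graphite can somehow see which Grafana user issued each request.

```python
import time


class PerUserRateLimiter:
    """Token bucket per user (hypothetical helper, for illustration only)."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec   # tokens added per second, per user
        self.burst = burst         # maximum tokens a user can accumulate
        self._buckets = {}         # username -> (tokens, time of last request)

    def allow(self, user, now=None):
        """Return True if this user's request may pass, False if it is throttled."""
        now = time.monotonic() if now is None else now
        # A user seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(user, (self.burst, now))
        # Refill according to the time elapsed since the last request.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[user] = (tokens - 1.0, now)
            return True
        self._buckets[user] = (tokens, now)
        return False


if __name__ == "__main__":
    limiter = PerUserRateLimiter(rate_per_sec=1.0, burst=2)
    for i in range(4):
        # Fixed timestamp, so no refill happens between these calls.
        print("alice request", i, "allowed:", limiter.allow("alice", now=0.0))
    print("bob request allowed:", limiter.allow("bob", now=0.0))
```

A proxy using this would call `allow()` with the username taken from each request and answer with HTTP 429 when it returns False. How the username reaches the proxy is the open part: Grafana’s `[dataproxy] send_user_header` setting, which as far as I understand adds an `X-Grafana-User` header to proxied requests, looks like one candidate, but I have not verified it in this setup.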
Any advice is much appreciated!