I have just started to use Grafana Cloud (currently on the trial version but will switch to free tier) and can see that my current Active Series count is way over the allowed 10k. I performed a quick check on the Prometheus usage (as suggested in Analyzing Prometheus metrics usage with Grafana Explore | Grafana Labs) and can see that the apiserver_request_duration_seconds_bucket metric is very, very high.
My question is: how are other users managing on the free tier when just scraping the Kubernetes API Server will put you over the allowed 10k active series?
Does Grafana have any recommendations on Prometheus configuration, especially scrape configs, that strike the right balance between useful metrics and staying under the 10k limit?
Thanks @daviddorman - I did look at those guides and will work through them, but I suspect that even after applying the recommendations the number of active series will still be very high.
It would be fantastic if Grafana could produce a Prometheus config that is suitable for use with Grafana Cloud's free tier.
There should be enough in there to get started - better docs are coming really soon!
That tool will give you a list of the metrics referenced in your dashboards and rules files. Using that list, you can define an action: keep rule for those metrics in your remote_write configuration (whether you're using the Grafana Agent or Prometheus). You can take the output of the --print flag and plug it directly into your existing config. You'll still store those metrics locally and only ship this curated subset to Grafana Cloud, drastically reducing your usage.
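For illustration, here is a minimal sketch of what that could look like in a Prometheus-style remote_write block. The endpoint, credentials, and the allowlist regex are placeholders - in practice the regex would come from the tool's --print output:

```yaml
remote_write:
  - url: https://<your-grafana-cloud-endpoint>/api/prom/push  # placeholder push URL
    basic_auth:
      username: <your-instance-id>   # placeholder credentials
      password: <your-api-key>
    write_relabel_configs:
      # Keep only series whose metric name matches the allowlist;
      # everything else is dropped before it is sent to Grafana Cloud.
      # The regex below is an illustrative example, not a recommended set.
      - source_labels: [__name__]
        regex: "up|kube_pod_info|node_cpu_seconds_total"
        action: keep
```

Because the filtering happens in write_relabel_configs, it only affects what is shipped over remote_write; local scraping and storage are unchanged. The same block shape applies when the remote_write section lives in a Grafana Agent config.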
For a small cluster using the standard set of k8s-mixin dashboards and rules, your usage should fall well below 10k!
Also, to make all of this even easier, we are launching a Kubernetes integration for the Grafana Agent next week, which will set all of this up out of the box.