Rate limiting and connection timeout issues with Azure Monitor and Azure Plus datasources

  • What Grafana version and what operating system are you using?
    Image - grafana/grafana:9.0.0
    Chart - 6.31.0

  • What are you trying to achieve?
    Azure cost and log analytics monitoring across 50+ dashboards, accessed by 10+ users.

  • How are you trying to achieve it?
    We have installed Grafana onto an AKS cluster and have configured two types of Datasources: AzurePlus and AzureMonitor. Both of these have been set up with separate App Registrations and then split by each subscription we have.

    • The AzurePlus Datasource has been created via the datasources.yaml in our helm values
    • The AzureMonitor Datasources have been manually created as Datasources
  • What happened?
    Both of these Datasources work successfully, however, after slightly heavier usage (more than 1 user), we see:

    • The AzurePlus Datasource gives us 429 errors
    • The AzureMonitor Datasource times out after 30 seconds and returns a 400
  • What did you expect to happen?
    We want to be able to support multiple users and alerts contacting our Azure tenant without failed requests.

  • Can you copy/paste the configuration(s) that you are having problems with?
    The below diagram describes our setup and the errors we are seeing:

  • Did you follow any online instructions? If so, what is the URL?
    Azure Monitor data source | Grafana documentation

A 429 means you are firing too many requests against the Azure API.
So decrease them, e.g. don’t refresh dashboards so often, don’t run alert rules so often, …
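
To get a feel for how quickly refreshes add up, here is a rough sketch of the arithmetic. It assumes every panel issues exactly one query per refresh for every viewer and that nothing is cached, which is a simplification; the function name and the numbers are hypothetical and not taken from the setup above.

```python
def requests_per_hour(panels: int, refresh_seconds: int, viewers: int = 1) -> int:
    """Rough upper bound on data source queries issued per hour.

    Assumes one query per panel per refresh per viewer and no caching.
    """
    if refresh_seconds <= 0:
        raise ValueError("refresh_seconds must be positive")
    refreshes_per_hour = 3600 // refresh_seconds
    return panels * refreshes_per_hour * viewers


# Hypothetical load: 20 panels, refreshed every 30 s, open for 10 users.
busy = requests_per_hour(panels=20, refresh_seconds=30, viewers=10)

# Same dashboard with a 5-minute refresh interval.
relaxed = requests_per_hour(panels=20, refresh_seconds=300, viewers=10)

print(busy, relaxed)
```

Under these assumptions, stretching the refresh interval from 30 seconds to 5 minutes cuts the estimate by a factor of ten. Alert rules evaluate on their own schedule and would add to whatever the dashboards generate.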

Thanks for responding! :heart:
Sure, I understand that for the AzurePlus data source. However, for the query API that is used by the AzureMonitor data source, I don’t believe we’re hitting the documented service limits:
We’re also getting a 400 Invalid Request response following a 30-second timeout, rather than a 429. Just prior to refreshing the dashboard once, I can see a workspace request being made with the following response header:
X-Ms-Ratelimit-Remaining-Subscription-Reads: 11994
Is there a way I can confirm in more detail what is happening with the requests being made via the AzureMonitor Datasource?
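
For anyone checking the same thing, here is a minimal sketch of reading that header from a captured response and deciding whether the response looks like throttling. The helper names are hypothetical, and treating "429 or zero remaining reads" as the throttling signal is an assumption for illustration, not something taken from the Grafana or Azure documentation.

```python
from typing import Mapping, Optional

# Header name as quoted in the post above; compared case-insensitively.
REMAINING_READS_HEADER = "x-ms-ratelimit-remaining-subscription-reads"


def remaining_reads(headers: Mapping[str, str]) -> Optional[int]:
    """Return the remaining-reads counter, or None if absent or not a number."""
    for name, value in headers.items():
        if name.lower() == REMAINING_READS_HEADER:
            try:
                return int(value)
            except ValueError:
                return None
    return None


def looks_throttled(status_code: int, headers: Mapping[str, str]) -> bool:
    """Assumption: throttling shows up as a 429 or an exhausted read counter."""
    return status_code == 429 or remaining_reads(headers) == 0


# The response described above: a 400 with plenty of reads remaining.
captured = {"X-Ms-Ratelimit-Remaining-Subscription-Reads": "11994"}
print(remaining_reads(captured), looks_throttled(400, captured))
```

With the values from the post, this check does not flag the 400 as throttling, which matches the suspicion that the timeout has a different cause.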

I don’t know. Try the standard approach: use the debug log level and check the Grafana server logs.