AWS CloudWatch Metric Alerts

  • What Grafana version and what operating system are you using? 11.6

  • What are you trying to achieve? Alert when ECS service CPU utilization goes too high for specific service

Hello,
My team is creating dashboards and alerts for our AWS ECS services using CloudWatch metrics. We have an ECS cluster with multiple services, and we would like to filter services by their name. Our service names are required to have the deployment date added as a suffix, so the name changes with every deployment. Unfortunately, CloudWatch Metrics Insights queries do not support regex or prefix-based string matches, only exact matches ( Query components and syntax in CloudWatch Metrics Insights - Amazon CloudWatch ).

We are able to get around this in our dashboard by adding a transformation after the query and filtering on the service name. It doesn’t appear that the same is possible with alerts. Has anyone faced this issue, and is there a workaround purely on the Grafana side? Our AWS setup is not very flexible, so the best solution would be one implemented in Grafana.

Thanks

CloudWatch metric search expressions do not support arbitrary regular expressions on metric dimensions. If service names are deployment-specific (for example, payment-service-2024-03-22), queries that depend on exact dimension values will break after deployments unless stable dimensions or CloudWatch SEARCH expressions are used.
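For illustration, here is a minimal sketch of how a CloudWatch SEARCH expression for this case might be assembled as a string. The helper name `build_search_expression`, the `payment` token, and the 300-second period are assumptions made for the example. SEARCH matches unquoted terms against tokens of the dimension value rather than doing a true prefix match, so the result may include other services that share the token, and whether a given Grafana alert rule accepts such an expression should be verified first.

```python
def build_search_expression(token: str, stat: str = "Average", period: int = 300) -> str:
    """Build a CloudWatch SEARCH expression string for ECS service CPU.

    `token` is matched against tokens of the ServiceName dimension value
    (an unquoted term is a partial match; a quoted term is an exact match).
    """
    # AWS/ECS publishes CPUUtilization per (ClusterName, ServiceName).
    schema = "{AWS/ECS,ClusterName,ServiceName}"
    term = f'MetricName="CPUUtilization" ServiceName={token}'
    return f"SEARCH('{schema} {term}', '{stat}', {period})"


# Hypothetical token taken from the example service name above.
print(build_search_expression("payment"))
```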

Use Prometheus/Grafana regex label matching
Instead of matching exact service names, use a regex matcher in your PromQL query:

max by (service) (
  ecs_service_cpu_utilization{service=~"payment-service-.*"}
)

service=~"payment-service-.*" → matches ALL payment services regardless of date suffix
max by (service) → returns the highest CPU per service, so each service gets its own alert

Alert rule setup:
Query: max by (service) (ecs_service_cpu_utilization{service=~"payment-service-.*"})
Expression: Threshold → IS ABOVE 80
Evaluate every 1m
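This rule only applies if the ECS metrics are available in a Prometheus-compatible data source; it cannot be run against the CloudWatch data source directly. As a rough sketch of what the rule evaluates, the Python below mimics the matcher and threshold. PromQL regex matchers are fully anchored, which `re.fullmatch` reproduces; the sample names and values are made up for the example.

```python
import re

# Same pattern as the =~ matcher; PromQL anchors it at both ends.
PATTERN = re.compile(r"payment-service-.*")
THRESHOLD = 80.0  # "IS ABOVE 80" from the alert rule


def firing_services(samples: dict[str, float]) -> list[str]:
    """Return the services that match the pattern and exceed the threshold."""
    return sorted(
        name
        for name, cpu in samples.items()
        if PATTERN.fullmatch(name) and cpu > THRESHOLD
    )


# Hypothetical CPU readings keyed by service name.
samples = {
    "payment-service-2024-03-22": 91.5,
    "payment-service-2024-03-15": 42.0,
    "old-payment-service-2024-03-22": 95.0,  # does not match: pattern is anchored
    "billing-service-2024-03-22": 99.0,
}
print(firing_services(samples))  # ['payment-service-2024-03-22']
```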

Am I able to do a join with a CSV using a regex/prefix expression?

How can I run PromQL queries against the CloudWatch data source? Is this possible in Grafana?

Yes, e.g. for a Lambda metric with a FunctionName dimension:

Hey,

Thanks for the prompt reply. I don’t quite understand: if I need to match against ServiceName “app-name-” and I want anything with the “app-name-*” prefix, how can I express that using a join against a CSV file?

This was an idea, not a copy-and-paste solution. It gives you an idea of how you can filter CloudWatch dimensions.
You need to build a B query that returns all of your desired ECS service names. It does not need to be a CSV; CSV was used just for simplicity. I guess you need something dynamic, not a static list. You need to be creative with that B query so that it fits your needs.
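One way such a dynamic B query could be fed is sketched below, under stated assumptions: a small helper takes ECS service ARNs, keeps the names with the wanted prefix, and emits CSV text with a single `ServiceName` column. The function name `service_names_csv`, the example ARNs, and the column name are hypothetical, and how the output reaches Grafana (for example, through a data source that can read CSV) is left open.

```python
import csv
import io


def service_names_csv(service_arns: list[str], prefix: str) -> str:
    """Return CSV text listing the service names that start with `prefix`."""
    # ECS service ARNs end in ".../<cluster-name>/<service-name>".
    names = sorted(arn.rsplit("/", 1)[-1] for arn in service_arns)
    matching = [name for name in names if name.startswith(prefix)]

    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["ServiceName"])  # header used as the join column
    writer.writerows([name] for name in matching)
    return buf.getvalue()


# Hypothetical ARNs; the account ID, region, and cluster name are made up.
arns = [
    "arn:aws:ecs:us-east-1:123456789012:service/my-cluster/app-name-2024-03-22",
    "arn:aws:ecs:us-east-1:123456789012:service/my-cluster/app-name-2024-03-15",
    "arn:aws:ecs:us-east-1:123456789012:service/my-cluster/other-app-2024-03-22",
]
print(service_names_csv(arns, "app-name-"))
```

In a real setup the ARN list could come from the ECS `list_services` API (for example via boto3), so the list follows each deployment without manual edits.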

Another idea would be to use a SQL expression.
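To show the kind of filter meant here, the sketch below runs a prefix match with `LIKE` against an in-memory SQLite table. SQLite is only a stand-in for Grafana's SQL expression engine, and the table name `A` (standing for the query's RefID), the column names, and the rows are assumptions for the example, so the actual columns produced by a CloudWatch query should be checked in the query inspector.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the frame returned by query A; column names are hypothetical.
conn.execute("CREATE TABLE A (ServiceName TEXT, cpu REAL)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [
        ("app-name-2024-03-22", 91.5),
        ("app-name-2024-03-15", 40.0),
        ("other-app-2024-03-22", 99.0),
    ],
)

# Prefix match: '%' matches any sequence of characters after "app-name-".
rows = conn.execute(
    "SELECT ServiceName, cpu FROM A "
    "WHERE ServiceName LIKE 'app-name-%' "
    "ORDER BY ServiceName"
).fetchall()
print(rows)  # [('app-name-2024-03-15', 40.0), ('app-name-2024-03-22', 91.5)]
```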