InvocationsFailedToBeSentToDlq data points are only emitted when errors occur; otherwise the metric has no data at all.
Each data point represents one or more lost messages, which may need to be investigated (depending on the application).
So I’d like the alert to remain in the “Alerting” state until I manually signal that the situation is resolved.
I can pause an alert and then unpause it, but that puts the alert into the “Unknown” state rather than “Ok”.
I can’t set “If no data or all values are null” to “Ok”, because that will clear the alert state before I’ve remediated the lost message(s).
I’m also asking AWS support how to handle this in CloudWatch Alarms.
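
On the CloudWatch side, one thing I may try is an alarm with TreatMissingData set to "ignore" (which is documented as keeping the current alarm state when there is no data), plus a manual set_alarm_state call once I’ve remediated. This is an untested sketch, not a confirmed solution. The namespace AWS/Events, the RuleName dimension, and the alarm and rule names are assumptions for illustration. Note that set_alarm_state is documented as a temporary override, so I’m not certain the alarm stays in OK afterwards.

```python
def build_alarm_params(alarm_name, rule_name):
    """Keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/Events",  # assumption: the metric comes from EventBridge
        "MetricName": "InvocationsFailedToBeSentToDlq",
        "Dimensions": [{"Name": "RuleName", "Value": rule_name}],  # assumption
        "Statistic": "Sum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # "ignore" keeps the current state while no data points arrive,
        # instead of moving to OK or INSUFFICIENT_DATA.
        "TreatMissingData": "ignore",
    }


def build_reset_params(alarm_name, reason):
    """Keyword arguments for CloudWatch set_alarm_state (the manual 'resolved' signal)."""
    return {"AlarmName": alarm_name, "StateValue": "OK", "StateReason": reason}


def create_alarm(alarm_name, rule_name):
    import boto3  # imported lazily so the builders above work without boto3

    boto3.client("cloudwatch").put_metric_alarm(
        **build_alarm_params(alarm_name, rule_name)
    )


def mark_resolved(alarm_name, reason):
    import boto3

    boto3.client("cloudwatch").set_alarm_state(
        **build_reset_params(alarm_name, reason)
    )
```

Usage would be create_alarm("dlq-send-failures", "my-rule") once, then mark_resolved("dlq-send-failures", "lost messages replayed") after each incident; both names are hypothetical.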
Note: community.grafana.com won’t let me use InvocationsFailedToBeSentToDlq in the title because “Title seems unclear, one or more words is very long?”