I’ve been testing Grafana alerts, and while I don’t find them ideal, they’re generally okay-ish. However, I noticed that under certain conditions an alert rule completely misses log lines it should be alerting on. For testing purposes, I’m running a small script that produces a Windows event log line every 15 seconds. I set the evaluation interval to the minimum possible, 10 seconds, and the pending period to 0 seconds. This is my query:
count_over_time({hostname="windows"} | json | source = `CustomScriptSource` [10s])
With an expression: Input A is above 0.
What happens is the alert rule detects only every second instance of the log line produced by the script. So instead of firing 4 times in a minute, it fires twice. I checked the logs through Explore with the same query (minus the count), and found the logs to be recorded properly.
Am I doing something wrong here, is this a limitation of Grafana Alerts, or is it a bug?