False positive for absent_over_time Alert

Hi There,

I’m currently trying to set up alerting based on Loki logs.
I have a cronjob that creates database backups every day. The backup agent reports success in the logs.

My goal is to be notified if this message is missing, so I came up with this rule:

  - name: app-db-backup
    rules:
      - alert: appDbNoBackup
        expr: absent_over_time(({app="percona-server-mongodb", container="backup-agent", pod=~"app-db.*"} != "[pitr]" |= "backup finished")[24h])
        for: 2h
        labels:
          severity: critical
        annotations:
          summary: No Backups have been finished within 24h

Running this query in Grafana generates this graph:

Running the same query without absent_over_time returns these log lines:

2021-11-21 19:00:45	
{"log":"2021-11-21T18:00:44.000+0000 I [backup/2021-11-21T18:00:15Z] backup finished\n","stream":"stderr","time":"2021-11-21T18:00:44.881235862Z"}
2021-11-20 19:00:44	
{"log":"2021-11-20T18:00:44.000+0000 I [backup/2021-11-20T18:00:15Z] backup finished\n","stream":"stderr","time":"2021-11-20T18:00:44.84431218Z"}
2021-11-19 19:00:42	
{"log":"2021-11-19T18:00:42.000+0000 I [backup/2021-11-19T18:00:14Z] backup finished\n","stream":"stderr","time":"2021-11-19T18:00:42.845762133Z"}
2021-11-18 19:00:45	
{"log":"2021-11-18T18:00:44.000+0000 I [backup/2021-11-18T18:00:14Z] backup finished\n","stream":"stderr","time":"2021-11-18T18:00:44.872780481Z"}
2021-11-18 09:01:34	
{"log":"2021-11-18T08:01:34.000+0000 I [backup/2021-11-18T08:01:04Z] backup finished\n","stream":"stderr","time":"2021-11-18T08:01:34.580555628Z"}
2021-11-17 19:00:43	
{"log":"2021-11-17T18:00:43.000+0000 I [backup/2021-11-17T18:00:14Z] backup finished\n","stream":"stderr","time":"2021-11-17T18:00:43.792733085Z"}
2021-11-17 16:20:48	
{"log":"2021-11-17T15:20:48.000+0000 I [backup/2021-11-17T15:20:18Z] backup finished\n","stream":"stderr","time":"2021-11-17T15:20:48.061940266Z"}

However, the alert triggers 3.5h after the last successful backup and stays active until the next backup finishes.
In other words, the alert fires every day from 21:30 until 18:00 the next day.

Any idea why this is happening? The range for absent_over_time is set to 24h, so the alert should not trigger as long as a matching message appears within each 24h window. Looking at the graph, I can’t see any indication of why the alert behaves this way.
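
To make that expectation concrete, here is a minimal Python sketch of how I assume the rule should behave. It is a simplified model, not Loki's actual evaluation engine: the timestamps are copied from the log lines above, while the one-minute evaluation interval, the half-open window (t - 24h, t], and the helper names is_absent and firing_times are my own assumptions for illustration.

```python
from datetime import datetime, timedelta

# "backup finished" timestamps copied from the log lines above (UTC).
BACKUPS = [
    datetime(2021, 11, 17, 15, 20, 48),
    datetime(2021, 11, 17, 18, 0, 43),
    datetime(2021, 11, 18, 8, 1, 34),
    datetime(2021, 11, 18, 18, 0, 44),
    datetime(2021, 11, 19, 18, 0, 42),
    datetime(2021, 11, 20, 18, 0, 44),
    datetime(2021, 11, 21, 18, 0, 44),
]

WINDOW = timedelta(hours=24)  # range passed to absent_over_time
FOR = timedelta(hours=2)      # the rule's "for" duration
STEP = timedelta(minutes=1)   # assumed evaluation interval (not from the rule)


def is_absent(at, events, window=WINDOW):
    """Return True if no event falls inside the window (at - window, at]."""
    return not any(at - window < e <= at for e in events)


def firing_times(events, start, end, window=WINDOW, hold=FOR, step=STEP):
    """Return evaluation times at which absence has lasted at least `hold`."""
    firing = []
    pending_since = None
    t = start
    while t <= end:
        if is_absent(t, events, window):
            if pending_since is None:
                pending_since = t  # condition just became true: pending
            if t - pending_since >= hold:
                firing.append(t)   # pending long enough: firing
        else:
            pending_since = None   # a match is in the window: reset
        t += step
    return firing


start = datetime(2021, 11, 18, 18, 1, 0)
end = datetime(2021, 11, 21, 18, 1, 0)
fired = firing_times(BACKUPS, start, end)
print(fired)
```

Under these assumptions the model never reaches the firing state. The only gaps are a few seconds long (for example, the backup on 2021-11-20 finished two seconds later than the one on 2021-11-19), and the 2h "for" duration should absorb those. That is why the daily firing from 21:30 onwards looks like a false positive to me.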

Any help would be greatly appreciated.
-Markus
