[Feature Request] Unpausing an Alert causes unwanted pollution in State History

Grafana 4.2.0, mainly using InfluxDB as the data source.

Unpausing an Alert causes unwanted pollution in State History.

For maintenance purposes, we restart InfluxDB daily.
We have scripted this: pause all Grafana alerts, restart the influxdb service, then unpause most of the Grafana alerts (see below).

As a result, the State History of each and every alert gets a daily ‘OK’ entry, which is pollution to us. We only want to see real changes of state in the State History. OK to OK is no change.

Bash script:

python msmetrics_grafana_pause_all_notifications.py msmetrics_grafana_pause_all_notifications__influxdb-live-01.json pause
systemctl restart influxdb --full --no-pager
sleep 5 # Just to give some room
SYSTEMD_COLORS=0 systemctl status influxdb --full --no-pager --lines 0
python msmetrics_grafana_pause_all_notifications.py msmetrics_grafana_pause_all_notifications__influxdb-live-01.json unpause

Python script (stripped all plumbing):

  def PauseNotifications(self,pause_it=True):
    for alert in alerts:
      if (not pause_it) and re.match(pattern=self.tConfig['pause_but_no_unpause'], string=alert['name']):
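
Since the snippet above is stripped, here is a minimal self-contained sketch of what the filter appears to do: on pause, every alert is selected; on unpause, alerts whose name matches the `pause_but_no_unpause` pattern are left paused. The function name `select_alerts` and the sample alert names are hypothetical, and the actual call to Grafana is left out.

```python
import re


def select_alerts(alerts, pause_it, keep_paused_pattern):
    """Return the alerts whose paused flag should be changed.

    alerts              -- list of dicts, each with at least a 'name' key
    pause_it            -- True when pausing, False when unpausing
    keep_paused_pattern -- regex; matching alerts are never unpaused
                           (assumed meaning of 'pause_but_no_unpause')
    """
    if pause_it:
        # Pausing applies to every alert.
        return list(alerts)
    # Unpausing skips alerts whose name matches the pattern.
    return [a for a in alerts if not re.match(keep_paused_pattern, a["name"])]


if __name__ == "__main__":
    # Hypothetical alert names, for illustration only.
    sample = [{"name": "influxdb disk usage"}, {"name": "maintenance-only check"}]
    for alert in select_alerts(sample, pause_it=False, keep_paused_pattern=r"maintenance"):
        print("would unpause:", alert["name"])
```

Keeping the selection separate from the HTTP call makes it easy to check which alerts would be touched before anything is sent to Grafana.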

Any solution would work for us, for example:

  1. make it a Grafana config option, like alertstate_nonewstate_at_unpause, or
  2. add it to the REST API: pause.create(paused=False,record_only_pure_state_changes=True), or
  3. make it a non-configurable default, or
  4. add a "More states" toggle to the Alert State view (so storing it like it is now, but only changing the display)
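
To illustrate option 4, here is a rough sketch of a display-only filter: the stored history stays as it is, but entries that are not a real change (paused entries, and OK following OK) are hidden. The entry layout (dicts with a `newState` key) and the function name `real_state_changes` are assumptions for this example, not Grafana's actual implementation.

```python
def real_state_changes(history, ignored_states=("paused",)):
    """Return only the history entries that are a genuine state change.

    history        -- list of dicts ordered oldest first, each with a
                      'newState' key (assumed layout)
    ignored_states -- states that are dropped before comparing neighbours
    """
    kept = []
    last_state = None
    for entry in history:
        state = entry["newState"]
        if state in ignored_states:
            # Pausing is bookkeeping, not a change the user cares about.
            continue
        if state == last_state:
            # Same state as the last visible entry, e.g. OK to OK.
            continue
        kept.append(entry)
        last_state = state
    return kept


if __name__ == "__main__":
    # Hypothetical history: a daily pause/unpause cycle around one real alert.
    stored = [{"newState": s} for s in
              ["ok", "paused", "ok", "alerting", "paused", "alerting", "ok"]]
    print([e["newState"] for e in real_state_changes(stored)])
```

With the sample history above, only the transitions ok, alerting, ok would remain visible.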

Cheers, TW

Pausing changes the state. To fix this you will have to wait for the alerting silence feature (hopefully implemented sometime after summer).

Hi Torkel,

Whether pausing "changes the state" depends on the perspective: from a technical perspective it does, from an end user perspective it does not.
Also, your answer doesn’t address proposed solution no. 4.

But we’ll wait for the alerting silence feature.
I understand from your reply that our scenario (above: bash + python + InfluxDB restart + Grafana view) will be covered by this new feature, provided we use the future ‘silencing’ API instead of the current ‘pausing’ API. Is this correct?

Cheers, TW