Grafana Alert Rules and Alert Message Template for Slack

Hi team,
I'm using Grafana with Prometheus.

Scenario: monitoring server resources (CPU, RAM, disk space) and instance up/down status.
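
For context, all four rules below are built on standard node_exporter metrics scraped by Prometheus:

```
# node_exporter series used by the rules in this post
node_cpu_seconds_total{mode="idle"}   # cumulative idle CPU time
node_memory_MemAvailable_bytes        # available RAM
node_memory_MemTotal_bytes            # total RAM
node_filesystem_avail_bytes           # free bytes per filesystem
node_filesystem_size_bytes            # total bytes per filesystem
up                                    # scrape liveness (1 = up, 0 = down)
```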

Issues I'm facing:

  1. Sometimes the “Health” status of an alert rule shows NoData. Why does this happen? Is the alert rule itself incorrect, or is it something else?
  2. Could you share an alert message template that includes the alert message, instance IP, and job name, so the alerts are easy to understand for my scenario?
  3. Is there anything I need to change in the alert rules or the alert message template?
  4. Even after changing a rule's summary and description, the alerts still show the old values, rendered with "no value", like this:

```
[[FIRING x 3 | ] || DatasourceNoData|attachment](http://grafana.staged-by-discourse.com/alerting/list)
**Summary**: High RAM usage on <no value> (<no value>)
**Description**: RAM usage on <no value> (<no value>) is above 90%. ( For the last 1 Minute)
**Instance Details**: ``
**Job Details**: ``
Labels:
```
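
For reference, the summary and description annotations on the RAM rule are set roughly like this (reconstructed from the message above; the `$labels` references are my assumption about where the `<no value>` placeholders come from):

```
summary: High RAM usage on {{ $labels.instance }} ({{ $labels.job }})
description: RAM usage on {{ $labels.instance }} ({{ $labels.job }}) is above 90%. (For the last 1 minute)
```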

I have set up the following alert rules:

  1. CPU

(A) Query:

```
100 - (avg by (instance, job) (irate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)
```

(B) Reduce: Function = Last, Input = A, Mode = Strict

(C) Threshold: Input = B, Is Above = 70

  2. Disk Space

(A) Query:

```
100 - (avg(node_filesystem_avail_bytes{job=~".+", instance=~".+"}) by (instance) / avg(node_filesystem_size_bytes{job=~".+", instance=~".+"}) by (instance)) * 100 > 80
```

(B) Reduce: Function = Last, Input = A, Mode = Strict

(C) Threshold: Input = B, Is Above = 80

  3. RAM

(A) Query:

```
100 * (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) > 90
```

(B) Reduce: Function = Last, Input = A, Mode = Strict

(C) Threshold: Input = B, Is Above = 90

  4. Instance Down status (a simpler variant is sketched after this list)

(A) Query:

```
sum_over_time(up{job=~".+", instance=~".+"}[10s]) < count(up{job=~".+", instance=~".+"})
```

(B) Reduce: Function = Last, Input = A, Mode = Strict

(C) Threshold: Input = B, Is Above = 0

For all four rules, the Alert Condition is set to (C).
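
Since the instance-down expression in rule 4 is the one I'm least sure about, here is a simpler liveness query I'm considering instead (a sketch, not what is currently deployed):

```
# 'up' is 1 while Prometheus can scrape the target and 0 when it can't.
# The 'bool' modifier makes the comparison return 1 (down) or 0 (up) for
# every target, so the rule keeps returning data instead of going to NoData.
up{job=~".+", instance=~".+"} == bool 0
```

With the same Reduce step (Last, Strict) and a Threshold of Is Above 0, this would fire only for instances that are down.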

The notification channel is set to Slack.

For the alert message template, I'm using:

{{ define "alert_severity_prefix_emoji" -}}
	{{- if ne .Status "firing" -}}
		[OK]
	{{- else if eq .CommonLabels.severity "critical" -}}
		[CRITICAL]
	{{- else if eq .CommonLabels.severity "warning" -}}
		[WARNING]
	{{- end -}}
{{- end -}}

{{ define "slack.title" -}}
	{{ template "alert_severity_prefix_emoji" . }} 
	[{{- .Status | toUpper -}}{{- if eq .Status "firing" }} x {{ .Alerts.Firing | len -}}{{- end }}  | {{ .CommonLabels.env | toUpper -}} ] ||  {{ .CommonLabels.alertname -}}
{{- end -}}

{{- define "slack.text" -}}
{{- range .Alerts -}}
{{ if gt (len .Annotations) 0 }}
*Summary*: {{ .Annotations.summary}}
*Description*: {{ .Annotations.description }}
*Instance Details*: `{{ .Labels.instance }}`
*Job Details*: `{{ .Labels.job }}`
Labels: 
{{ range .Labels.SortedPairs }}{{ if or (eq .Name "env") (eq .Name "instance") (eq .Name "job") }}• {{ .Name }}: `{{ .Value }}`
{{ end }}{{ end }}
{{ end }}
{{ end }}
{{ end }}
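
The templates are referenced from the Slack contact point like this (field names as they appear in my contact point settings; a sketch, not a full config export):

```
Title:      {{ template "slack.title" . }}
Text Body:  {{ template "slack.text" . }}
```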