Unable to Send Alerts in Grafana 4.1.1

Hi, I am trying to set up alerting in Grafana and have configured it as per the documentation, but I am not able to receive any alerts.
Attached is what my Alerts tab looks like.


Following is query B:
SELECT count("request") FROM "TestTab" WHERE "response" = '4XX' AND $timeFilter GROUP BY time($interval), "response" fill(null)
I am seeing tons of 4XXs but am not able to get any alerts.

Also, I am not in a position to upgrade Grafana, so can anyone help me fix this issue in v4.1.1?

Thanks in advance.

The test rule says firing: false. You can expand the logs to see the response from your InfluxDB query.

@torkel,
Yes, but when I expand the test rule, it is empty.

It looks like your query is not returning anything. You can look in grafana.log for more; try enabling dev mode and info logging.

http://docs.grafana.org/alerting/rules/#troubleshooting

@torkel,
Below is the log:
t=2017-07-21T20:05:36+0000 lvl=info msg="Alert Rule returned no data" logger=alerting.resultHandler ruleId=13 name="ISSUES PRE Copy alert" changing state to=no_data

I am not able to see the data returned by the query in the log, but the data is being displayed in the Grafana console.

Alerting is executed on the server (unlike the graph queries, which are executed from your browser on your machine). Are you sure that the Grafana server has access to your InfluxDB? Or is the data source access set to direct in the data source settings?

If it is not one of these two problems, then you will have to turn on debugging as Torkel says.

Hi @daniellee,
Is there a way I can check what you asked about?
"Are you sure that the Grafana server has access to your InfluxDB? Or is the data source access set to direct in the data source settings?"

  1. Can you ping the InfluxDB server from your Grafana server? Maybe you have an IT department that has set up VPNs or firewalls that prevent access?
  2. There is a field in the data source settings page called Access. It can have one of two values, direct or proxy. It should probably be set to proxy. See docs.
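
A ping only shows that the host answers; it does not show that the InfluxDB HTTP port is reachable. As a rough sketch, the check below can be run on the Grafana server itself. It assumes an InfluxDB 1.x instance, whose /ping endpoint normally answers 204 No Content; the function name influxdb_reachable is made up for this example.

```python
import urllib.error
import urllib.request


def influxdb_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers on <base_url>/ping."""
    ping_url = base_url.rstrip("/") + "/ping"
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout) as response:
            # Assumption: InfluxDB 1.x replies 204 No Content to /ping.
            return response.status in (200, 204)
    except urllib.error.HTTPError:
        # The server answered, but with an error status (e.g. 401).
        # The network path works; look at authentication instead.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, firewall drop, etc.
        return False
```

Call it with the exact URL from the data source settings, for example influxdb_reachable("http://monitoring-influxdb:8086"). A False result from the Grafana server, while the same URL works from your own machine, points to a network or firewall problem rather than an alerting problem.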

Hi,

I have the same issue. I’m using Heapster for Kubernetes.
I can send a test notification but do not receive any alerts.

The error from Grafana:
lvl=info msg="Alert Rule returned no data" logger=alerting.evalHandler ruleId=1 name="Individual CPU Usage: $namespace $podname alert" changing state to=no_data

  1. Yes I can ping
  2. Access is set to proxy.

Do you have any other suggestions?

Have you tried Torkel’s suggestion and turned on debugging? From the docs:

If it’s not an error or for some reason the log does not say anything you can enable debug logging for some relevant components. This is done in Grafana’s ini config file.

This is the log:

t=2017-07-26T12:58:56+0000 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=1
t=2017-07-26T12:59:01+0000 lvl=dbug msg="Scheduler: Putting job on to exec queue" logger=alerting.scheduler name="Individual CPU Usage: $namespace $podname alert" id=1
t=2017-07-26T12:59:01+0000 lvl=dbug msg="Influxdb request" logger=tsdb.influxdb url="http://monitoring-influxdb:8086/query?db=k8s&epoch=s&q=SELECT+sum%28%22value%22%29+FROM+%22cpu%2Fusage_rate%22+WHERE+%22type%22+%3D+%27pod_container%27+AND+%22namespace_name%22+%3D~+%2F%24namespace%24%2F+AND+%22pod_name%22+%3D~+%2F%24podname%24%2F+AND+time+%3E+now%28%29+-+5m+GROUP+BY+time%28200ms%29%2C+%22container_name%22+fill%28null%29"
t=2017-07-26T12:59:01+0000 lvl=info msg="Alert Rule returned no data" logger=alerting.evalHandler ruleId=1 name="Individual CPU Usage: $namespace $podname alert" changing state to=no_data
t=2017-07-26T12:59:01+0000 lvl=dbug msg="Job Execution completed" logger=alerting.engine timeMs=5.371 alertId=1 name="Individual CPU Usage: $namespace $podname alert" firing=false
t=2017-07-26T12:59:06+0000 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=1
t=2017-07-26T12:59:16+0000 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=1
t=2017-07-26T12:59:26+0000 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=1
t=2017-07-26T12:59:36+0000 lvl=dbug msg="Scheduling update" logger=alerting.scheduler ruleCount=1
t=2017-07-26T12:59:40+0000 lvl=dbug msg=GetAlerts logger=alerting.extractor
t=2017-07-26T12:59:40+0000 lvl=dbug msg="Extracted alerts from dashboard" logger=alerting.extractor alertCount=1
t=2017-07-26T12:59:40+0000 lvl=dbug msg="Influxdb request" logger=tsdb.influxdb url="http://monitoring-influxdb:8086/query?db=k8s&epoch=s&q=SELECT+sum%28%22value%22%29+FROM+%22cpu%2Fusage_rate%22+WHERE+%22type%22+%3D+%27pod_container%27+AND+%22namespace_name%22+%3D~+%2F%24namespace%24%2F+AND+%22pod_name%22+%3D~+%2F%24podname%24%2F+AND+time+%3E+now%28%29+-+5m+GROUP+BY+time%28200ms%29%2C+%22container_name%22+fill%28null%29"
t=2017-07-26T12:59:40+0000 lvl=info msg="Alert Rule returned no data" logger=alerting.evalHandler ruleId=0 name="Individual CPU Usage: $namespace $podname alert" changing state to=no_data

For the Docker image, you use environment variables. To turn on debugging, it would be "GF_LOG_LEVEL=debug".

All options defined in conf/grafana.ini can be overridden using environment variables by using the syntax GF_<SectionName>_<KeyName>. For example:

docker run \
  -d \
  -p 3000:3000 \
  --name=grafana \
  -e "GF_SERVER_ROOT_URL=http://grafana.server.name" \
  -e "GF_SECURITY_ADMIN_PASSWORD=secret" \
  grafana/grafana
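
To make the naming rule concrete, here is a small sketch that derives the environment variable name from an ini section and key, following the GF_<SectionName>_<KeyName> pattern from the Grafana docs. The helper grafana_env_var is hypothetical and not part of Grafana; it only illustrates how "level" in the [log] section becomes GF_LOG_LEVEL.

```python
def grafana_env_var(section: str, key: str) -> str:
    """Build the GF_<SectionName>_<KeyName> override name for an ini option."""
    # Dots in section names (e.g. "auth.google") become underscores.
    normalized_section = section.replace(".", "_")
    return f"GF_{normalized_section}_{key}".upper()


print(grafana_env_var("log", "level"))                # GF_LOG_LEVEL
print(grafana_env_var("security", "admin_password"))  # GF_SECURITY_ADMIN_PASSWORD
```

The resulting name is what goes after -e in the docker run command, e.g. -e "GF_LOG_LEVEL=debug".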

I tried running Docker containers for two Grafana versions: 4.1.1 (similar to my current production version) and the latest, 4.4.1.
Both point to the local grafana/grafana.db folder.
With debug-level logging enabled, I see the following differences in the logs.

grafana:4.1.1


It looks like in 4.1.1 it's trying to access InfluxDB to get the metrics and getting a 400.

grafana:4.4.1

@torkel @daniellee, is this a known issue in version 4.1.1 of Grafana?

No, there are no reported issues like this. You seem to have discovered the problem yourself by looking at the logs: grafana-server does not seem to be able to query InfluxDB when evaluating the alert, and the request returns 400. That probably means your query is invalid. Looking above, I can see that you are using template variables in your alert query; this is not supported.
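
One way to see the template variable problem is to decode the "Influxdb request" URL from the debug log earlier in the thread and look for $name tokens that reached InfluxDB unexpanded. The sketch below does that with the standard library only. The helper names are made up, and the list of built-in macros is an assumption based on the log, where $timeFilter and $interval already appear expanded as "time > now() - 5m" and "time(200ms)".

```python
import re
from typing import List
from urllib.parse import parse_qs, urlsplit

# Assumption: these macros are expanded by the Grafana backend.
BUILT_IN_MACROS = {"$timeFilter", "$interval", "$__interval"}

# Request URL copied from the debug log in this thread.
LOGGED_URL = (
    "http://monitoring-influxdb:8086/query?db=k8s&epoch=s&q="
    "SELECT+sum%28%22value%22%29+FROM+%22cpu%2Fusage_rate%22"
    "+WHERE+%22type%22+%3D+%27pod_container%27"
    "+AND+%22namespace_name%22+%3D~+%2F%24namespace%24%2F"
    "+AND+%22pod_name%22+%3D~+%2F%24podname%24%2F"
    "+AND+time+%3E+now%28%29+-+5m"
    "+GROUP+BY+time%28200ms%29%2C+%22container_name%22+fill%28null%29"
)


def query_from_logged_url(url: str) -> str:
    """Extract the decoded InfluxQL statement from a logged request URL."""
    params = parse_qs(urlsplit(url).query)
    return params["q"][0]


def unresolved_variables(query: str) -> List[str]:
    """List $name tokens left in the query, ignoring built-in macros."""
    found = re.findall(r"\$\w+", query)
    return sorted(set(found) - BUILT_IN_MACROS)


print(unresolved_variables(query_from_logged_url(LOGGED_URL)))
# ['$namespace', '$podname']
```

Both variables were sent to InfluxDB literally, so the regex filters match nothing and the rule ends up in the no_data state.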

@torkel, this is my query below:


Please let me know if there is anything wrong.
Below are my alert tab details:

I didn’t understand what you meant by template variables in the alert query. Could you show what they are?

Thanks

@ash007 I think Torkel got confused because you jumped into the middle of a thread started by someone else. Anyway, alerting executes queries on the backend (when you execute a query in the Grafana UI, it is sent from your machine and not the backend), and there seems to be a problem connecting to InfluxDB from the Grafana server. It could be a firewall or authentication issue.
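
To reproduce what the backend does, you can build the same kind of /query URL that appears in the debug log and send it from the Grafana server (with curl or any HTTP client), then check whether the response contains any series. The sketch below covers the two pure steps. It assumes the InfluxDB 1.x JSON response layout ({"results": [{"series": [...]}]}), and both function names are invented for this example.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, database: str, statement: str) -> str:
    """Build an InfluxDB 1.x /query URL like the one in the debug log."""
    params = urlencode({"db": database, "epoch": "s", "q": statement})
    return f"{base_url.rstrip('/')}/query?{params}"


def has_series(response_body: str) -> bool:
    """Return True if any result in the JSON response contains series."""
    payload = json.loads(response_body)
    return any(result.get("series") for result in payload.get("results", []))


# A response without "series" is what an empty result looks like.
print(has_series('{"results": [{"statement_id": 0}]}'))  # False
```

If the request fails or comes back without series when sent from the Grafana server, but the same query shows data in the browser, the difference is in connectivity, credentials, or the query text the backend actually sends.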

@eddiem21 does your query contain template variables? Alerting does not support queries with template variables.

Hi @daniellee,

I didn’t create the queries; they came with Grafana. This is the query:

@eddiem21 There is a big red error message explaining the problem :)

Unfortunately, it is just too complex (difficult to do good UX for it, difficult to know what to do technically) to support template variables for alerting right now. If you want to alert on this query, you will have to remove the template variable and hard-code the value. Which data source are you using? You can often replace a template variable with an expression (globs or regex), e.g. backend.servers.server_{dev,staging}
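
As an illustration of hard-coding, the sketch below rewrites a query string by replacing each $name template variable with a fixed value or regex and leaving the time macros alone. It is only a helper for preparing the text you paste into the alert query, not something Grafana does for you. The function name, the macro list, and the example value kube-system are all assumptions, and it handles only the $name syntax.

```python
import re
from typing import Dict

# Assumption: these macros are expanded by the Grafana backend, so keep them.
BUILT_IN_MACROS = {"timeFilter", "interval", "__interval"}


def hardcode_variables(query: str, values: Dict[str, str]) -> str:
    """Replace $name template variables with hard-coded values."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name in BUILT_IN_MACROS:
            return match.group(0)  # leave $timeFilter, $interval untouched
        if name not in values:
            raise KeyError(f"no hard-coded value supplied for ${name}")
        return values[name]

    return re.sub(r"\$(\w+)", substitute, query)


templated = (
    'SELECT sum("value") FROM "cpu/usage_rate" '
    'WHERE "namespace_name" =~ /^$namespace$/ AND $timeFilter '
    "GROUP BY time($interval)"
)
# "kube-system" is a hypothetical namespace; a regex such as ".*" also works.
print(hardcode_variables(templated, {"namespace": "kube-system"}))
```

Raising on a missing value is deliberate: a variable that is silently left in place would reproduce the original no_data problem.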