What Grafana version and what operating system are you using?
grafana 11.3.0 on Ubuntu 24.04
What are you trying to achieve?
- I want to have a single alert that works with multi-dimensional data. For example, various hosts use telegraf to send disk metrics into InfluxDB. I use InfluxQL to query for used_percent with GROUP BY host,path, then use a Reduce expression to convert the time series into something that can be used by Threshold in the alert.
How are you trying to achieve it?
When setting up the alert, use:
- A: Query: SELECT host,path,used_percent FROM "disk" WHERE $timeFilter GROUP BY host,path (formatted as time-series)
- B: Expression: Reduce: function=Last, input=A, Mode=strict (I tried different functions and modes and always got the same result)
- C: Expression: Threshold: input=B, is above=80
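To make the intent of the A, B, C chain concrete, here is a minimal Python sketch of what I expect each step to do per host/path series. It is not Grafana's implementation; the function names (reduce_last, threshold_above) and the sample values are made up for illustration, and it only models function=Last and "is above".

```python
from typing import Dict, List, Tuple

# A label set such as (("host", "foo"), ("path", "/path1")) identifies one series.
Labels = Tuple[Tuple[str, str], ...]
# One series is a list of (timestamp, value) points, oldest first.
Series = List[Tuple[int, float]]


def reduce_last(series_by_labels: Dict[Labels, Series]) -> Dict[Labels, float]:
    """Model of step B: collapse each series to its last value, keeping labels."""
    return {
        labels: points[-1][1]
        for labels, points in series_by_labels.items()
        if points  # skip empty series in this sketch
    }


def threshold_above(values: Dict[Labels, float], limit: float) -> Dict[Labels, int]:
    """Model of step C: 1 if the reduced value is above the limit, else 0."""
    return {labels: int(value > limit) for labels, value in values.items()}


# Hypothetical sample data standing in for the result of query A.
a = {
    (("host", "foo"), ("path", "/path1")): [(1, 70.0), (2, 85.5)],
    (("host", "foo"), ("path", "/path2")): [(1, 40.0), (2, 42.0)],
}

b = reduce_last(a)            # one float per host/path
c = threshold_above(b, 80.0)  # one 0/1 per host/path

for labels, firing in c.items():
    print(dict(labels), "last =", b[labels], "firing =", firing)
```

The point is that every host/path combination keeps its own label set through B and C, so a single rule yields one alert instance per combination.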
What happened?
In terms of the alerting, this works perfectly as seen with Preview and when alerts fire. However, whenever I click ‘Preview’ or when an alert fires, I receive many log messages of the form:
Nov 9 20:35:24 <host> grafana[18411]: logger=expr t=2024-11-09T20:35:24.311643923-06:00 level=warn msg="Ignoring InfluxDB data frame due to missing numeric fields"
Nov 9 20:35:24 <host> grafana[18411]: logger=expr t=2024-11-09T20:35:24.311665412-06:00 level=warn msg="Ignoring InfluxDB data frame due to missing numeric fields"
Nov 9 20:35:24 <host> grafana[18411]: logger=expr t=2024-11-09T20:35:24.311676404-06:00 level=warn msg="Ignoring InfluxDB data frame due to missing numeric fields"
What did you expect to happen?
For it to function exactly as it does now, but without the spurious messages.
Can you copy/paste the configuration(s) that you are having problems with?
I can reproduce this by creating a new dashboard cell (the Explorer doesn’t work since it doesn’t support expressions, and Alerting doesn’t have a way to inspect; it would be nice if it did).
First, set query ‘A’ like above: SELECT host,path,used_percent FROM "disk" WHERE $timeFilter GROUP BY host,path. Using the query inspector, I can see various series:
dist.used_percent { host: foo, path: /path1 } (with multiple rows of timestamp and value as float)
dist.used_percent { host: foo, path: /path2 } (with multiple rows of timestamp and value as float)
dist.used_percent { host: bar, path: /path1 } (with multiple rows of timestamp and value as float)
dist.used_percent { host: bar, path: /path2 } (with multiple rows of timestamp and value as float)
...
Only having ‘A’ doesn’t trigger the issue.
Next add an expression ‘B’ like above: function=Last, input=A, Mode=strict. This causes the spurious log messages. Using the query inspector, I now see the additional series for ‘B’:
dist.used_percent { host: foo, path: /path1 } (with multiple rows of timestamp and value as float)
dist.used_percent { host: foo, path: /path2 } (with multiple rows of timestamp and value as float)
dist.used_percent { host: bar, path: /path1 } (with multiple rows of timestamp and value as float)
dist.used_percent { host: bar, path: /path2 } (with multiple rows of timestamp and value as float)
...
B { host: foo, path: /path1 } (with single value as float (the last used_percent))
B { host: foo, path: /path2 } (with single value as float (the last used_percent))
B { host: bar, path: /path1 } (with single value as float (the last used_percent))
B { host: bar, path: /path2 } (with single value as float (the last used_percent))
B ...
I believe this is enough to show the issue. In my case, I also added an additional expression ‘C’, like above: Threshold: input=B, is above=80. This causes additional spurious messages for ‘C’ in addition to ones for ‘B’ (it seems to be the sum of the number of series from the results of the different expressions). Using the query inspector, I now see the additional series for ‘C’:
dist.used_percent { host: foo, path: /path1 } (with multiple rows of timestamp and value as float)
dist.used_percent { host: foo, path: /path2 } (with multiple rows of timestamp and value as float)
dist.used_percent { host: bar, path: /path1 } (with multiple rows of timestamp and value as float)
dist.used_percent { host: bar, path: /path2 } (with multiple rows of timestamp and value as float)
...
B { host: foo, path: /path1 } (with single value as float (the last used_percent))
B { host: foo, path: /path2 } (with single value as float (the last used_percent))
B { host: bar, path: /path1 } (with single value as float (the last used_percent))
B { host: bar, path: /path2 } (with single value as float (the last used_percent))
B ...
C { host: foo, path: /path1 } (with single value as int (0 or 1))
C { host: foo, path: /path2 } (with single value as int (0 or 1))
C { host: bar, path: /path1 } (with single value as int (0 or 1))
C { host: bar, path: /path2 } (with single value as int (0 or 1))
C ...
As mentioned, the functionality is doing exactly what I want. I don’t know why it is complaining about missing numeric fields, as they are present (I even tried ‘Drop non-numeric values’, to no avail, which makes sense since used_percent is consistently numeric in the query inspector).
Lastly, the schema of the InfluxDB measurement for disk is:
$ curl ... 'q=SHOW TAG KEYS FROM "disk"'
name,tags,tagKey
disk,,device
disk,,fstype
disk,,host
disk,,label
disk,,mode
disk,,path
$ curl ... 'q=SHOW FIELD KEYS FROM "disk"'
name,tags,fieldKey,fieldType
disk,,free,integer
disk,,inodes_free,integer
disk,,inodes_total,integer
disk,,inodes_used,integer
disk,,inodes_used_percent,float
disk,,total,integer
disk,,used,integer
disk,,used_percent,float
Thanks!