I’m using Grafana v11.1.0 to monitor measurements from some LoRaWAN IoT sensors.
All works fine.
Now I want to monitor the operation of the sensors themselves.
Basically the sensors send data at a fixed interval (e.g. a new data package every 10 minutes). I would like to analyze the data packages actually received.
Something like data packages per hour, or even better: the percentage of actually received vs. expected data packages.
For example (one datapoint every 10 minutes):
when I have 6 datapoints in the last hour → 100%
when I have 3 datapoints in the last hour → 50%
I thought about the count function, but I think it counts the data points in the selected time frame. Hence some calculation is still needed (e.g. when the selected time frame is 7 days and I have 748 data points: is that good or bad?)
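(To put numbers on that: at one data package per 10 minutes, 7 days should yield 7 × 24 × 6 = 1008 packages, so 748 received would be roughly 74%. That’s exactly the calculation I would like the dashboard to do for me.)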
I’m not even sure if Grafana is the right place to do this analysis.
The data flow is Sensor → TTN → Telegraf → InfluxDB → Grafana.
Any thoughts and links to related documentation (I tried to find something, but without success) would be appreciated.
I’m using InfluxQL.
Since I’m using InfluxDB v2, AFAIK SQL is not available. And since Flux seems to be on its way to discontinuation, in my mind InfluxQL is currently the only reasonable option.
Flux is much more functional than InfluxQL. Even if it won’t be available in InfluxDB 3.0 etc., I would definitely go for it in InfluxDB 2. InfluxQL is missing lots of functions which are available in Flux.
Can you share some input data (e.g. as CSV) and the expected result that you want to achieve?
Actually I’m new to both Flux and InfluxQL (and more or less to SQL as well), hence my thinking was: if I have to learn something new anyhow, it should last as long as possible. I was not aware that Flux is more versatile than InfluxQL (and I still don’t know whether I would need more than InfluxQL). And now that @grant2 mentioned the Flux plugin, the decision is even more complicated.
To make it short: if someone can point out a solution using InfluxQL, I would be happy; if it requires Flux, that’s fine too!
Back to my question.
I would like to answer that on two levels, since I’m not sure whether there is a better implementation of my objective than the one I have in mind.
On a high level, I want to monitor how reliably a sensor sends its data at the defined interval. I want some kind of warning if a sensor did not send all expected data during, say, the last hour. And I want an overview over a longer time period to recognize trends (e.g. always between 8:00 and 10:00 in the morning I lose some data).
My solution would be to look back over the last hour at fixed intervals (related to the sending interval of the respective sensor) and count the received data packages. In the attached example the sensor sends its readings every 5 minutes, so I expect 12 data packages per hour. I would like to look back at the last hour every 5 minutes and count the received data packages. If the count is 12, the “receiving rate” is 100%; if it is 6, the rate is 50%. A rough sketch of what I mean is below.
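Something like this Flux sketch is what I have in mind (bucket, measurement, and field names are just placeholders for my setup; as far as I understand, InfluxQL’s GROUP BY time() only produces non-overlapping buckets, so the overlapping one-hour look-back seems to need Flux’s aggregateWindow() with its period parameter):

```
from(bucket: "sensors")  // placeholder bucket name
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "lorawan" and r._field == "temperature")  // placeholder names
  // every 5 minutes, count the points received during the preceding hour
  |> aggregateWindow(every: 5m, period: 1h, fn: count, createEmpty: true)
  // 12 packages are expected per hour at a 5-minute sending interval
  |> map(fn: (r) => ({r with _value: float(v: r._value) / 12.0 * 100.0}))
```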
Attached CSV file (edit: how can I attach a file? → I will post first and then look for a way to share the file).
It’s real data from a temperature sensor.
I removed some columns (like device IDs); I think that doesn’t matter for the example.
I deleted some received packages to simulate unreliable reception of sensor data.
The outcome should be something like this. I only calculated a value for each hour; every 5 minutes would have been too much work manually.
Exactly the dilemma we found ourselves in, so we are ditching the whole Influx product for some other TSDB:
Flux is amazing but is being abandoned.
InfluxQL has very limited functionality compared to Flux.
There is no migration path from Flux.
I would say that’s correct. I wouldn’t calculate it over the whole selected range. I would calculate it (hardcoded logic) on hourly buckets, so the result will be a timeseries and you can also see when the problem happened; that way you don’t need to care about the selected time range and its effect on the calculation logic.
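For illustration, the hourly-bucket version of that idea might look like this in Flux (again with placeholder names; the hardcoded part is the 6 expected points per hour for a 10-minute sending interval):

```
from(bucket: "sensors")  // placeholder bucket name
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "lorawan" and r._field == "temperature")  // placeholder names
  |> aggregateWindow(every: 1h, fn: count, createEmpty: true)  // one non-overlapping bucket per hour
  // hardcoded: 6 packages expected per hour at a 10-minute interval
  |> map(fn: (r) => ({r with _value: float(v: r._value) / 6.0 * 100.0}))
```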
Or, if you don’t want to hardcode the logic on an hourly basis, I would use the time variables ($__from/$__to) to calculate the duration of the selected time range and from that how many datapoints should be expected in it (one per 10 min), e.g. as a meta query: ($__to - $__from) / 1000 / 600 (the variables are epoch milliseconds, so /1000 gives seconds and /600 gives the number of 10-minute intervals).
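(For example, a 24-hour dashboard window gives $__to - $__from = 86,400,000 ms → /1000 = 86,400 s → /600 = 144 expected datapoints at a 10-minute interval.)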
You are grouping per hour, so you always expect 6 points per hour. This doesn’t scale very well when you select wider time ranges, e.g. 1 year. And users are creative, so they will do that and then complain that it’s very slow.
I did something similar in the past: a calculation of the overall average “refresh rate” of the data for each item over the last 24 hours. The items should have rates of 1, 5, 10, 60, … minutes, so some weird value (e.g. 9.93 min) indicates an issue:
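A rough Flux sketch of that kind of calculation (again assuming placeholder bucket/measurement/field names) could use elapsed() to get the gap between consecutive points and then average it:

```
from(bucket: "sensors")  // placeholder bucket name
  |> range(start: -24h)
  |> filter(fn: (r) => r._measurement == "lorawan" and r._field == "temperature")  // placeholder names
  |> elapsed(unit: 1s)  // seconds since the previous point
  |> map(fn: (r) => ({r with _value: float(v: r.elapsed) / 60.0}))  // gap in minutes
  |> mean()  // average "refresh rate" over the last 24 hours
```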
I fully understand that. But it’s still not clear to me what your recommendation about the “hardcode logic” means in practice; that term doesn’t mean anything to me.
Maybe it’s me missing some experience with this kind of calculation …