Calculating number of received and missing data points for time interval

celeste · December 4, 2024, 9:57pm

What Grafana version and what operating system are you using?
Grafana cloud (steady) (This is what appears when I hover over the “?” icon)
MacOS Sonoma 14.5
What are you trying to achieve?
Calculate and make a histogram and table of %data received. Our devices sometimes go offline and I want to know how much they are online.
How are you trying to achieve it?
Using flux “exists”
What happened?
My count_nan isn’t returning the number of data points that don’t exist but my count_exists is working. If I divide count_exists by the number of points I expect, it correctly give me a percentage. But count_nan is 0.
What did you expect to happen?
I expect count_nan to equal missed points.
Can you copy/paste the configuration(s) that you are having problems with?

|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._field == “temp”) //

// Calculate counts of non-NaN and NaN values
|> reduce(
identity: {count_exists: 0, count_nan: 0}, // Initialize as integers
fn: (r, accumulator) => ({
count_exists: if exists r._value then accumulator.count_exists + 1 else accumulator.count_exists,
count_nan: if exists r._value then accumulator.count_nan else accumulator.count_nan + 1 // Increment count_nan if _value is missing
})
)

// Calculate uptime percentage and add device_id as _time
|> map(fn: (r) => ({
r with
test: float(v: r.count_exists)/ (float(v: r.count_exists)+float(v: r.count_nan)),
_time: r.device_id // Replace _time with device_id for display
}))
|> keep(columns: [“_time”, “_field”, “_value”, “test”]

// Pivot data using uptime as value
|> pivot(rowKey: [“_time”], columnKey: [“_field”], valueColumn: “test”)
|> rename(columns: {_time: “device_id”, temp: “uptime”})
|> yield(name: “uptime”)

Did you receive any errors in the Grafana UI or in related logs? If so, please tell us exactly what they were.
There are no errors for this. Just no data shows.
Did you follow any online instructions? If so, what is the URL?
I’ve just been using the flux documentation.

grant2 · December 5, 2024, 2:11am

Hi @celeste and welcome to the Grafana forum.

Your question might better be suited for the InfluxDB forum, but just to offer some potentially helpful advice…

Are you sure you want a histogram to display the number of received and missing data points for a given time interval? Isn’t the result just going to be two integers (e.g 4582 received, 535 missing)?

Are your values always 1 or 0? If yes, then this solution should work:

or maybe this one?

Either of the above might even be adaptable to include NaN values.

@ldrascic @fercasjr @ebabeshko are all amazingly talented and might be able to help out here.

celeste · December 5, 2024, 6:39pm

Hi Grant! Thank you. I see, I will check out Influx forum too.

My devices are connected to wifi and sometimes data won’t come in because of a wifi problem. I want to understand how many data points I have vs. expected. For example, if I expect 6 data points every minute and only get 5, that would be 83%.
I’d like a histogram as seen as the example screenshot.
The x-axis is between 0 and 100% and I can see how all my devices are doing on connectivity.
(Eventually, I want to be able to click on the bar and gather the device_ids in each segment to trouble shoot the problematic low-connected ones).

What do you think about that? I will check out your resources too.

celeste · December 5, 2024, 10:46pm

I figured it out. About to post the code - I know our sampling rate and how many data points we should have so I just used that to calculate loss of data.

celeste · December 5, 2024, 10:50pm

collectionInterval = 60s
pointsPerMinute = float(v: 1) // Ensure this is a float for consistency

data = from(bucket: “bucket”)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._field == “data”)

// Calculate counts of non-NaN values and expected points
|> reduce(
identity: {count_exists: 0.0, count_expected: 0.0}, // Initialize fields as floats
fn: (r, accumulator) => ({
count_exists: if exists r._value then accumulator.count_exists + 1.0 else accumulator.count_exists,
count_expected: float(v: (uint(v: v.timeRangeStop) - uint(v: v.timeRangeStart))) / 60000000000.0 * pointsPerMinute // Total expected points
})
)

// Calculate uptime percentage and add device_id as _time
|> map(fn: (r) => ({
r with
test: (float(v: r.count_exists) / float(v: r.count_expected)) * 100.0, // Direct float division
//test: r.count_exists,
_time: r.device_id // Replace _time with device_id for display
}))
|> keep(columns: [“_time”, “_field”, “test”]) // Keep only relevant columns

// Pivot data using uptime as value
|> pivot(rowKey: [“_time”], columnKey: [“_field”], valueColumn: “test”)
|> rename(columns: {_time: “device_id”, data: “uptime”})

|> yield(name: “uptime”)

*Pivot is still a little confusing to me but this works

Topic		Replies	Views
How to do count queries with InfluxDB/Grafana/FLUX? Grafana	0	2492	August 25, 2020
Influxdb Data availability dashboard InfluxDB flux	1	519	July 13, 2023
Calculate the difference between data timestamp and recorded timestamp Dashboards	2	464	March 12, 2023
Confusion with InfluxDB/Grafana Data OSS Support	4	1059	July 29, 2020
TOTAL sometimes NaN due to 2 different timestamps(?) how to make sure always calculated InfluxDB	2	906	December 22, 2021

Calculating number of received and missing data points for time interval

Related topics