Max function (Flux) returns incorrect value, always 74.4

Hello all, I am trying to create a simple graph showing the maximum temperature value for each day.

Here is my code block:

from(bucket: "Vantage")
  |> range(start: -360d, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Vantage")
  |> filter(fn: (r) => r["_field"] == "tempout")
  |> filter(fn: (r) => r["identity"] == "WiFiLogger")
  |> aggregateWindow(every: 24h, fn: max, createEmpty: false)
  |> yield(name: "max")

However, the chart looks like this, with seemingly random spikes whose value is always 74.4:

The Data Explorer in InfluxDB shows the same issue; here it is with the last() function graphed alongside for comparison:

As you can see, those 74.4 values are not in the original bucket, and no matter what window size I give the aggregateWindow() function (24h… 5m… 2h… whatever), there is always a data point with a value of 74.4 in the results (only with the max() function, not with the last() function).
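For reference, that side-by-side view can be reproduced with a two-branch query along these lines (same filters as above):

data = from(bucket: "Vantage")
  |> range(start: -360d, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Vantage")
  |> filter(fn: (r) => r["_field"] == "tempout")
  |> filter(fn: (r) => r["identity"] == "WiFiLogger")

// daily max: this is where the 74.4 spikes appear
data
  |> aggregateWindow(every: 24h, fn: max, createEmpty: false)
  |> yield(name: "max")

// daily last: no 74.4 spikes here
data
  |> aggregateWindow(every: 24h, fn: last, createEmpty: false)
  |> yield(name: "last")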

I’ve tried changing the timezone of the function and offsetting the query’s time period; neither one works.

Is there anything I can do in grafana to eliminate these errant spikes? Am I building this query incorrectly? Why does the max() function return incorrect values?

Thank you for the help!

Hi @nar1117

I have to admit, this is perplexing. The 74.4 value has to be coming from somewhere in your data, even though it’s not appearing in the “last” query that you showed.

What if you query a specific time frame (say, 1 day) and simply apply the max() function in InfluxDB? Does it show 74.4? When you expand to 2 days, does it still show 74.4?

EDIT: In the above, do not use the aggregateWindow() function at all.
EDIT2: In InfluxDB, toggle the view so it displays 1 row showing the timestamp and max value (instead of a graph with a single point).
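For the query itself, something along these lines is what I mean, as a sketch reusing the bucket, measurement, and field from your query above (then widen the range to 2 days for the second test):

from(bucket: "Vantage")
  |> range(start: -1d)
  |> filter(fn: (r) => r["_measurement"] == "Vantage")
  |> filter(fn: (r) => r["_field"] == "tempout")
  |> filter(fn: (r) => r["identity"] == "WiFiLogger")
  // no aggregateWindow() at all, just the single max over the whole range
  |> max()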

I’m sure there is a logical explanation… it just may require some digging.

Hey, thanks for taking a look!

I tried what you suggested, and the query returns a single row with the 74.4 value:

The smallest window period I can sample in Influx that still returns the max result is 15 seconds, and the 74.4 value always appears at the same point in time:
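For reference, that 15-second version is just the original query with a smaller window, something like:

from(bucket: "Vantage")
  |> range(start: -1d)
  |> filter(fn: (r) => r["_measurement"] == "Vantage")
  |> filter(fn: (r) => r["_field"] == "tempout")
  |> filter(fn: (r) => r["identity"] == "WiFiLogger")
  // 15s windows, since readings arrive roughly every 10-15 seconds
  |> aggregateWindow(every: 15s, fn: max, createEmpty: false)
  |> yield(name: "max")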

For context, I’m using Node-RED to inject the data into the database. Here’s an example of the JSON sent to InfluxDB:

Any ideas on where else I could poke around to troubleshoot? I don’t have any other issues with these data or displaying them in Grafana. Only the max() function is giving me weird issues like this.

EDIT: One clue I just noticed… the errant value always comes at the end of an empty block of time. The messages coming from the weather station are MQTT messages, sent roughly every 10-15 seconds. It looks like, every now and then, there are 6-7 minute gaps in the data entries, and the first entry after every gap results in the max() function returning the 74.4 value. I’m not sure if that’s a connection problem or something elsewhere in the network path (MQTT over WiFi → Node-RED → InfluxDB), but it looks like the gap in the data is what causes the spike.

Can you post your Node-RED flow (or just send it to me as a private message)? I also send my data to InfluxDB using NR, and I am thinking there is a structural issue in the message that gets passed into Influx as 74.4.

Sure thing! It’s 4 nodes:

  • MQTT collection
  • Prepare Fields (changes all the data types from JavaScript strings to numbers)
  • Prepare Fields (filters out the data I don’t need and reformats it with better names)
  • Send to InfluxDB

Here’s an example of the raw JSON data from the MQTT message:
[screenshot: MQTT example output]

You can see that “tempout” and the other numbers in red are strings. But I change them to numbers in the next node:

Output example from that second node (note: values that were previously strings are now numbers):
[screenshot: Output node 1]

And then I make them easier to use in the final node before sending to InfluxDB:

My hunch is that the problem is with InfluxDB, not with the ingestion types or format. If the ingestion were the issue, the 74.4 value would show up in other functions, like last() or mean(), but it only happens with max(). And I’m not sure if you saw it (I might have edited right after you posted your last comment), but the 74.4 value always shows up after a 6-7 minute period where the database doesn’t have any data at all. I’m not sure why that empty period keeps happening, and I still have no idea why it’s always the same amount of time. It could be the WiFi, or it could be somewhere else along the network path for the messages. Maybe that’s a different problem, but again, it looks to me like an InfluxDB problem or bug.

What version of Influx and Grafana?
Maybe try a reboot of both?

Grafana is v9.3.1, and InfluxDB is the latest, v2.6.1.

They’re both running as Docker containers on Unraid, and they get backed up and rebooted every week.

@nar1117

Which of these InfluxDB nodes are you using in your flow?
[screenshot of the InfluxDB nodes in the Node-RED palette]

It looks from your comments above like you are using the influxdb out node, but please confirm.

You could also put a function node just before sending to InfluxDB to flag this 74.4 value (just make it broad enough to catch anything > 50, for example).

PS: I still think this value is coming from your data and there is not a bug in InfluxDB, but let’s keep trying to find out.
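One more check you could run directly in InfluxDB: a raw query with no aggregation at all that just looks for suspect values, reusing the filters from your query (the range and the > 50 threshold are only placeholders, adjust as needed):

from(bucket: "Vantage")
  |> range(start: -30d)
  |> filter(fn: (r) => r["_measurement"] == "Vantage")
  |> filter(fn: (r) => r["_field"] == "tempout")
  |> filter(fn: (r) => r["identity"] == "WiFiLogger")
  // broad filter, same idea as catching anything > 50
  |> filter(fn: (r) => r["_value"] > 50.0)

If 74.4 shows up there, the value really was written to the bucket and is not being invented by max().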

Thanks again for your help!

Yes, just the standard influxdb-out node.

Thanks for the code snippet - I just added it and made sure it’s working.

[screenshot: debug node]

Tangential question: what’s the best way to send a notification outside of NR? I have another instance of NR running in Home Assistant, which has the iOS notification HA nodes available. This particular instance of NR is kind of sequestered so it doesn’t get bothered too much, because I only use it for passing data to InfluxDB. So I’m not sure what the best method for exterior notifications would be.

Re: the 74.4 value that you are trying to catch, was it showing up every hour or so? The reason I ask is that (unless you are watching the debug pane constantly) the value may be displayed in the debug pane and then written over, since I think the debug pane displays a finite number of messages (maybe 50?). So just try to be there when it happens, if that is possible.

Second, in the dataset you previously posted, I am trying to understand why it starts at 9:00 and then jumps to 9:07 (followed by 9:08, 9:09, etc.). Are there any readings from 9:01 to 9:06?
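A narrow raw query would show whether anything was written in that window; this is just a sketch, and the date below is a placeholder you would need to swap for the actual day:

from(bucket: "Vantage")
  // placeholder date: substitute the day shown in your table
  |> range(start: 2023-01-01T09:00:00Z, stop: 2023-01-01T09:07:00Z)
  |> filter(fn: (r) => r["_measurement"] == "Vantage")
  |> filter(fn: (r) => r["_field"] == "tempout")
  |> filter(fn: (r) => r["identity"] == "WiFiLogger")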

Re: your tangential question, are you looking for Node-RED to send a notification when something happens? I use it to send webhooks to Slack and emails as well. I can elaborate more if that is indeed what you are after.

@grant2

OK, I figured some of this out. It’s not solved, but I’m getting closer.

The 7-minute blackout periods in the data are caused by my three-day-per-week backup schedule for the Docker containers. The containers are shut down at 3 am, backed up, and rebooted.

The 74.4 value always shows up right after InfluxDB starts up again, but not after EVERY restart, just some of them. The 74.4 value is not just an artifact of the max() function; it is in the raw data as well.

For example, here’s a 30-day shot showing the random spikes. Note that they are not in a perfect pattern (i.e., the backups still happened, but for whatever reason Influx didn’t store a 74.4 value):

So there seems to be some kind of conflict with the way the Docker container starts up. Maybe there is a problem with the way NR handles that situation (trying to send data but getting no response), or maybe the error is on the InfluxDB side.

Is Node-RED running in the same Docker container that gets rebooted, i.e., does Node-RED also restart?

Node-RED also gets rebooted! Yes, thank you for that question. Of course it does. The script shuts down all containers, then backs up the Docker image, then reboots the containers.

Not sure if this will make a difference, but could you run Node-RED on another device that does not get rebooted and see if that helps? What should happen is that Node-RED will keep trying to send values to Influx but will not be able to while InfluxDB is being shut down, backed up, and rebooted. So it will keep retrying until InfluxDB is back online.

Another idea I had (but have never used myself) is this node:

Great, yes that’s a good idea.

Thanks again for helping me troubleshoot this. I would not have thought to consider the backups as an issue if we hadn’t dug into the whole problem.

So, this is obviously not a Grafana problem. It could be an NR problem, or an Influx problem, or not a problem at all, just a result of the way these data are being passed.

I think I’ll first move the InfluxDB entry flow to another Node-RED instance that doesn’t get backed up and see if that helps narrow down the problem.

I’m going to leave this thread as unsolved for now (because it technically isn’t solved :smiley: ), but I’ll come back to it in a few days to update once I have more information.

Seriously, thank you again for the help!

FWIW, I use Node-RED → InfluxDB (same node as you) → Grafana extensively (30+ computers running different flows, all 24x7 for several years) and have never seen anything like this. The big difference is that I do not use Docker containers, and I do my backups using cron jobs or similar (so nothing ever really stops).

@grant2

Ha, thanks! That’s good. I don’t know if that makes me feel better or worse…