Speed up a long-running Flux query

Hello guys

I recently had a discussion here (https://community.grafana.com/t/power-consumption-per-day-with-non-equidistant-data-points/76406/8) that resulted in the following query:

from(bucket: "loxone/autogen")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "energie2")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["category"] == "Energie")
  |> filter(fn: (r) => r["name"] == "Verbrauchszähler")
  // seconds since the previous point (the points are not equidistant)
  |> elapsed(unit: 1s)
  |> map(fn: (r) => ({ r with elapsed: float(v: r.elapsed)}))
  // energy slice: power * elapsed / 3600 (kW * s -> kWh)
  |> map(fn: (r) => ({ r with power_kWh: r._value * r.elapsed / 3600.0 }))
  // sum the energy slices per day
  |> aggregateWindow(every: 1d, fn: sum, column: "power_kWh")
  |> drop(columns: ["_start", "_stop", "category", "name", "room", "state", "type", "_field", "_measurement"])

The reason behind it is to calculate energy over arbitrary time ranges when the observed power is stored with non-equidistant timestamps.
In general it works well, at least for short time ranges.
But if I want to evaluate a larger time range it is very slow.
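In other words, the query approximates the integral of power over time as a sum over the non-equidistant sampling intervals (assuming, as the division by 3600.0 suggests, that the stored value is power in kW and the elapsed time is in seconds):

$$E_{\text{kWh}} = \sum_i \frac{P_i\,\Delta t_i}{3600}$$

where $P_i$ is the power at sample $i$ and $\Delta t_i$ the time since the previous sample.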

I assume this is due to the two map() functions and the calculations they perform.
Is there a way to speed this up?
I had a quick look here (https://docs.influxdata.com/influxdb/cloud/query-data/optimize-queries/) but I don't understand how to avoid doing the calculations (the map() functions).

What I can also think of is that my Grafana dashboard in general runs the same kind of query more than once.
E.g. it has a per-day consumption panel and a per-month consumption panel (and so on).
Is it possible to build up a cache automatically?

My data is written into the underlying InfluxDB via Node-RED, and I can also imagine a data transformation that stores the "time between two data points" * power / 3600 calculation directly into a separate measurement.

But maybe there is an intelligent solution to speed up the query?


Which of your data points are fields and which ones are tags?

The only field is _value (except for the time columns like _time, _start, _stop).
An example of the raw data looks like this:

Therefore most of the table consists of tags …

Does this help?

Can you elaborate a bit, e.g. does the Grafana panel display in < 5 seconds when your time range is 1 day or less? How long does it take when you change it to 2 days? 7 days? 30 days?

Sure, but to get useful information from this, one should correlate it with the number of data points; otherwise the timing figures on their own are rather useless.

So my "raw" data comes from a power meter that emits 1000 pulses per kWh.
Therefore the number of data points per time unit depends on my actual power consumption, and this varies over time.
I had a rough look at the data points, and as a worst-case scenario we can calculate with one point per minute (although in reality it will most likely be less than that).
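To put that worst case into numbers: one point per minute means 1,440 points per day, about 43,200 points for 30 days, and roughly 525,600 points for a full year that the map() functions have to touch.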

OK, now the evaluation of the duration:
Aggregate window 1 day, time range to display is
1 day: immediately (milliseconds)
7 days: approx. 1 s
30 days: approx. 3 s
90 days: 8 s
last 6 months: 13 s

last year: > 9 min 30 s

What is bothering me now is that for the last-year query I get 3 results (colors) back. Therefore I attached pictures from the last 6 months and the last year.
Is this maybe the reason for the long-running query?
For me, the 13 s for 6 months extrapolated to around half a minute for 1 year would be OK.
So is there maybe something in my raw data that causes the problem, and not in the query itself?

In general the problem is that I don't have only one of those queries :slight_smile:
I want to display in parallel the production and the consumption, per day and per year. This sums up to 4 panels with those queries…

Maybe too much for the Raspberry Pi 4 it runs on :wink:

In parallel I am digging into Node-RED a little more, to maybe do the power * elapsed-time calculation in Node-RED BEFORE the data goes into InfluxDB. Then I could get rid of the map() and elapsed() functions. The already existing data I could transform with a small program (in Java, e.g.).
But the easiest way, if it exists, would be to just do it in Grafana…

I can, if necessary, also evaluate the times needed for an aggregate window of one month…

The more I dig into it, the more confused I get.
OK, a query with map() functions can be slow, that's understood.

The first thing I am confused about is that with one query and a certain time range I get 3 results back (as visible in the image of my last entry).
This also happens directly in InfluxDB with a much simpler query (and a time range chosen such that only 2 result curves come back from the Flux query).

The only difference I can see in the result data is that the "table" column is set to 1 instead of 0.


But a column with the name "table" doesn't exist in my data / in InfluxDB itself.
Where does it come from? I did a rough search in the InfluxDB documentation but found nothing …

The reason for this question is that (this is my impression at least), as long as I don't query across these weird "borders" in my data, the queries are rather fast. "Rather" means they could be more performant, but maybe that is limited by the CPU power of the Raspberry Pi on which InfluxDB runs …

Another question: would it be better to ask the InfluxDB folks directly, i.e. is this the wrong place for this question?

Hi @monstereye

Welcome to the sometimes confusing world of InfluxDB! You can try posting on their forum, but this forum is more active and I think answers come quicker. Having said that, you may want to read through this to better understand the Flux table structure. In short: Flux returns a stream of tables, rows are split into tables by their group key, and the result and table columns you see are annotations of that structure in the output, not columns stored in the database.

Try inserting this statement at the end of your query:

|> group()

and see if the two tables become one.
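For reference, applied to your query from above it would look like this (unchanged except for the appended last line):

from(bucket: "loxone/autogen")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "energie2")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["category"] == "Energie")
  |> filter(fn: (r) => r["name"] == "Verbrauchszähler")
  |> elapsed(unit: 1s)
  |> map(fn: (r) => ({ r with elapsed: float(v: r.elapsed)}))
  |> map(fn: (r) => ({ r with power_kWh: r._value * r.elapsed / 3600.0 }))
  |> aggregateWindow(every: 1d, fn: sum, column: "power_kWh")
  |> drop(columns: ["_start", "_stop", "category", "name", "room", "state", "type", "_field", "_measurement"])
  // group() without arguments clears the group key, merging all tables into one
  |> group()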

We can work on the speed issue next.


No, they don't.


A question for InfluxDB itself?
In the meantime I found a GitHub issue dealing with this (but not as the main topic, unfortunately …)

They state: For non-timeseries results, we current include two useless columns: “results” and “table.” This PR removes them:

But I reduced my query down to just the data (with the filters) and removed the map() functions and so on as a test. So I don't see why the query result should not be a non-timeseries result …

When you mouse over the two colors, the legend should display. That should tell you the difference between the two series (or at least their names). Can you do this and find out the two names?


Also, the other screenshot showed a column with the value _result in it. What is this, and can you drop it from the overall results?


What are the specs of the rpi, and does it meet the minimum hardware requirements from Influx?

That you can install it on an rpi does not mean you should.


@grant2 Thanks for your answer!
OK, the legend tells me:


No real difference visible, therefore I tried to post the CSV data …

If I try to add |> drop(columns: ["_result"]) to my query, the result is still the same.

And if I click on "view raw data" in InfluxDB, or try to display the queried data as a table, InfluxDB shows an error.

Your second question regarding the "_result" and "table" columns:
This is exactly my question!
I don't know why they are there. I think they are NOT contained in InfluxDB itself (although I am not really sure).
If I drop "_result" (as explained above) nothing happens.
If I drop "table" and "_result" I still get two result lines back from the query.


One is showing the data, the other one a constant value…

Weird…

I'll try to find some time over the weekend to dig deeper into this, maybe with a little bit of Java code and SQL queries to see if this behaviour is reproducible there too…

@yosiasz You are right! It's possible, but maybe not advisable.
My original question was (since I am a newbie regarding the Flux language) whether I can do something better to improve the query itself. If not, I will look for a host with more power :slight_smile:
Nevertheless, this topic has become more of a discussion about the weird content of the query result …
But thanks for your answers!

Hi again.

I am slowly becoming convinced that your data input / collection method changed at some point and that is how you ended up with two different tables.

For now, let’s work solely in Influx Data Explorer and use this toggle to see the raw data:

OK, based on the earlier screenshot, I do not believe the column heading was "_result". "_result" was the value / contents of the column. So this command is not going to work:

In the screenshot below, I highlighted in yellow some of the column titles. What is the column title of the one showing _result?

I went back through this thread to re-familiarize myself with your situation. Did you change something back in March 2022? It would appear that your data was slightly modified (twice), which is why it gets rendered as different series.

I don't write a result column (at least I am not aware of it). But it seems it is there, and it is sometimes filled with null and sometimes with _result.
I don't know why.

What I did now is this:

from(bucket: "loxone/autogen")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "energie2")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["category"] == "Energie")
  |> filter(fn: (r) => r["name"] == "Verbrauchszähler")
  // merge all result tables into one directly after the filters
  |> group()
  |> drop(columns: ["result"])
  // downsample first so the expensive map() below runs on far fewer rows
  |> aggregateWindow(every: 10m, fn: mean, column: "_value")
  |> elapsed(unit: 1s)
  |> map(fn: (r) => ({ r with power_kWh: float(v: r.elapsed) / 3600.0 * float(v: r._value)}))
  |> aggregateWindow(every: 1mo, fn: sum, column: "power_kWh")
  |> drop(columns: ["_start", "_stop", "category", "name", "room", "state", "type", "_field", "_measurement"])

The group() statement directly after the filters makes sure all data is merged into one table.
This didn't work at the end of the query, but placed in between it seems to work properly.

Then I had the idea that the map() statement runs faster on less data, so I tried to exploit this with an additional aggregateWindow(). This has to be handled carefully (at least that is what my experiments tell me), because if the aggregate window used to reduce the data is set to too small a time range, it actually slows the query down even further.
The idea behind it (and this is probably only true for my case) is that I can live with a certain inaccuracy over large time ranges. The inaccuracy comes from the mean function in the first aggregate window.

The next performance improvement is (I hope at least) to reduce the map() functions to a single one. In the beginning I had three, which I have now reduced to one.

Last thought: as the documentation of aggregateWindow() says, it returns only the column on which the window was running. Therefore the additional drop statement for category, name, etc. can possibly be skipped.

My assessment is: I cannot tell why the data within InfluxDB is the way it is. I don't know where the column named "result" comes from (I don't write it) or why its value is sometimes null and sometimes _result.

Anyway: thanks a lot for your help!!

An idea, not sure whether it works for your situation:
Why not create a task in InfluxDB that calculates the results every e.g. 15 minutes, and let Grafana use that data, so that Grafana does not have to do the full calculation?
I have more or less the same situation: data is stored in InfluxDB, the delta time is not always the same, and I want the data aggregated into 15-minute intervals (every 15 minutes).

A task in InfluxDB calculates the quarter-hourly demand every 15 minutes and stores it in another bucket.

This is what I do: pick up all the records stored in the last 15 minutes, do the calculation, sum the result and store it in a new bucket. That one is used in Grafana.
(In this case I grab the power that was measured in a timeframe, multiply it by the elapsed timeframe, sum the result into one value, and divide this by the length of 15 minutes, i.e. 900,000 ms since elapsed is in milliseconds → this gives me the quarter-hourly demand.)
For now this works; I still have to check whether the result is correct in all cases.
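In formula form, with $P_i$ the measured power and $\Delta t_i$ the elapsed time in milliseconds:

$$P_{\text{15min}} = \frac{\sum_i P_i\,\Delta t_i}{900000\ \text{ms}}$$

i.e. a time-weighted mean of the power over the quarter hour.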

from(bucket: "Mars")
  |> range(start: -15m, stop: now())
  |> filter(fn: (r) => r["_measurement"] == "mqtt_nonjson")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["host"] == "bergen")
  |> filter(fn: (r) => r["topic"] == "shellies/mars/home/totalpower/emeter/1/power")
  |> filter(fn: (r) => r["_value"] > 0.0)
  // milliseconds since the previous point
  |> elapsed(unit: 1ms)
  // power * duration -> energy-like W*ms slices
  |> map(fn: (r) => ({ r with _value: float(v: r.elapsed) * float(v: r._value)}))
  |> sum()
  // divide by 900,000 ms (15 minutes) -> time-weighted mean power
  |> map(fn: (r) => ({ r with _value: float(v: r._value)/900000.0}))
  |> drop(columns: ["_start", "_time", "topic", "_measurement", "host"])
  |> set(key: "_measurement", value: "kwartiervermogen net")
  // stamp the result with the end of the window
  |> duplicate(column: "_stop", as: "_time")
  |> drop(columns: ["_stop"])
  |> to(bucket: "kwartierdata")

Not sure if this helps you.


This helps indeed.

What is obvious now is that there is no faster query giving the same result, and that most of the time is consumed by the map(), elapsed() and aggregateWindow() functions.

So reducing the amount of data that is calculated at once, and spreading the work over time as the data points are written to InfluxDB, should do the trick.
I was about to look at Kapacitor, and then found out that Kapacitor cannot handle InfluxDB 2.
Then I read your reply to my question.

My solution is now exactly this: a query for a task that runs every 15 minutes (even once per day would also be possible, I think). The task (with a 15-minute time range) looks like this:

option task = {
  name: "power_consumption_per_15min",
  every: 15m,
}

from(bucket: "loxone/autogen")
  |> range(start: -15m, stop: now())
  |> filter(fn: (r) => r["_measurement"] == "energie2")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["category"] == "Energie")
  |> filter(fn: (r) => r["name"] == "Verbrauchszähler")
  |> group()
  |> drop(columns: ["result"])
  |> elapsed(unit: 1s)
  // energy slice per point: power * elapsed / 3600 -> kWh
  |> map(fn: (r) => ({ r with power_kWh: float(v: r.elapsed) / 3600.0 * float(v: r._value)}))
  |> sum(column: "power_kWh")
  // stamp the 15-minute sum with the end of the window
  |> duplicate(column: "_stop", as: "_time")
  |> drop(columns: ["_stop", "_start"])
  |> to(bucket: "power_consumption_production")

I reduced the two map() functions to one.
elapsed() with a time unit of 1 s is enough accuracy for me; I don't need calibrated evaluations :wink:
Spikes in the power flow are (to be honest) already smoothed out by the meter's resolution of only 1000 pulses/kWh…

Now I am writing a small Java program, based on the InfluxDB Java client, to handle the 2 years of data points that are already in the database.
If the code quality is good enough :slight_smile: I will post a GitHub link afterwards.
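(Side note: an untested alternative to the Java program might be a one-off Flux backfill over the whole history, using aggregateWindow() instead of the task's single sum(). A sketch; the measurement and field names here are my own choice for the target bucket:

from(bucket: "loxone/autogen")
  |> range(start: -2y, stop: now())
  |> filter(fn: (r) => r["_measurement"] == "energie2")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["category"] == "Energie")
  |> filter(fn: (r) => r["name"] == "Verbrauchszähler")
  |> group()
  |> elapsed(unit: 1s)
  // same energy-slice calculation as in the task
  |> map(fn: (r) => ({ r with _value: float(v: r.elapsed) / 3600.0 * float(v: r._value)}))
  // one sum per 15-minute window instead of one sum per task run
  |> aggregateWindow(every: 15m, fn: sum, createEmpty: false)
  // to() needs _measurement and _field on each row
  |> set(key: "_measurement", value: "energie_consumption_production")
  |> set(key: "_field", value: "value")
  |> to(bucket: "power_consumption_production")

I haven't verified that this produces exactly the same numbers as the running task, e.g. at the window borders.)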


Hi guys

I dug a little deeper into pulses, time ranges and so on, and changed my task a bit.
More can be found here: https://community.grafana.com/t/power-consumption-per-day-with-non-equidistant-data-points/76406/9

The task now looks like this:
import "contrib/tomhollingworth/events"

option task = {
  name: "power_consumption_per_day",
  cron: "0 0 * * *",
}

from(bucket: "loxone/autogen")
    |> range(start: -24h, stop: now())
    |> filter(fn: (r) => r["_measurement"] == "energie2")
    |> filter(fn: (r) => r["_field"] == "value")
    |> filter(fn: (r) => r["category"] == "Energie")
    |> filter(fn: (r) => r["name"] == "Verbrauchszähler")
    |> group()
    |> drop(columns: ["result"])
    |> events.duration(unit: 1s)
    |> map(fn: (r) => ({r with power_kWh: float(v: r.duration) / 3600.0 * float(v: r._value)}))
    |> sum(column: "power_kWh")
    |> duplicate(column: "_stop", as: "_time")
    |> drop(columns: ["_stop", "_start"])
    |> map(
        fn: (r) => ({
            _time: r._time,
            _value: r.power_kWh,
            _measurement: "energie_consumption_production",
            _field: "value",
            category: "Energie",
            name: "Verbrauchszähler",
            raum: "Zentral",
            state: "actual",
            type: "Meter",
        }),
    )
    |> to(bucket: "power_consumption_production")
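With the precomputed bucket in place, the Grafana panels only need a cheap aggregation instead of the whole duration()/map() pipeline. An untested sketch of a per-day consumption panel query against the new bucket (names taken from the task above):

from(bucket: "power_consumption_production")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "energie_consumption_production")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["name"] == "Verbrauchszähler")
  // the heavy lifting already happened in the task; just sum the slices per day
  |> aggregateWindow(every: 1d, fn: sum, createEmpty: false)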