Speed up long running Flux Query

This helps indeed.

What is obvious now is, that there is no faster query giving the same result and that most of the time needed is consumed by the map, elapsed and aggregate window functions.

So reducing the amount of data that is calculated at once and spread this over time as the data points are written to influxdb should do the trick.
I was up to lock for kapacitor and then found out kapacitor is not able to handle influxdb 2.
Then I read your reply to my question.

My solution is now exactly this. Writing a query for this task that runs every 15 minutes (or even one day would also be possible I think). The task (with 15 minutes time range) looks like this:

option task = { 
  name: "power_consumption_per_15min",
  every: 15m,
}

from(bucket: "loxone/autogen")
  |> range(start: -15m, stop: now())
  |> filter(fn: (r) => r["_measurement"] == "energie2")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["category"] == "Energie")
  |> filter(fn: (r) => r["name"] == "Verbrauchszähler")
  |> group()
  |> drop(columns: ["result"])
  |> elapsed(unit: 1s)
  |> map(fn: (r) => ({ r with power_kWh: float(v: r.elapsed) /3600.0  * float(v: r._value)}))
  |> sum(column: "power_kWh")
  |> duplicate(column: "_stop", as: "_time")
  |> drop(columns: ["_stop", "_start"])
  |> to(bucket: "power_consumption_production")

I reduced the two map functions to one.
elapse with time unit of 1s is enough accuracy for me. I don’t need calibrated evaluations :wink:
Spikes in the power flow (to be honest) are already smoothed out by only 1000 pulses/kWh of the power meter…

Now I am writing a small java program based on an InfluxDB client written in java to handle the 2 years of data points that are already written into the data base.
If code quality is good enough :slight_smile: I will post a github link after wards.

2 Likes