Aggregating data (less jittery graphs) AFTER applying a Transformation

What is the data source situation?
I query an InfluxDB 2 instance. For (more or less) the same timestamp there are always two rows: one with the tag “ch1”, the other with the tag “ch2”.

What do I want to achieve?
I want to plot not the values of “ch1” and “ch2” themselves, but the difference of the two values.

First naive try and why it does not work as I want:
I make two queries like this:

from(bucket: "myBucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter( some Filter )
  |> filter(fn: (r) => r["channel"] == "ch1")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)

The second query of course filters for “channel” == “ch2” instead (see the sketch below). On top of the two queries I add a “Binary operation” transformation in Grafana, choose “minus”, and activate “Replace all fields”. And this works!
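For completeness, the second query only swaps the channel filter:

from(bucket: "myBucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter( some Filter )
  |> filter(fn: (r) => r["channel"] == "ch2")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)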
Problem: the graph that is plotted is very, very “jittery”. It looks very bad.

What do I want to have changed?
I would like to take the result of the above and then aggregate the data further, e.g. chop all data into 2-minute pieces (or even 5-minute pieces) and use for each of these intervals the MAXIMUM of the values. I hope I explained that well enough.
However, I cannot find out via Google how this can be done. And ChatGPT is really, really bad with Grafana…

A second solution that works, but is bad in another way:
Instead of making two queries and a transformation, I can do all of this logic in the query itself. I will paste the query below. This query yields exactly what I am looking for. BUT: evaluating this query takes A LOT of time. On every refresh it takes about 10 seconds to return data (if I restrict the data to a 24-hour window; for a larger time span it takes far longer…).
The query:

from(bucket: "myBucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> truncateTimeColumn(unit: 1s)
  |> filter( some filters )
  |> filter(fn: (r) => r["channel"] == "ch1" or r["channel"] == "ch2")
  |> pivot(rowKey: ["_time"], columnKey: ["channel"], valueColumn: "_value")
  |> map(fn: (r) => ({r with delta: r.ch1 - r.ch2}))
  |> drop(columns: ["ch1", "ch2"])
  |> aggregateWindow(every: 2m, fn: max, column: "delta")

My hope:
The two queries and the transformation from the first naive solution are very fast, so I was hoping there is a way for Grafana to post-process the data after the transformation, to achieve results similar to the slow second solution.

I did try to google this but was not able to find anything. I also tried ChatGPT, but I was surprised how bad ChatGPT is when it comes to Grafana (compared to its coding skills).

I don’t believe that Grafana’s post-processing will be faster than the TSDB’s (InfluxDB in this case) native processing.

I bet that your first query will be slower once you use the same 2m aggregation that you used in the second query.

If you really care about performance, then precalculate the results in InfluxDB with InfluxDB tasks.
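For illustration, a minimal task sketch along the lines of your slow query; the task name, the 2m schedule, and the destination bucket “myBucket_downsampled” are assumptions, and your “some filters” placeholder is left out:

option task = {name: "precalc_delta_max", every: 2m}  // assumed task name and schedule

from(bucket: "myBucket")
  |> range(start: -task.every)  // each run only processes the newest window
  |> truncateTimeColumn(unit: 1s)
  |> filter(fn: (r) => r["channel"] == "ch1" or r["channel"] == "ch2")
  |> pivot(rowKey: ["_time"], columnKey: ["channel"], valueColumn: "_value")
  |> map(fn: (r) => ({r with _value: r.ch1 - r.ch2, _field: "delta"}))  // write the delta into _value so to() can store it
  |> drop(columns: ["ch1", "ch2"])
  |> aggregateWindow(every: 2m, fn: max)
  |> to(bucket: "myBucket_downsampled")  // assumed destination bucket

The dashboard panel then only needs a plain, cheap from/range/filter query against that bucket.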

Thank you for your time and response!
For a beginner this is valuable information!

edit: I have tried it: I can change my first queries even to a 5-minute aggregation, and the evaluation is almost instant. As fast as everything else in Grafana…
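The only change is the window period in the final aggregation step of each query, e.g.:

from(bucket: "myBucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter( some Filter )
  |> filter(fn: (r) => r["channel"] == "ch1")
  |> aggregateWindow(every: 5m, fn: mean, createEmpty: false)  // fixed 5m window instead of v.windowPeriod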

I believe the pivot (on the unaggregated data) is what is slowing down your query. Can you try aggregating first, and then pivoting, like this?

from(bucket: "myBucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> truncateTimeColumn(unit: 1s)
  |> filter( some filters )
  |> filter(fn: (r) => r["channel"] == "ch1" or r["channel"] == "ch2")
  |> aggregateWindow(every: 2m, fn: max)  // before the pivot the values are still in _value, so no column argument is needed
  |> pivot(rowKey: ["_time"], columnKey: ["channel"], valueColumn: "_value")
  |> map(fn: (r) => ({r with delta: r.ch1 - r.ch2}))
  |> drop(columns: ["ch1", "ch2"]) 


Also, thanks to you for your time and thought.
The problem with your suggestion is that aggregating first dramatically changes what I am measuring. Originally I take the (more or less) precise difference between the “ch1” and “ch2” data and then aggregate, keeping the maximum of the difference.
Your suggestion would aggregate “ch1” and “ch2” to their maxima and then take the difference.

Sloppily speaking: “the maximum of the difference does not equal the difference of the maxima”.
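A concrete toy example:

ch1 = [5, 1], ch2 = [1, 5]
max(ch1 - ch2)      = max([4, -4]) = 4
max(ch1) - max(ch2) = 5 - 5        = 0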

Can someone tell me whether the following would work:
I use my first naive solution, as it produces the correct data very quickly and is just too “jittery”. Then I hide this panel somehow and create a new panel that uses the result of the first panel as its “input” source, which I then “aggregate” to a certain window. Does Grafana offer such a mechanism?

Can you redo your data ingestion so that the ch1 and ch2 data points are recorded with the same timestamp? I do something similar, where I record the actual value and the setpoint at exactly the same time. In InfluxDB, I do this, and it is fast and precise:

  |> filter( some filters )
  |> filter(fn: (r) => r["MeasType"] == "actual" or r["MeasType"] == "setpoint")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)  // aggregate before the expensive pivot
  |> pivot(rowKey: ["_time"], columnKey: ["MeasType"], valueColumn: "_value")
  |> map(fn: (r) => ({r with delta: r.setpoint - r.actual}))
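(A side note on why aggregating first stays precise here, assuming both series always have matching samples: mean is linear, so per window mean(setpoint) - mean(actual) equals mean(setpoint - actual). The earlier objection only applies to non-linear aggregations such as max.)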