[Flux] Graph doesn't show when "group()" is applied after aggregateWindow() in a query

Hi,

I have a bucket in InfluxDB that contains 1.7 million records. Each record has fields such as “mass_pm2_5” and “num_pm2_5”, which hold float values. In addition, each record has tags such as “location”, “origin”, etc. Data has been recorded once a minute for a couple of years so far.

I want to draw a graph in Grafana with the following query.

from(bucket: "sps30")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "sps30")
  |> filter(fn: (r) => r["_field"] == "mass_pm2_5")
  |> drop(columns: ["location", "origin", "unit_id"])
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
  |> yield(name: "mean")

The reason I put “drop(…)” here is that I do NOT want to distinguish data by “origin”. For example, some records have the “origin” “mqtt” and others have “copied_from_elastic”, but I want to draw one connected line regardless of the “origin”.

However, I noticed that the above drop() takes a very long time, as reported in the thread below. (For example, if I specify “range(start: -5y)” and “aggregateWindow(every: 1d)”, it takes more than a minute to get a reply.)

https://community.influxdata.com/t/influxdb-drop-measurement-extremely-slow/13984

As a workaround, I stopped using drop(), but now the graph shows two separate lines (in different colors). One line comes from the records whose “origin” tag is “mqtt”, and the other comes from the records with the other “origin”.

What is the recommended way to get a single line in such a use case?

One idea I had was to add “group()” as follows. (I found this technique here.)

from(bucket: "sps30")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "sps30")
  |> filter(fn: (r) => r["_field"] == "mass_pm2_5")
//  |> drop(columns: ["location", "origin", "unit_id"])
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
  |> group()  // added
  |> yield(name: "mean")

However, after adding “group()”, Grafana no longer shows any graph line.

Examining the issue in the InfluxDB Data Explorer web GUI, I noticed that if I omit “group()”, the “_field” column is shown as “GROUP”, but if I add “group()”, “_field” becomes “NO GROUP”. Is this related to the graph lines not showing in Grafana?

Examining the issue with the Grafana Query Inspector, if I add “group()”, I see the following and no graph. (Saved as CSV.)
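
To make the question about the group key concrete, the following is a minimal, untested sketch of a variant that regroups with an explicit column list instead of an empty group(). It assumes, without confirmation, that Grafana needs “_field” to remain in the group key. It also moves the regrouping before aggregateWindow() so that records from every “origin” fall into one table before the mean is computed. The choice of “_measurement” and “_field” as group columns is an assumption for illustration.

```flux
from(bucket: "sps30")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "sps30")
  |> filter(fn: (r) => r["_field"] == "mass_pm2_5")
  // Assumption: keep only "_measurement" and "_field" in the group key,
  // so tags such as "origin" no longer split the data into separate tables.
  |> group(columns: ["_measurement", "_field"])
  // The mean is now computed per window over the single merged table.
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
  |> yield(name: "mean")
```

If the assumption holds, the Data Explorer should show “_field” as “GROUP” again. Whether Grafana then draws a single line is not confirmed here.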

"_start","_stop","_time","_value","_field","_measurement","location","origin","unit_id"
2022-03-23 16:10:48.488,2022-03-23 19:10:48.489,2022-03-23 16:11:30,,mass_pm2_5,sps30,veranda,mqtt,1
2022-03-23 16:10:48.488,2022-03-23 19:10:48.489,2022-03-23 16:11:40,8.30,mass_pm2_5,sps30,veranda,mqtt,1
2022-03-23 16:10:48.488,2022-03-23 19:10:48.489,2022-03-23 16:11:50,,mass_pm2_5,sps30,veranda,mqtt,1
2022-03-23 16:10:48.488,2022-03-23 19:10:48.489,2022-03-23 16:12:00,,mass_pm2_5,sps30,veranda,mqtt,1
2022-03-23 16:10:48.488,2022-03-23 19:10:48.489,2022-03-23 16:12:10,,mass_pm2_5,sps30,veranda,mqtt,1
2022-03-23 16:10:48.488,2022-03-23 19:10:48.489,2022-03-23 16:12:20,,mass_pm2_5,sps30,veranda,mqtt,1
2022-03-23 16:10:48.488,2022-03-23 19:10:48.489,2022-03-23 16:12:30,,mass_pm2_5,sps30,veranda,mqtt,1
2022-03-23 16:10:48.488,2022-03-23 19:10:48.489,2022-03-23 16:12:40,8.97,mass_pm2_5,sps30,veranda,mqtt,1
2022-03-23 16:10:48.488,2022-03-23 19:10:48.489,2022-03-23 16:12:50,,mass_pm2_5,sps30,veranda,mqtt,1
(snip)

If I omit “group()”, I see the following, and the graph is displayed. (Saved as CSV.)

"Time","mass_pm2_5 {location=""veranda"", origin=""mqtt"", unit_id=""1""}"
2022-03-23 16:09:30,
2022-03-23 16:09:40,7.37
2022-03-23 16:09:50,
2022-03-23 16:10:00,
2022-03-23 16:10:10,
2022-03-23 16:10:20,
2022-03-23 16:10:30,
2022-03-23 16:10:40,7.70
2022-03-23 16:10:50,
(snip)

Currently, the only workaround I have found is to use InfluxQL instead of Flux queries. (It works fine.)

Thank you for any suggestions.

My environment is as follows.

  • Grafana v8.4.4
  • InfluxDB 2.1

Both run as Docker containers. (Though I don’t think that is the cause of the issue.)

By the way, I’m using the default panel settings of the “Time series” graph, so I have omitted the panel settings JSON file.

If you need additional information, please let me know.

Regards,
Atsushi
