Influx - drop the first and last rows/values from the table

Is there an easy way to drop the first and last values/rows?

I was thinking something along the lines of this:

|> filter(fn: (r) => r != first() and r != last()) 

Which task are you trying to solve?

Here is a simple code snippet for filtering out the first and last records:

import "sampledata"

data = sampledata.int()

first = data
    |> first()
    |> tableFind(fn: (key) => key.tag == "t1")
    |> getRecord(idx: 0)

last = data
    |> last()
    |> tableFind(fn: (key) => key.tag == "t1")
    |> getRecord(idx: 0)

data
  |> filter(fn: (r) => r._value != first._value and r._value != last._value)    
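
One caveat with filtering on _value: it also drops any interior rows that happen to share the first or last value. Assuming timestamps are unique per row, reusing the same first and last records but filtering on _time instead avoids that:

// Timestamps identify rows uniquely, so only the true first and last rows are dropped
data
    |> filter(fn: (r) => r._time != first._time and r._time != last._time)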

It’s a bit related to another issue I was trying to solve: InfluxDB sum of all values - #3 by jjm

I wanted to sum all the values, but the inconsistent times were causing issues.

Through lots of trial and error, I finally got it working consistently with this:

option minWindowPeriod = 2m

from(bucket: "bucket1")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Cisco-IOS-XR-sysadmin-fretta-envmon-ui:environment/oper/power/location/pem_attributes")
  |> filter(fn: (r) => r["_field"] == "system_power_input" and r["_value"] < 1000)
  |> group(columns: ["source"])
  // Window at twice the dashboard period, but never narrower than minWindowPeriod,
  // so every window should catch at least one sample from each device
  |> window(every: if float(v: int(v: v.windowPeriod) * 2) < float(v: int(v: minWindowPeriod)) then minWindowPeriod else duration(v: int(v: v.windowPeriod) * 2))
  |> mean()
  |> duplicate(column: "_stop", as: "_time")
  |> window(every: inf)
  // All devices now share the window timestamps, so sum across devices per timestamp
  |> group(columns: ["_time"])
  |> sum(column: "_value")
  |> duplicate(column: "_time", as: "_stop")
  |> duplicate(column: "_time", as: "_start")
  |> group()
  |> yield(name: "power")

The only issue is that the first and last window may not contain values from all devices, which makes the first and last sums potentially incorrect.

I modified the query based on your example:

option minWindowPeriod = 2m

power = from(bucket: "bucket1")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Cisco-IOS-XR-sysadmin-fretta-envmon-ui:environment/oper/power/location/pem_attributes")
  |> filter(fn: (r) => r["_field"] == "system_power_input" and r["_value"] < 1000)
  |> group(columns: ["source"])
  |> window(every: if float(v: int(v: v.windowPeriod) * 2) < float(v: int(v: minWindowPeriod)) then minWindowPeriod else duration(v: int(v: v.windowPeriod) * 2))
  |> mean()
  |> duplicate(column: "_stop", as: "_time")
  |> window(every: inf)
  |> group(columns: ["_time"])
  |> sum(column: "_value")
  |> duplicate(column: "_time", as: "_stop")
  |> duplicate(column: "_time", as: "_start")  
  |> group()

first = power
    |> first()
    |> tableFind(fn: (key) => key.tag == "t1")
    |> getRecord(idx: 0)

last = power
    |> last()
    |> tableFind(fn: (key) => key.tag == "t1")
    |> getRecord(idx: 0)

power
  |> filter(fn: (r) => r._value != first._value and r._value != last._value) 
  |> yield(name: "power")

But I’m getting an error:

 error calling function "tableFind" @20:8-20:47: no table found

My guess is that I modified the default Influx table structure in such a way that this no longer works.
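
If so, the likely culprit is the final group(): it empties the group key, so key.tag no longer exists and the tableFind predicate matches no table. Assuming the ungrouped output is a single table, matching any group key should presumably work instead:

first = power
    |> first()
    |> tableFind(fn: (key) => true) // Match the first (and only) table
    |> getRecord(idx: 0)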

Here’s a CSV of the output of my original query, as an example of what the table looks like.

#group,false,false,false,false,false,false
#datatype,string,long,dateTime:RFC3339,double,dateTime:RFC3339,dateTime:RFC3339
#default,power,,,,,
,result,table,_time,_value,_stop,_start
,,0,2024-11-27T13:50:00Z,4092,2024-11-27T13:50:00Z,2024-11-27T13:50:00Z
,,0,2024-11-27T13:52:00Z,18499.5,2024-11-27T13:52:00Z,2024-11-27T13:52:00Z
,,0,2024-11-27T13:54:00Z,18502.5,2024-11-27T13:54:00Z,2024-11-27T13:54:00Z
,,0,2024-11-27T13:56:00Z,18504,2024-11-27T13:56:00Z,2024-11-27T13:56:00Z
,,0,2024-11-27T13:58:00Z,18507.5,2024-11-27T13:58:00Z,2024-11-27T13:58:00Z
,,0,2024-11-27T14:00:00Z,18505.5,2024-11-27T14:00:00Z,2024-11-27T14:00:00Z
,,0,2024-11-27T14:02:00Z,18505.5,2024-11-27T14:02:00Z,2024-11-27T14:02:00Z
,,0,2024-11-27T14:04:00Z,18502,2024-11-27T14:04:00Z,2024-11-27T14:04:00Z
,,0,2024-11-27T14:06:00Z,18504.5,2024-11-27T14:06:00Z,2024-11-27T14:06:00Z
,,0,2024-11-27T14:08:00Z,18502,2024-11-27T14:08:00Z,2024-11-27T14:08:00Z
,,0,2024-11-27T14:10:00Z,18506.5,2024-11-27T14:10:00Z,2024-11-27T14:10:00Z
,,0,2024-11-27T14:12:00Z,18500.5,2024-11-27T14:12:00Z,2024-11-27T14:12:00Z
,,0,2024-11-27T14:14:00Z,18502,2024-11-27T14:14:00Z,2024-11-27T14:14:00Z
,,0,2024-11-27T14:16:00Z,18502.5,2024-11-27T14:16:00Z,2024-11-27T14:16:00Z
,,0,2024-11-27T14:18:00Z,18497,2024-11-27T14:18:00Z,2024-11-27T14:18:00Z
,,0,2024-11-27T14:20:00Z,18494.5,2024-11-27T14:20:00Z,2024-11-27T14:20:00Z
,,0,2024-11-27T14:22:00Z,18495,2024-11-27T14:22:00Z,2024-11-27T14:22:00Z
,,0,2024-11-27T14:24:00Z,18497,2024-11-27T14:24:00Z,2024-11-27T14:24:00Z
,,0,2024-11-27T14:26:00Z,18498.5,2024-11-27T14:26:00Z,2024-11-27T14:26:00Z
,,0,2024-11-27T14:28:00Z,18500.5,2024-11-27T14:28:00Z,2024-11-27T14:28:00Z
,,0,2024-11-27T14:30:00Z,18498,2024-11-27T14:30:00Z,2024-11-27T14:30:00Z
,,0,2024-11-27T14:32:00Z,18497.5,2024-11-27T14:32:00Z,2024-11-27T14:32:00Z
,,0,2024-11-27T14:34:00Z,18501.5,2024-11-27T14:34:00Z,2024-11-27T14:34:00Z
,,0,2024-11-27T14:36:00Z,18505.5,2024-11-27T14:36:00Z,2024-11-27T14:36:00Z
,,0,2024-11-27T14:38:00Z,18509,2024-11-27T14:38:00Z,2024-11-27T14:38:00Z
,,0,2024-11-27T14:40:00Z,18505.5,2024-11-27T14:40:00Z,2024-11-27T14:40:00Z
,,0,2024-11-27T14:42:00Z,18501.5,2024-11-27T14:42:00Z,2024-11-27T14:42:00Z
,,0,2024-11-27T14:44:00Z,18499,2024-11-27T14:44:00Z,2024-11-27T14:44:00Z
,,0,2024-11-27T14:46:00Z,18502,2024-11-27T14:46:00Z,2024-11-27T14:46:00Z
,,0,2024-11-27T14:48:00Z,18503,2024-11-27T14:48:00Z,2024-11-27T14:48:00Z
,,0,2024-11-27T14:49:52.671401559Z,18503.5,2024-11-27T14:49:52.671401559Z,2024-11-27T14:49:52.671401559Z

As you can see, the values are more or less consistent through time, as they should be, but the first is wildly off and, at times, the last can be as well. The best way to solve this would be to simply drop both rows from the table.

This topic seems to be discussing the exact same problem: Removing first & last data points

Ok, I think I’ve finally got something that works.

rowCount = data
  |> count(column: "_value") // Count total rows
  |> findRecord(fn: (key) => true, idx: 0) // Extract row count as a record

data
  // tail() keeps the last n rows, skipping offset rows from the end:
  // offset: 1 drops the last row, and n = total - 2 then also drops the first
  |> tail(offset: 1, n: int(v: rowCount._value) - 2)
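
A self-contained version of the same trick, using the sampledata tables from earlier (ungrouped first so the whole stream is a single table, as in my query), would look roughly like this:

import "sampledata"

// Collapse everything into one ungrouped table, ordered by time
data = sampledata.int()
    |> group()
    |> sort(columns: ["_time"])

rowCount = data
    |> count(column: "_value") // Count total rows
    |> findRecord(fn: (key) => true, idx: 0) // Extract the count as a record

data
    |> tail(offset: 1, n: int(v: rowCount._value) - 2) // Drop first and last row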

The whole query then looks like this:

option minWindowPeriod = 2m

data = from(bucket: "bucket1")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Cisco-IOS-XR-sysadmin-fretta-envmon-ui:environment/oper/power/location/pem_attributes")
  |> filter(fn: (r) => r["_field"] == "system_power_input" and r["_value"] < 1000)
  |> group(columns: ["source"])
  |> window(every: if float(v: int(v: v.windowPeriod) * 2) < float(v: int(v: minWindowPeriod)) then minWindowPeriod else duration(v: int(v: v.windowPeriod) * 2))
  |> mean()
  |> duplicate(column: "_stop", as: "_time")
  |> window(every: inf)
  |> group(columns: ["_time"])
  |> sum(column: "_value")
  |> duplicate(column: "_time", as: "_stop")
  |> duplicate(column: "_time", as: "_start")  
  |> group()

rowCount = data
  |> count(column: "_value") // Count total rows
  |> findRecord(fn: (key) => true, idx: 0) // Extract row count as a record

data
  |> tail(offset: 1, n: int(v: rowCount._value) - 2) // Remove the first and last row
  |> yield(name: "power")