Please help creating subqueries in flux

calebmazure123 · January 18, 2021, 10:56pm

I am having trouble creating a flux query for a database.

Description:
What I am trying to do is first filter a bucket/table by column event -> “created” and then return a list of uid’s that have the filter.

This is easy enough however, from that list of uid’s I then want to display users who have the event created/started/finished (fields).

Problem being that when you filter the column by the first filter it removed the rest of the fields. The data is in the where there are columns uid and event… etc.

If you can think of it as sort of a subquery from sql it might make sense, in that you are filtering “on the fly”.

Things I have tried:

    bucket1 = from(bucket: "bucketname")
      |> range(start: v.timeRangeStart)
    |> filter(fn: (r) =>  r["event_name"] =~ /testClientStarted|SimpleGameReprot|createExperimentGroup/)
      |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
      |> group(columns: ["userid"], mode:"by")
            |> map(fn: (r) => ({ 
          ID: r["userid"],
          stat: r["event_name"],
          times: r["_time"]
      }))

    bucket2 = from(bucket: "bucketname")
      |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
      |> filter(fn: (r) =>  r["event_name"] == "createExperimentGroup")
      |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
      |> group(columns: ["userid"], mode:"by")
          |> map(fn: (r) => ({ 
          ID: r["userid"],
          stat: r["event_name"],
          times: r["_time"]
      }))

    join(
      tables: {events: bucket1, amount: bucket2},
      on: ["ID"]
    )

This is my initial attempt at doing the query however. The timeframes are important in that the first table shows just “from” while the second shows “from” to “now” (what I want). This join method didn’t work as it was returning to much data as the first table (just from date selected) would return a bulk load of data. This is then NOT filtered on the fly (like I want) which causes timeout issues.

Second Attempt (more promising maybe):

bucket2 = from(bucket: "bucket")
  |> range(start: v.timeRangeStart)
  |> filter(fn: (r) =>  r["event_name"] == "createExperimentGroup")
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
  |> group(columns: ["userid"], mode:"by")
  |> map(fn: (r) => ({ 
      id: r["userid"],
  }))

from(bucket: "bucket")
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) =>  r["event_name"] =~ /testClientStarted|SimpleGameReprot|createExperimentGroup/)
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
  |> group(columns: ["userid"], mode:"by")
  |> filter(fn: (r) => contains(value: r["userid"], set: bucket2))

This is what I was envisioning would work but I couldn’t find a way to get it working. “Set” requires an array which is why I got this error message.
500 Internal Server Error: {"error":"keyword argument \"set\" should be of kind array, but got object"}

I tried using getColumn(column:“id”) for the bucket2 variable to return an array but this didn’t seem to work (error message).
500 Internal Server Error: type error 9:6-9:28: missing object properties (schema)

Any advice and help on how to do this would be very much appreciated as I have exhausted the documentation online.

Thanks!

system · January 18, 2022, 10:57pm

This topic was automatically closed after 365 days. New replies are no longer allowed.