Having trouble with the outer join transformation

  • What Grafana version and what operating system are you using?

Windows 10 / Grafana v7.5.6

  • What are you trying to achieve?

I have two separate tables that have information about a common location, what I would like to do is combine the information of the two tables centered on the location.

The issue is that its combining/reducing all the information of both table A and table B into a single point.

For example if the Points table is setup like this -

Date, Location, Temp, Humidity 
date1, (Lat, Long), 80, 20 
date2, (Lat, Long), 75, 15
date3, (Lat, Long), 78, 10
date4, (Lat, Long), 77, 12

And Locations Table is setup like this -

Location, City, Contact, Stations, Type
(Lat, Long), Metropolis, weather@city.com 
(Lat, Long), , , 3, JSON

I would like the table to look like this (order doesn’t matter) -

Date, Location, Temp, Humidity, Location, City, Contact, Stations, Type
date1, (Lat, Long), 80, 20, Metropolis, weather@city.com, 3, JSON
date2, (Lat, Long), 75, 15, Metropolis, weather@city.com, 3, JSON
date3, (Lat, Long), 78, 10, Metropolis, weather@city.com, 3, JSON
date4, (Lat, Long), 77, 12, Metropolis, weather@city.com, 3, JSON
  • How are you trying to achieve it?

So after I created the panel, I then used two search queries A & B,

Select * from Points
Select * from Locations Group By "Locations"

Then I clicked on the Transform Tab and selected Outer Join centered on Location.

  • What happened?

The table that is produced looks like this -

Date, Location, Temp, Humidity, Location, City, Contact, Stations, Type
date, (Lat, Long), 77, 15, Metropolis, weather@city.com, 3, JSON

It’s a single entry from Table A (possibly an Average?)

  • What did you expect to happen?

I expected all the rows from Table A to be present and then for table B to fill in the extra data on top of table A.

Can anyone give me a hand in setting this up?

Thank you in advanced

I little update -

After a bit of testing, I found that when the Transform > Outer Join is selected, it is choosing the last value for each (lat/long) in the Points table to form the Join on.

I tried adding a unique Tag to each Point in Points and then using the query -

Select * from Points Group By "UniqueTag", "Location"

To see if I could force it to recognize each value separately, but when I use the Outer Join transformation, it just reduces everything down to a single last Location.

Any thoughts?

Hi @Nefariis,

What is your datasource?

@mattabrams - InfluxDB 1.8.6

Two quick thoughts on this:

  1. The Grafana “Outer join” transform is unfortunately not actually an outer join. See Transforms: Rename outer join transform to something more descriptive · Issue #27325 · grafana/grafana · GitHub and Transformations: Implement proper outer join · Issue #26316 · grafana/grafana · GitHub. The “Merge” transform actually does something that’s closer to an outer join, though it seems to be hit-and-miss.
  2. @Nefariis I’m not sure if this is a helpful comment, but it might be worth re-evaluating whether Influx is the best database for this kind of use case. At least one of your tables/measurements doesn’t appear to be timeseries data to begin with. I suspect a SQL database would be a better fit, especially given the need to do joins on tables.
1 Like

I think you can achieve this by moving to flux instead of influxql. Slightely meta version of the querry.

// Defining points table
points_table = from(bucket:"data source here")
 |> range(start: -90m, stop: now())
 |> filter(fn: (r) => r._measurement == "Points" and .. )

// Defining location data table
locations_table = from(bucket:"data source here")
 |> range(start: -90m, stop: now())
 |> filter(fn: (r) => r._measurement == "Locations" and .. )

// Running the querries and then joining the tables
all_points = join(
  tables: {locations:locations_table, points:points_table},
  on: ["Location"]
 )

// Running the join
all_points

1 Like

I don’t think you need to use outer join for this. If the tables are in the same database, you can solve this with inner join. I did, but my datasource is sqlite, so it all depends what your datasource allows you to do.
I’m a grafana noob, somebody will probably correct me…

My example:
Table 1 (queries)

Table 2 (network_addresses)

My query:

SELECT strftime('%d/%m/%Y %H:%M:%S', datetime(queries.timestamp, 'unixepoch', 'localtime')), count(queries.domain), queries.domain, queries.client, network_addresses.name FROM queries INNER join network_addresses ON queries.client = network_addresses.ip WHERE status = '1' GROUP BY queries.domain ORDER BY MAX(queries.timestamp) DESC LIMIT 10;

You’ll need some overrides to show the desired display name, but it works for me the name comes out of the second table)…

Again, I’m a grafana noob, so sorry if the info is irrelevant, just trying to help, I spend a lot of time trying to do this with outer join, and failed to find a working solution.

I’ve been working with the influx team this last week on converting my InfluxQL dashboard to Flux - but the performance of my dashboard took a significant hit, 30-40 second queries for each panel instead of the 2-3 second queries InfluxQL provided - Apparently Flux wasn’t designed with tables in mind.

I do have some time series stuff, but the majority of it is in table form - So I spent last night getting some test data into a SQLite database to make sure Grafana would play nice and that I could get similar query times and functionality to InfluxQL - and so far so good.

So I find myself this morning trying to psyche myself up to delete huge chunks of the front end and back end to prepare for a new SQLite implementation and the ensuing refactor.

… maybe after breakfast and another coffee…

2 Likes

This past week, I have been working with the Flux team on converting my InfluxQL dashboard to Flux - however, I was extremely disappointed with the performance of the dashboard, with the performance of each panel taking 30-40 seconds instead of the 2-3 seconds InfluxQL provided. Apparently Flux wasn’t designed for tables.

While I have some time series stuff, the majority of it is in the form of tables, so last night I spent some time getting a SQLite database online in order to test Grafana’s functionality and see if I would be able to get some query time and functionality similar to InfluxQL - so far, so good.

It’s taken me until now to gather up the courage to delete large chunks of the front end and back end code in order to prepare for the new SQLite implementation and the subsequent refactoring that follows.

Yeah, I ended up deleting just about all my influx code and made the switch to SQLite for this particular project.

I do still use Influx for when I specifically need something in a time-series format - but anything table / calculation related SQLite seems to be the answer.

This topic was automatically closed after 365 days. New replies are no longer allowed.