Having trouble with an outer join

  • What Grafana version and what operating system are you using?

Windows 10 / Grafana v7.5.6

  • What are you trying to achieve?

I have two separate tables that have information about a common location, what I would like to do is combine the information of the two tables centered on the location.

The issue is that its combining/reducing all the information of both table A and table B into a single point.

For example if the Points table is setup like this -

Date, Location, Temp, Humidity 
date1, (Lat, Long), 80, 20 
date2, (Lat, Long), 75, 15
date3, (Lat, Long), 78, 10
date4, (Lat, Long), 77, 12

And Locations Table is setup like this -

Location, City, Contact, Stations, Type
(Lat, Long), Metropolis, weather@city.com 
(Lat, Long), , , 3, JSON

I would like the table to look like this (order doesn’t matter) -

Date, Location, Temp, Humidity, Location, City, Contact, Stations, Type
date1, (Lat, Long), 80, 20, Metropolis, weather@city.com, 3, JSON
date2, (Lat, Long), 75, 15, Metropolis, weather@city.com, 3, JSON
date3, (Lat, Long), 78, 10, Metropolis, weather@city.com, 3, JSON
date4, (Lat, Long), 77, 12, Metropolis, weather@city.com, 3, JSON
  • How are you trying to achieve it?

So after I created the panel, I then used two search queries A & B,

Select * from Points
Select * from Locations Group By "Locations"

Then I clicked on the Transform Tab and selected Outer Join centered on Location.

  • What happened?

The table that is produced looks like this -

Date, Location, Temp, Humidity, Location, City, Contact, Stations, Type
date, (Lat, Long), 77, 15, Metropolis, weather@city.com, 3, JSON

It’s a single entry from Table A (possibly an Average?)

  • What did you expect to happen?

I expected all the rows from Table A to be present and then for table B to fill in the extra data on top of table A.

Can anyone give me a hand in setting this up?

Thank you in advanced

I little update -

After a bit of testing, I found that when the Transform > Outer Join is selected, it is choosing the last value for each (lat/long) in the Points table to form the Join on.

I tried adding a unique Tag to each Point in Points and then using the query -

Select * from Points Group By "UniqueTag", "Location"

To see if I could force it to recognize each value separately, but when I use the Outer Join transformation, it just reduces everything down to a single last Location.

Any thoughts?

Hi @Nefariis,

What is your datasource?

@mattabrams - InfluxDB 1.8.6

Two quick thoughts on this:

  1. The Grafana “Outer join” transform is unfortunately not actually an outer join. See Transforms: Rename outer join transform to something more descriptive · Issue #27325 · grafana/grafana · GitHub and Transformations: Implement proper outer join · Issue #26316 · grafana/grafana · GitHub. The “Merge” transform actually does something that’s closer to an outer join, though it seems to be hit-and-miss.
  2. @Nefariis I’m not sure if this is a helpful comment, but it might be worth re-evaluating whether Influx is the best database for this kind of use case. At least one of your tables/measurements doesn’t appear to be timeseries data to begin with. I suspect a SQL database would be a better fit, especially given the need to do joins on tables.

I think you can achieve this by moving to flux instead of influxql. Slightely meta version of the querry.

// Defining points table
points_table = from(bucket:"data source here")
 |> range(start: -90m, stop: now())
 |> filter(fn: (r) => r._measurement == "Points" and .. )

// Defining location data table
locations_table = from(bucket:"data source here")
 |> range(start: -90m, stop: now())
 |> filter(fn: (r) => r._measurement == "Locations" and .. )

// Running the querries and then joining the tables
all_points = join(
  tables: {locations:locations_table, points:points_table},
  on: ["Location"]

// Running the join

I don’t think you need to use outer join for this. If the tables are in the same database, you can solve this with inner join. I did, but my datasource is sqlite, so it all depends what your datasource allows you to do.
I’m a grafana noob, somebody will probably correct me…

My example:
Table 1 (queries)

Table 2 (network_addresses)

My query:

SELECT strftime('%d/%m/%Y %H:%M:%S', datetime(queries.timestamp, 'unixepoch', 'localtime')), count(queries.domain), queries.domain, queries.client, network_addresses.name FROM queries INNER join network_addresses ON queries.client = network_addresses.ip WHERE status = '1' GROUP BY queries.domain ORDER BY MAX(queries.timestamp) DESC LIMIT 10;

You’ll need some overrides to show the desired display name, but it works for me the name comes out of the second table)…

Again, I’m a grafana noob, so sorry if the info is irrelevant, just trying to help, I spend a lot of time trying to do this with outer join, and failed to find a working solution.

I’ve been working with the influx team this last week on converting my InfluxQL dashboard to Flux - but the performance of my dashboard took a significant hit, 30-40 second queries for each panel instead of the 2-3 second queries InfluxQL provided - Apparently Flux wasn’t designed with tables in mind.

I do have some time series stuff, but the majority of it is in table form - So I spent last night getting some test data into a SQLite database to make sure Grafana would play nice and that I could get similar query times and functionality to InfluxQL - and so far so good.

So I find myself this morning trying to psyche myself up to delete huge chunks of the front end and back end to prepare for a new SQLite implementation and the ensuing refactor.

… maybe after breakfast and another coffee…