Data frame design: Basics

Data sources translate data from external services and APIs into a format that panels can understand. In other words, data sources produce data and panels consume data.

Generally, data sources don’t know how the data they produce will be presented. While they can suggest a preferred type of visualization, the user is ultimately free to visualize the data in any way they want.

In practice, this means that a panel depends on the data source to produce the data it needs to do its job. If a data source produces incomplete data, it’s up to the panel to tell the user what’s missing. Conversely, a query may return 20 fields while the panel uses only two of them and ignores the rest.

What is a data frame?

The format that data sources and panels use to communicate is called a data frame. Think of a data frame as a table with a set of rows and columns, where the columns in the data frame are called fields.

Much like a SQL table, a field can have a name and a type. The name describes what the field represents, and the type determines what kind of data the field contains, for example strings, numbers, or booleans.

Note: Data frames (and fields) can contain additional metadata, such as labels or custom stats, but in this post we’ll focus on just the field name and type.

In essence, a data frame is merely a container of named and typed data. Panels give that data meaning by assigning the fields to different dimensions of a visualization, for example by using a numeric field to control the radius in a scatter plot.
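
As a rough illustration of that idea, here is a minimal sketch of a data frame and of a panel mapping fields to scatter plot dimensions. The `Field` and `DataFrame` types below are simplified stand-ins written for this example, not the actual types from `@grafana/data`, and the field names (`latency`, `errors`, `requests`, `service`) and the `toScatterPoints` helper are hypothetical.

```typescript
// Simplified stand-ins for illustration; these are not Grafana's real types.
type FieldType = 'time' | 'number' | 'string' | 'boolean';

interface Field {
  name: string;
  type: FieldType;
  values: unknown[]; // one entry per row
}

interface DataFrame {
  fields: Field[];
}

interface Point {
  x: number;
  y: number;
  radius: number;
}

// Hypothetical frame: three Number fields and one String field.
const frame: DataFrame = {
  fields: [
    { name: 'latency', type: 'number', values: [120, 80, 200] },
    { name: 'errors', type: 'number', values: [3, 1, 9] },
    { name: 'requests', type: 'number', values: [10, 40, 25] },
    { name: 'service', type: 'string', values: ['api', 'web', 'db'] },
  ],
};

// Look up a Number field by name and return its values.
function numericValues(source: DataFrame, name: string): number[] {
  const field = source.fields.find((f) => f.name === name && f.type === 'number');
  if (!field) {
    throw new Error(`No number field named "${name}"`);
  }
  return field.values as number[];
}

// The "panel" decides which field drives which dimension of the visualization.
function toScatterPoints(source: DataFrame, xName: string, yName: string, sizeName: string): Point[] {
  const xs = numericValues(source, xName);
  const ys = numericValues(source, yName);
  const sizes = numericValues(source, sizeName);
  return xs.map((x, i) => ({ x, y: ys[i], radius: sizes[i] }));
}

// The String field "service" is simply ignored by this mapping.
const points = toScatterPoints(frame, 'latency', 'errors', 'requests');
```

The frame itself carries no notion of "x" or "radius"; that meaning only appears when the panel chooses which field to assign to which dimension.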

Data frame examples

To see how data frames are used in practice, let’s look at how a few core visualizations consume them.

You can display the data frame passed to a visualization by toggling Table view in panel edit mode. Each column header shows an icon that indicates the type of the field.

Time series

If you want your data source to support the Time series visualization, make sure that your data frame has at least one Time field and one Number field:

time (Time)    | cpu (Number)
1635312236502  | 23
1635319376502  | 40
1635319796502  | 46

Note: The Time series panel supports additional formats that we’ll explore in a future post.
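
To make the requirement concrete, here is a minimal sketch of how a panel might pick the fields it needs, ignore the rest, and report what is missing. The types are simplified stand-ins written for this example rather than the real `@grafana/data` types, and `pickTimeSeriesFields` is a hypothetical helper, not how the Time series panel is actually implemented.

```typescript
// Simplified stand-ins for illustration; these are not Grafana's real types.
type FieldType = 'time' | 'number' | 'string' | 'boolean';

interface Field {
  name: string;
  type: FieldType;
  values: unknown[];
}

interface DataFrame {
  fields: Field[];
}

interface TimeSeriesInput {
  time: Field | undefined;
  value: Field | undefined;
  missing: FieldType[]; // field types the frame does not provide
}

// Hypothetical helper: take the first Time field and the first Number field,
// ignore everything else, and list the field types that are missing.
function pickTimeSeriesFields(frame: DataFrame): TimeSeriesInput {
  const time = frame.fields.find((f) => f.type === 'time');
  const value = frame.fields.find((f) => f.type === 'number');
  const missing: FieldType[] = [];
  if (!time) {
    missing.push('time');
  }
  if (!value) {
    missing.push('number');
  }
  return { time, value, missing };
}

// A frame with the two required fields plus an extra String field.
const cpuFrame: DataFrame = {
  fields: [
    { name: 'time', type: 'time', values: [1635312236502, 1635319376502, 1635319796502] },
    { name: 'cpu', type: 'number', values: [23, 40, 46] },
    { name: 'host', type: 'string', values: ['web-1', 'web-1', 'web-1'] },
  ],
};

// A frame that has neither a Time field nor a Number field.
const messagesOnly: DataFrame = {
  fields: [{ name: 'message', type: 'string', values: ['user created'] }],
};

const usable = pickTimeSeriesFields(cpuFrame); // uses "time" and "cpu", ignores "host"
const unusable = pickTimeSeriesFields(messagesOnly); // reports both types as missing
```

A panel could turn a non-empty `missing` list into a message for the user instead of rendering an empty graph.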

Logs

Just like time series, logs have a Time field, but instead of numeric data, the Logs visualization expects the second field to be a String field:

time (Time)    | message (String)
1635312236502  | user created
1635319376502  | item added to order
1635319796502  | order paid

Note: The Logs panel supports additional formats that we’ll explore in a future post.

Table

If the data frame doesn’t have a Time field, you can still count on the Table visualization. The Table displays the entire data frame, which also makes it great for debugging.
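
The reason a table can always display a frame is that it needs no particular field types: it only turns the columns into rows. The sketch below illustrates this with simplified stand-in types written for this example (not the real `@grafana/data` types); the `toRows` helper and the sample field names are hypothetical.

```typescript
// Simplified stand-ins for illustration; these are not Grafana's real types.
type FieldType = 'time' | 'number' | 'string' | 'boolean';

interface Field {
  name: string;
  type: FieldType;
  values: unknown[];
}

interface DataFrame {
  fields: Field[];
}

// Hypothetical helper: convert a column-oriented frame into a header row
// followed by one row of strings per entry, whatever the field types are.
function toRows(frame: DataFrame): string[][] {
  const header = frame.fields.map((f) => `${f.name} (${f.type})`);
  const rowCount = Math.max(0, ...frame.fields.map((f) => f.values.length));
  const rows: string[][] = [];
  for (let i = 0; i < rowCount; i++) {
    // Missing values are rendered as empty cells.
    rows.push(frame.fields.map((f) => String(f.values[i] ?? '')));
  }
  return [header, ...rows];
}

// A frame without any Time field.
const statusFrame: DataFrame = {
  fields: [
    { name: 'service', type: 'string', values: ['api', 'web'] },
    { name: 'healthy', type: 'boolean', values: [true, false] },
  ],
};

const tableRows = toRows(statusFrame);
```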

Design for interoperability

The flexibility to freely combine any data source with any panel comes with the responsibility to play nice with other plugins.

  • If you’re developing a panel plugin, then document the data frame format it expects. Otherwise, your users need to guess what type of query they need to build.

  • If you’re developing a data source plugin, then avoid imposing unnecessary restrictions on how the user can visualize the data. Unconventional data formats limit the usability of the data source.

But then what constitutes a “conventional” data format? How should you design your data to support as many data sources or panels as possible?

During the coming weeks, we’ll publish more articles on data frame design. Get notified of new content by clicking the bell icon in the top-right corner of the Share & showcase category, and then selecting Watching.

Meanwhile, check out the Data frames documentation for a more in-depth explanation of data frames.

For code examples on how to work with data frames, refer to Working with data frames.

Are you building a panel plugin? What does it expect from the input data frame?


To find out what kind of data frames a panel expects, there’s a useful tool, the TestData data source:

  • create a dashboard panel with the visualization you are interested in
  • choose the TestData data source
  • choose the “Raw Frames” option, then enter a list of data frames directly (in JSON format) and check how they get visualized

If you know a data source that works with the panel and it’s a backend data source (for example, Prometheus), switch to that data source, run a query that works, open your browser’s dev tools, and copy the AJAX response for that query. Paste it into the TestData data source mentioned above, and it will work. From there, you can modify the JSON data to see what works and what doesn’t. (NOTE: this only works when the source is a backend data source, because those return data frames in the AJAX response.)
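
For a starting point without copying a response, the sketch below generates a JSON list of frames that could be pasted into the Raw Frames editor. The shape used here (a list of objects, each with a `fields` array of `name`, `type`, and `values`) is an assumption and may differ between Grafana versions; a response copied from a working backend data source remains the reliable reference. The `RawField` and `RawFrame` types are hypothetical names for this example.

```typescript
// ASSUMPTION: the Raw Frames editor accepts a JSON array of frames shaped like this.
// Verify against a real response copied from your own Grafana instance.
interface RawField {
  name: string;
  type: 'time' | 'number' | 'string' | 'boolean';
  values: Array<number | string | boolean>;
}

interface RawFrame {
  fields: RawField[];
}

// Same data as the Time series example in the original post.
const rawFrames: RawFrame[] = [
  {
    fields: [
      { name: 'time', type: 'time', values: [1635312236502, 1635319376502, 1635319796502] },
      { name: 'cpu', type: 'number', values: [23, 40, 46] },
    ],
  },
];

// Pretty-printed JSON, ready to paste and then modify by hand.
const rawFramesJson = JSON.stringify(rawFrames, null, 2);
```

Changing a field’s `type` or removing a field in the pasted JSON is a quick way to see which frames a given visualization accepts.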


UPDATE: (just for transparency) - I have edited Marcus’ original post to correct/update the links in his post that were 404’ing. As of today, they’re good.