How can reading a report be like reading a comic? Trellis Charts

How can reading a report be like reading a comic? Comics have a particular accepted structure, where each panel tells its own story, but the whole number of panels together tell a complete story. In a comic, the individual panels combine together to form a narrative. If you are interested in reading more about the structure of comics, then I recommend that you read ‘Understanding Comics: The Invisible Art’ by Scott McCloud. I first saw the idea of relating reporting to comics in the brilliant Dataspora blog, and I was interested to relate this concept to SQL Server Reporting Services. The purpose of this blog will show you how easy it is to create a Trellis report in reporting services, followed by some data visualisation principles and some thoughts on interpreting the data, and finally some references for further reading. 

I have been presenting this material at some of the SQL Server User Groups in the UK, and thought it might be useful to give the information out in blog format too. 

In the sphere of reporting, it is also possible to construct a panel of reports in order to form a narrative of what the data is telling the data consumer. The term for a panel of reports is a trellis chart (Cleveland, 1997) is a ‘chart of charts’ – it is a multi-panel of small, similar charts. The design of a trellis is hallmarked by the arrangement of the small charts in a lattice or grid fashion, whereby the charts are arranged into rows, columns or even pages.

Each chart in the trellis is similar; for example, all the charts in the trellis should be the same style, for example, scatterplot, bar charts and so on. However, each chart differs in terms of the category variable that is displayed. This means that the trellis structure can help the data consumer to understand the interactions between the variables that produce the result.

Trellis charts are useful for displaying data that is multivariate and/or has large data sets.  If a chart has a lot of bars and lines, then it can become difficult for the user to understand. By splitting the data out by category, Trellis charts can help the data consumer to understand the structure of the data by removing the need for 3D representation, or by having too many bars and lines.  Here is an example of a trellis chart. If you can’t see it properly, then please click here to go to flickr:

CompletedTrellisExample



The above trellis chart shows the BP Statistical Review of World Energy 2010, and shows the expected resources for oil production in Thousand Million gallons per day. I have provided a cut down version of the CSV file here, for your reference. 

This blog will explain how to create the above trellis chart using SQL Server Reporting Services 2008 R2. The reason we use this version of SQL Server is because it has the Matrix component, as we will see as we progress through this blog.

First things first: a data source! We will need a table that is populated with data, and a stored procedure to extract the data. I have given you samples below.

1.      Create and Populate the Table
If you need a script to create a table, here is one for you:

CREATE TABLE [dbo].[WorldRegionBPProvedReserves](
       [Year] [float] NULL,
       [Region] [nvarchar](255) NULL,
       [Country] [nvarchar](255) NULL,
       [Thousand Million Barrels] [float] NULL
) ON [PRIMARY]

You can import the CSV file into the table using the SSIS wizard in SSMS, just to make life easier for you.

2.      Extract the data using a Stored Procedure

Once the data has been imported, you’ll need a stored procedure to extract the data from the database. Here is a sample stored procedure for you:

/****** Object:  StoredProcedure [dbo].[usp_MV_WorldRegionBPProvedReserves]    ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[usp_MV_WorldRegionBPProvedReserves]
–exec [dbo].[usp_MV_WorldRegionBPProvedReserves]
AS
SELECT [Year]
      ,[Region]
      ,[Country]
      ,CAST([Thousand Million Barrels] AS INT) AS [Thousand Million Barrels]
  FROM [SQLBits].[dbo].[WorldRegionBPProvedReserves]
 WHERE
 Country IN
 (
‘Canada’,‘Iran’,‘Kazakhstan’,‘Kuwait’,‘Mexico’,‘Other Europe & Eurasia’,‘Russia’,‘Saudi Arabia’,‘United States of America’
 )
 
GO

Once you’ve created the stored procedure, you should test that it works so far by executing it:

EXEC [dbo].[usp_MV_WorldRegionBPProvedReserves]

Et Voila! You should see the data. If you don’t see the data, then you will need to resolve this issue before you try to move forward.

3.      Create a new Project, Data Source and Dataset

Once you are confident that the stored procedure is returning data, it is time to create a new project, with a corresponding data source and a dataset. If you need information on how to do this, then please refer to the Microsoft tutorials for clear explanations on how to create a new project, data sources and datasets.
Once these have been created, add a new report to the project, and give it an appropriate name.

4.      Using the Matrix Component to display and group

Use the Matrix component; simply drag it from the toolbox, and pop it onto the canvas. I like to name my components, so I am going to call this one Matrix_WorldRegionBPProvedReserves.
Take the chart component, and drag it to the matrix cell. Select a 2D Bar Chart.
Then, the bar chart will need to be populated with data. We will display the data by region and by year.

The grouping section of the SSRS report should appear as follows:

1. Grouping of Rows and Columns



In other words, the rows should be grouped by years, and the columns should be grouped by columns.

5.      Populating each chart with data
The next step is to populate each chart with data. For the values, we want to display the number of barrels measured in thousand million units. For the category grouping, we want to select ‘Country’. Here is a sample picture:

2. Populating the chart with data


6.      Data Visualisation – making the trellis consistent

Remove the legend from each chart – it gives the data more room to ‘breathe’, and is unnecessary repetition. If it helps, make it clear in the title, or create a new legend outside of the individual charts. Removing the unnecessary repetition means that the data/ink ratio is maximised, since unnecessary redundant information is removed.

Remove the chart title from each chart – again, this is unnecessary repetition. The chart title takes up space, and it can be easily placed outside the trellis structure.

Remove the axis title – again, this is repeat information. In this case, it is obvious that the axis is a country, and there is no need to add it in.

The Axis scale needs to be made consistent – if not, it will confuse the users. If this is done automatically, it will vary from chart to chart. However, it is important that the axis sizes all match correctly, or the user will get incorrect information. To adjust the axis size, right click on the axis and set the maximum value to be the same for all charts.

7.      Interpreting the Report

One of my favourite quotes from ‘Now you See it’ by Stephen Few is that ‘comparison is the beating heart of analysis’. This is one of my mantras for data visualisation. It’s clear from the trellis example that, if the charts were all different sizes, it would be very difficult to compare. However, by ensuring that the X-axis of the individual charts all match, then it is comparison is facilitated. Put another way, if the X-axis for each chart does not match, then the data navigator may misinterpret the data.

When we preview the report, it should appear as follows. If you can’t see it properly, then please click here to go to flickr:

CompletedTrellisExample



The data is interesting. If you look at the data for Mexico, for example, then you can see that the expected resources have decreased from 1989 to 2009. Since the trellis chart allows us to compare up and down easily, we can see changes over time. We can also see a similar trend for ‘Other Europe and Eurasia’, which is a combination of the resources for some smaller European countries. 

We can also compare ‘across’ the table as well as ‘down’ the table. For example, Russia, Iran and Canada have increased their expected resources from 1989 to 2009.

This chart could be further improved by adding in colour to show the maximum value, for example, or by adding in more countries to look for further patterns. To summarise, the trellis chart can be a useful narrative for displaying data that is multidimensional in nature, and means that 3D is not necessary. By using the analogy of reading a comic, it is hoped that the trellis chart can be more easily understood and introduced to people who may not be familiar with the concept.

References
Trellis Display from Bell LabsVisualizing Data by William S. Cleveland (1993)Envisioning Information by Edward R. Tufte (1990). This provides a clear explanation of the ‘Small Multiples’ concept, as well as the ‘data ink’ ratio.

One thought on “How can reading a report be like reading a comic? Trellis Charts

  1. Great post, Jen. I especially appreciate the part on how to make the scale consistent and eliminate redundant data legends. Very useful idea for using small multiples in SSRS.

Leave a Reply