SQLUniversity: Introduction to PowerPivot and Mobile Business Intelligence

Here is the second post for SQL University, where we introduce the topic of mobilising PowerPivot data. As an introduction, I thought it might be useful to share the presentations that I did recently. Tomorrow’s blog will talk more about the details of mobilising PowerPivot data using Tableau, but I thought that these articles might provide some introductory material for people to use, and re-use, in order to expose the usefulness of PowerPivot as a data source. Here is a brief overview:

Essentially, the Ordnance Survey and Census data were mashed up and linked together in PowerPivot. The PowerPivot workbook was then used as a data source for Tableau, which was used to visualise the data. As an introduction to the process, I produced some presentations to use when talking about this solution, and I have provided them here.
Here is a very introductory presentation on PowerPivot in SharePoint. It was designed to be given in around 5 minutes before being supported by a series of demos, and the intended audience was people who hadn’t seen PowerPivot before.

The following presentation came from SQLBits, where I gave a discussion on mobilising business intelligence with PowerPivot. Tableau Software was used in order to display the data from the PowerPivot, and the next blog in the series will present more technical detail on how the solution was built.

Here is the presentation that I produced for SQL Server Days in Belgium in November 2011:

I hope that you enjoy these presentations, which I’ve provided for people to use and enjoy as they wish. Tomorrow’s SQLUniversity post will provide more detail on the mobilisation of PowerPivot with Tableau.

Windows Azure Marketplace – what data sources would you like to see?

During my presentations at SQLBits, SQLRelay and other UK User Group meetings, I have been dismayed by the lack of awareness of the Windows Azure Marketplace. This blog aims to explore some of the reasons that this may be happening, and I’d also like to canvass you, dear reader, so you can highlight the data sources that you would like to have in the Datamarket.
First of all, the Windows Azure Datamarket is not to be confused with the similarly named Datamarket, which is a company based in Iceland. The Windows Azure Datamarket is a broad-reaching collection of subscription-based data services, including applications and a variety of data for consumers and businesses to use. It is available in 26 countries at the time of writing in October 2011. It is a marketplace in the sense that it is possible to buy and sell data and applications. The types of data available include financial, property, geographical and retail data, and even fun sports data. The data from the Windows Azure Marketplace can be consumed by Excel, Tableau and Visual Studio.
One intention of the Windows Azure Marketplace is that it will support business analysts everywhere, in their quest for clean, up-to-date data. I believe it is potentially a very powerful source of data for enterprises. For example, by provisioning clean, “looked after”, up-to-date datasets, it can reduce the amount of effort in looking after external data. In other words, companies who already ‘clean up’ external data sets might look to the Windows Azure Marketplace in order to see if there are existing datasets that could be rented. It’s the old problem of ‘outsource or internal spend’ – but at least it is good to have options to explore.
So, given the potential of the Windows Azure Marketplace as a data store, why the lack of awareness or uptake? On my recent travels to various User Groups, SQLBits and so on, hardly anybody had heard of it, never mind actually used it in production. I am guessing that one reason for this is that UK-focused data sources aren’t plentiful. My research showed that there were a number of UK data sources available. These included:
In other words, not very many sources! My search was hampered by the fact that the search string must contain at least three characters. Therefore, if you are searching for ‘UK’ then you are stuffed! I am guessing that uptake isn’t very strong because the UK-focused data still needs to grow, and I expect that this will happen over time. Since there is an Excel add-in for the Marketplace, the route to uptake of this service is clear. It is potentially a very powerful tool for analysts and researchers.
Hence this blog: I am wondering what UK data sources you would like to see? Here is my list of free data sources that I’d love to see on the Marketplace as a one-stop-shop for data requirements:
– The Guardian Datastore: basically anything that they produce. Love it!
– UK Census data: since the next Census is out soon in the UK, it would be particularly relevant to have this information.
– The Data Archive: Social Sciences and Humanities data for the UK. Not as esoteric as they might sound since they also discuss the future of data sources. This is a reflective data store, and I’d recommend that you take a look at it.
– Health and Safety Executive Data: Risk Control, Public health and comparison with other European countries.
– Heidi: I have never been able to access this, but it is available to Education planners.
– The Treasury: UK data on finance and key financial indicators.
– The Bank of England: a wealth of financial data, focused on the UK.
– Office for National Statistics: data on agriculture, children, economy, government, travel… you name it!
If you can think of any other data sources you would like to see on the Windows Azure Datamarket, then please leave a comment. I’d love to hear from you and you’d also satisfy my never-ending thirst for more data sources!

Mobile Business Intelligence – Try it out!

Thank you to everyone who attended my SQLBits ‘Mobile Business Intelligence in Action’ session recently. If you are interested in trying out Mobile Business Intelligence on your iPad or mobile device, the links are below:

Jedi Knight Actuals of UK Census 2001 Dashboard 
Jedi Knights Percentage of UK Census 2001 Dashboard
AdventureWorks Sales by Geography Dashboard
AdventureWorks Actuals Sales
AdventureWorks Analysis Dashboard

I haven’t tried this on every browser and every device, so I would be very interested in your feedback.
I look forward to hearing from you. Please leave a comment below, or email me at jenstirrup [at] jenstirrup.com

Representing data about the iPad

The current blog will take three different ways of representing the same data set, in order to see how it can be done simply and clearly – or not so clearly. I have taken some samples, and reworked them as a progression throughout this blog.

Although I am discussing the iPad here, this is not a preview of the iPad and Mobile Business Intelligence sessions which I’m delivering at SQLBits in October, or of my User Group sessions in Leeds and Surrey this year. However, the iPad is obviously very much on my mind, hence the tangential topic of this blog!

The dataset is interesting because it aims to show the impact of the iPad announcement on notebook sales. This study was conducted by NPD, Morgan Stanley Research. CNN Money has written a short article on the impact of the iPad on netbook sales, which proposes that the iPad is at least ‘partially’ responsible for the decline in netbook sales. The rather dramatic bar chart, which underlines this point, is given here:
There are a few issues with the bar chart:

– The axis doesn’t go from 0 – 100%, which I would expect, given that it is supposed to show percentages. This skews the results slightly; for example, the 70% seems higher than it is.
– The 3D gradient effects don’t add anything. Sometimes 3D can make an image look more ‘pretty’. Here, the 3D does not make the chart any prettier, nor does it enhance the message of the data.
– It’s not clear why the data has been represented as distinct categories when time is continuous rather than discrete.
– The big pink arrow shouldn’t have been necessary; the graphic should have been enough.
– There is nothing to make the negative value stand out, or to distinguish it in any way.
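
The point about the axis can be checked with a little arithmetic. The following is only a minimal sketch: the function name `fill_fraction` and the axis maximum of 80% are hypothetical values chosen for illustration, and only the 70% figure comes from the chart discussed above. It compares how much of the vertical axis a 70% bar fills when the axis stops short of 100% with how much it fills on a full 0 – 100% axis.

```python
def fill_fraction(value, axis_top):
    """Return the share of the zero-to-axis_top range covered by a bar."""
    if axis_top <= 0:
        raise ValueError("axis_top must be positive")
    return value / axis_top


# Hypothetical axis maxima; only the 70% value is taken from the chart.
for axis_top in (80, 100):
    share = fill_fraction(70, axis_top)
    print(f"Axis top {axis_top}%: a 70% bar fills {share:.1%} of the axis")
```

Under these assumptions, the same 70% bar fills 87.5% of the available height when the axis stops at 80%, which is why it reads as a larger value than it really is.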

There have been other examples of the same data, re-visualised. Here is an example from a wonderful infographic, which has been completed by the Focus Group. I have taken an excerpt of it here since the whole infographic is not the focus of this blog:

iPad and Notebook sales by the Focus Group

The above infographic solves some of the issues of the earlier version, which was reproduced by CNN Money.

– There is no 3D
– The big arrows have gone

However, although it is visually appealing, it does repeat some of the earlier issues found in the CNN Money chart, since the scale still does not reach 100% on the Y axis. Further, it also introduces some new issues:

– The black background might be visually appealing, but as a ‘best practice’, a white background is better. This allows the representation of the data to dominate the scene, not the background or other unnecessary items.
– Hatched lines replace the arrows, to denote the time of the announcement of the iPad and the actual release of the iPad. This is an issue because it is slightly jarring to the eye.
– The timeline isn’t evenly marked in terms of months; it is therefore difficult to ascertain whether the data is skewed horizontally in any way.

To improve on these representations of the data, I have used Tableau to create a simple line graph. This was all that was needed to get the message across without skewing or obscuring it. My example is below, and can also be found on the Tableau Public website:

iPad and Notebook Sales

I have removed the issues found in the earlier visualisations and added some further enhancements:

– highlighted the negative growth percentage with red colouring
– added in clean annotations which do not obscure other parts of the data visualisation
– ensured that the Y-axis shows 100% so that the data is not skewed
– used a line graph since the X-axis is continuous, not discrete
– removed the black background to emphasise the components of the data that provide the message of the data

Although the data visualisation has been improved, there are still contextual questions which the graph cannot answer:

– what about the impact of the iPhone, or other tablets?
– what about the impact of the time of year e.g. post-Christmas sales?
– what about the impact of the impending recession?

Therefore, the initial analysis as described by CNN Money simply provided a ‘headline’ message, and further analysis would need to be conducted in order to answer the question more fully. That said, a proper visualisation of the data is a useful tool towards getting the ‘bigger picture’ right, as well as the ‘smaller picture’.

I hope that this was interesting, and look forward to your comments.
Jen x

Business data: 2D or 3D?

One debate in data visualisation concerns the use of 2D or 3D charts. There is an interesting assessment here, conducted by Alastair Aitchison, and it is well worth a read.
3D visualisations are good for certain types of data e.g. spatial data. One good example of 3D in spatial analysis is given by Lie, Kehrer and Hauser (2009), who provide visualisations of Hurricane Isabel. 3D has also been shown to be extremely useful for medical visualisation, and there are many examples of this application. One example for many parents is a simple, everyday miracle: anyone who has known the experience of seeing their unborn child on a screen will be able to tell you of the utter joy of seeing their healthy child grow in the womb via the magic of medical imaging technology. Another example comes from cancer studies, where researchers have used visualisation in order to detect brain tumours (Islam and Alias, 2010).
For me, data visualisation is all about trying to get the message of the data out to as many people as possible. Think of John Stuart Mill’s principle of utilitarianism: the greatest happiness for the greatest number of people. In data visualisation, something similar applies; we can make people happy if they get at their data. However, for the ‘lay public’ and for business users, 3D isn’t good for business data because people just don’t always ‘get’ it easily. Note that medical staff do undertake intensive training in order to assess scans and 3D images, and this subset is excluded from the current discussion, as is spatial data. Hopefully, by restricting the ‘set’ of users to business users, the argument goes from the general to the specific, where it is easier to clarify and give firmer answers to the ‘grey’ subject of data visualisation.
Data Visualisation is not about what or how you see; it’s ‘other-centric’. It’s about getting inside the head of the audience and understanding how to help them see the message best. It is often difficult to judge what business users – or people in general – will find easiest to understand. It is also difficult to ascertain what visualisations can best support a given task. Ultimately, I like to stick to the best practices in order to try and answer the data visualisation question as well as possible and to make things as clear for everyone as possible.
Part of my passion for data visualisation comes from personal experience; I was told when I was quite young that I was going blind in one eye. Fortunately, this proved not to be the case, and I can see with two eyes. When my son was born, I saw him with two eyes, and for that I am extremely grateful. Having been through the experience of learning that I might go through life with impaired vision, I have been blessed to understand how precious our vision is, and I want to try and do something positive for others who have struggled with their vision. This experience is what has made me so passionate about making data visualisation clear and accessible to everyone, as far as possible.
One particularly relevant issue in data visualisation is the debate over 2D versus 3D: namely, whether to use 3D in data visualisation or not. Here, I specifically refer to the visualisation of business data, not infographics.
On one hand, 3D can make a chart or dashboard look ‘pretty’ and interesting. In today’s world, where we are bombarded with images and advanced graphical displays, we are accustomed to expecting ‘more’ in terms of display. We do live in a 3D world, and our visual systems are tuned to perceive the shapes of a 3D environment (Ware, 2004). 
The issue comes when we try to project 3D onto a 2D surface; we are trying to add an additional dimension to a 2D surface. This is a key issue in data visualisation, since we are essentially trying to represent high-dimensional entities on a two-dimensional display, whether it is a screen or paper.
Generally speaking, 3D graphs take longer for people to assimilate than 2D graphs, and they are more difficult to understand. Not everyone has good eyesight or good innate numerical ability, and it’s about getting the ‘reach’ of the data to as many people as possible without hindering or patronising. Perceptually, 2D is the simplest option, and the occlusion of data points is not an issue. Business users are also often more familiar with this type of rendering, and it is the ‘lowest common denominator’ in making the data approachable to the greatest number of people.
On the other hand, there is some evidence to suggest 3D graphs can, on occasion, be more memorable initially, but this isn’t any good if the data wasn’t understood properly in the first place. It can also be more difficult to represent labels and textual information about the graph. 
In terms of business data, however, 3D graphs can break ‘best practice’ on a number of issues:
 – Efficiency. Reading values from a 3D graph is inefficient since they can be difficult to compare. “Comparison is the beating heart of analysis” (Few). In other words, we should be trying to help users to get at their data in a way that facilitates comparison. If comparison isn’t facilitated, then this can make it more difficult for the users to understand the message of the data quickly and easily.
 – Meaningful. A graph should require minimum explanation. If users take longer to read it, and it increases cognitive load, then it can be difficult to draw meaningful conclusions. The introduction of 3D can mean chartjunk, which artificially crowds the ‘scene’ without adding any value. If you crowd the ‘scene’, then this can naturally distract rather than inform.
 – Truthful. The data can be distorted; occluding bars are just one example. If labels are misaligned or missing, this can also make the 3D chart difficult to read.
 – Aesthetics. 3D can make the graph look pretty, but there are other ways to do this which don’t distract or occlude.
Stephen Few has released a lot of information about 3D and I suggest that you head over to his site and take a look. Alternatively, I can recommend his book entitled ‘Now you See it‘ for a deeper reading since it describes these topics in more detail, along with beautiful illustrations to allow you to ‘see’ for yourself.
To summarise, what should people do: use 2D only? Here is the framework of a strategy towards a decision:
 – Look at the data. The data might be astrophysics data, in which the stars, and their type, could be identified by colour and brightness as well as location. If the data is best suited to 3D, such as spatial, astrophysics or medical data, then that’s the right thing to do. If the data is business data, where it is important to get the ‘main point’ across as clearly and simply as possible, then 2D is best since it reduces the likelihood of misunderstandings in the audience. Remember that not everyone will be as blessed with good sight or high numerical ability as you are!
 – Look at the audience. 3D can be useful if the audience are familiar with the data. I had a look at Alastair’s 3D chart and I have to say that I am not sure what the chart is supposed to show, probably because I’m not clear on the data. I am not an expert in spatial data, so I don’t ‘get’ it. I ask for Alastair’s understanding on this point, and I will be glad to defer to his judgement in this area (no pun intended). If you can’t assume that the viewers are familiar with the data, then it’s probably common sense to make it as simple as possible.
 – Look at the Vendors. Some vendors, e.g. Tableau, do not offer 3D visualisations at all, and bravely take the ‘hit’ from customers, saying that they are sticking to best practice visualisations and that’s the second, third, fourth, fifth and final opinion on the matter. 
In terms of multi-dimensional data representation, there are different methodologies in place to display business data that don’t require 3D, such as parallel co-ordinates, RadViz, lattice charts, sploms, scattergrams. I have some examples on this blog and will produce more over time. Further, it is also possible to filter and ‘slice’ the data in order to focus it towards the business question at hand, so that it is easier for business users to understand. 
I hope that SQL Server Denali Project Crescent will help business users to produce beautiful, effective and truthful representations of business data. I believe that business users will eventually start doing data visualisations ‘by default’ because it is built into the technology that they are using. Think of sparklines, which are now available in Excel 2010; this was exciting stuff for me! Hopefully Project Crescent will go down this route towards excellent data visualisation, but I recognise it will take time.
To summarise, the way around the ‘3D or not to 3D’ question in business data is to offer such beautiful, effective, truthful visualisations of business users’ data that adding 3D wouldn’t add anything more to them. The focus here has been on business users, since that’s where my experience lies; there are plenty of good examples of 3D in spatial, astrophysics and medical imaging.
To conclude, my concern is that the message of the data is clearly put across to the maximum number of people. Think John Stuart Mill again!
Photo of Stephen Few
The UK Tableau User Group committee are delighted to announce that the keynote speaker for our next event is Stephen Few. If you would like to register for this event, please do visit the registration site here.

Stephen writes the quarterly Visual Business Intelligence Newsletter, speaks and teaches internationally, and provides design consulting. In 2004 he wrote the first comprehensive and practical guide to business graphics, entitled Show Me the Numbers; in 2006 he wrote the first and only guide to the visual design of dashboards, entitled Information Dashboard Design; and in 2009 he wrote the first introduction for non-statisticians to visual data analysis, entitled Now You See It. Personally, I think that all of these books are fantastic, and I always recommend them at SQL Server User Groups and to my customers.

Please note that a maximum of two (2) delegates will be admitted per organisation. Should any organisation wish for further delegates to attend, an email can be sent to uktug@uktug.com. I suspect that there will be no places left, but you can try your luck! Any places allocated from the waiting list will be confirmed just prior to the event.

Full details of the Agenda will be published on http://www.uktug.com and the LinkedIn UK Tableau Users Group. 

Tableau and PowerPivot Presentation

As some of you may know, I recently presented at the Tableau European User Conference which was held in Amsterdam in May 2011. The topic was ‘Using Tableau with PowerPivot’, in which I explored the idea of PowerPivot as a data source and Tableau as the presentation layer. I wrote a blog on the topic of Tableau and PowerPivot recently, and the purpose of this post is to simply share the slides.

I thought that the slides might be useful, so I have posted them for you here.

I hope that you enjoy, and please do let me know your comments.