Upcoming National and International Speaking Events

The blog has been a bit quiet since I’m busy preparing for some upcoming events. Here is a list of some of the things which are keeping me busy!

Attunity-hosted webinar – Faster Business Insights By Enabling Real-time Data For BI & Analytics, 26th January

Reserve your space today! If users rely on reports that include stale or outdated information, the impact on ‘same day’ decision cycles is far greater than you think. Join us for this special webinar to hear industry business intelligence (BI) expert Jen Stirrup share her insights about new advances in technologies that better enable real-time data for BI and analytics. Attend this webinar to learn about:

  • Best practices to achieve low-latency data movement
  • How to overcome costly obstacles to provide low-latency data movement
  • New cost-efficient techniques to enable real-time data for BI and analytics
  • New approaches for implementing change data capture (CDC) technology with data replication
  • Live demonstration
  • And a lot more!

In addition, Richard Thomas, Attunity’s Director of Technical Services, will discuss how Attunity Replicate software plays a critical role in delivering real-time information across your organization. Plus, Jeff Cole, an Attunity Solutions Architect, will provide a live product demonstration. Don’t miss this opportunity.

Sweden SQL Server User Group at the World Trade Center, Stockholm, 30th January

We are pleased to announce that Jen Stirrup, joint owner of Copper Blue Consulting, will be participating in the Swedish User Group meeting on 30th January at the World Trade Center in Stockholm. Jen will be speaking, in English, on Data Visualisation and Business Intelligence.  If you’d like to register, please click here.

SQL Saturday Ireland Technical Launch for SQL Server 2012 at the Hilton Hotel, Dublin, 24th March

We are pleased to announce that Allan Mitchell and Jen Stirrup, joint owners of Copper Blue Consulting, will be giving individual presentations at the SQL Saturday event in Dublin on 24th March. Allan will be discussing Data Quality, and Jen will be discussing Data Visualisation and Business Intelligence. The event is supported by the Professional Association for SQL Server (PASS), and we will be attending the after-event party. We hope to see you there! To register, please click here.

SQLBits UK Technical Launch for SQL Server 2012, London, 31st March


Allan and I are each giving individual presentations at SQLBits. Allan will be talking about CDC in SQL Server 2012. I’ll be talking about Data Visualisation and Business Intelligence in SQL Server 2012. The Saturday event is free, so please do come along and join us. The event will be held at the Novotel London West, and for more information, please register here. Incidentally, the SQLBits site is powered by SQL Server 2012, so it’s worth a look for that reason, too!

PowerPivot in Denali: User Feedback and the Fourth Golden Rule of Interface Design

Here, we will look at an example of PowerPivot user feedback that has been improved in Denali compared with SQL Server 2008 R2.  User feedback is a vital part of any software. It is especially important in the case of error handling, regardless of whether it is a pre-emptive measure to prevent errors, or a post-error warning after the error has taken place.
Shneiderman has specified a series of eight ‘Golden Rules’ of interface design. Generally, I think that the ‘golden rules’ apply fairly well to graphs and charts, and I’m interested to explore this idea – hence the current series of blogs! 
Shneiderman’s Fourth Golden Rule is to design dialogues that yield a sense of closure. Thus, the activity has a beginning, a middle and an end. The information at the end of the activity provides a sense of closure to the user; it means that the path to the next step is clear, and the user can move on without anticipating any errors or issues with the activity in question.
There is one improvement in PowerPivot Denali which is illustrative of this point. In the current 2008 R2 release of PowerPivot, the ‘New Measure’ box provides us with an opportunity to check that the Measure formula is correct. Here is an example below. Can you spot what the issue might be? 
Check RELATED formula with Red ball
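In case the screenshot is hard to make out, here is a minimal sketch of the kind of measure being checked in that dialog. The table, column and measure names below are hypothetical AdventureWorks-style examples rather than the exact formula in the image:

    Reseller Margin :=
    SUMX (
        FactResellerSales,
        FactResellerSales[SalesAmount]
            - RELATED ( DimProduct[StandardCost] ) * FactResellerSales[OrderQuantity]
    )
    // RELATED pulls StandardCost from the related DimProduct row for each
    // fact-table row being iterated by SUMX.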

The issue is that, although the formula is technically correct, the initial visual feedback to the user shows a red ball. Generally, in the West, red denotes a ‘warning’ colour, indicating a problem; hence red traffic lights, for example.  A more appropriate notification to the user would be a green icon, which would indicate ‘success’ or ‘go’.  In PowerPivot Denali, this is exactly the improvement that we see, as the same screen in the Denali version shows. Here is an example below:

Check RELATED formula with green ball

Although this seems like a little thing, it is extremely effective. If you’ve ever held your breath to see if your PowerPivot formula has worked, and felt the sense of relief of realising that the red ball actually means ‘success’ in 2008 R2, then you’ll know what I mean! I am hoping that the tick will serve as feedback for colour-blind people if the green colour is an issue, and I look forward to your comments on this point.

Little things make me happy!

Business data: 2D or 3D?

One debate in data visualisation concerns the choice between 2D and 3D charts. There is an interesting assessment of this, conducted by Alastair Aitchison, and it is well worth a read.
3D visualisations are good for certain types of data, e.g. spatial data. One good example of 3D in spatial analysis is given by Lie, Kehrer and Hauser (2009), who provide visualisations of Hurricane Isabel. 3D has also been shown to be extremely useful for medical visualisation, and there are many examples of this application. One example for many parents is a simple, everyday miracle: anyone who has known the experience of seeing their unborn child on a screen will be able to tell you of the utter joy of seeing their healthy child grow in the womb via the magic of medical imaging technology. Another example of this work has been conducted in cancer studies, where researchers have used visualisation in order to detect brain tumours (Islam and Alias, 2010). 
For me, data visualisation is all about trying to get the message of the data out to as many people as possible. Think of John Stuart Mill’s principle of utilitarianism – the greatest happiness for the greatest number of people. In data visualisation, something similar applies; we can make people happy if they can get at their data. However, for the ‘lay public’ and for business users, 3D isn’t good for business data, because people just don’t always ‘get’ it easily. Note that medical staff undertake intensive training in order to assess scans and 3D images, so this group is excluded from the current discussion, as is spatial data. Hopefully, by restricting the ‘set’ of users to business users, the argument goes from the general to the specific, where it is easier to clarify and give firmer answers to the ‘grey’ subject of data visualisation.
Data Visualisation is not about what or how you see; it’s ‘other-centric’. It’s about getting inside the head of the audience and understanding how to help them see the message best. It is often difficult to judge what business users – or people in general – will find easiest to understand. It is also difficult to ascertain what visualisations can best support a given task. Ultimately, I like to stick to the best practices in order to try and answer the data visualisation question as well as possible and to make things as clear for everyone as possible.
Part of my passion for data visualisation comes from personal experience; I was told when I was quite young that I was going blind in one eye. Fortunately, this proved not to be the case, and I can see with two eyes. When my son was born, I saw him with two eyes, and for that I am extremely grateful. Having been through the experience of learning that I might go through life with impaired vision, I have come to understand how precious our vision is, and to want to do something positive for others who have struggled with theirs. That personal experience has made me passionate about making things as clear as possible for everyone, and about making data visualisation accessible to as many people as possible.
One particularly relevant issue in data visualisation is the debate over 2D versus 3D – namely, whether to use 3D in data visualisation or not. Here, I specifically refer to the visualisation of business data, not infographics. 
On one hand, 3D can make a chart or dashboard look ‘pretty’ and interesting. In today’s world, where we are bombarded with images and advanced graphical displays, we are accustomed to expecting ‘more’ in terms of display. We do live in a 3D world, and our visual systems are tuned to perceive the shapes of a 3D environment (Ware, 2004). 
The issue comes when we try to project 3D onto a 2D surface; we are trying to represent an additional dimension on a flat plane. This is a key issue in data visualisation, since we are essentially trying to represent high-dimensional entities on a two-dimensional display, whether it is a screen or paper. 
Generally speaking, 3D graphs take longer for people to assimilate than 2D graphs, and they are more difficult to understand. Not everyone has good eyesight or good innate numerical ability, and it’s about extending the ‘reach’ of the data to as many people as possible without hindering or patronising them. Perceptually, 2D is the simplest option, and the occlusion of data points is not an issue. Business users are also often more familiar with this type of rendering, and it is the ‘lowest common denominator’ in making the data approachable to the greatest number of people. 
On the other hand, there is some evidence to suggest that 3D graphs can, on occasion, be more memorable initially, but this isn’t much use if the data wasn’t understood properly in the first place. It can also be more difficult to represent labels and textual information on a 3D graph. 
In terms of business data, however, 3D Graphs can break ‘best practice’ on a number of issues:
 – Efficiency. Reading values from a 3D graph is inefficient, since it can be difficult to compare them. “Comparison is the beating heart of analysis” (Few). In other words, we should be trying to help users to get at their data in a way that facilitates comparison. If comparison isn’t facilitated, then it can be more difficult for users to understand the message of the data quickly and easily.
 – Meaningful. A graph should require minimum explanation. If users take longer to read it, and it increases cognitive load, then it can be difficult to draw meaningful conclusions. The introduction of 3D can also mean chartjunk, which artificially crowds the ‘scene’ without adding any value, and a crowded ‘scene’ distracts rather than informs.
 – Truthful. The data can be distorted; occluded bars are just one example. If the labels are misaligned or missing, this can also make a 3D chart difficult to read.
 – Aesthetics. It can make the graph look pretty but there are other ways to do this which don’t distract or occlude.
Stephen Few has released a lot of information about 3D, and I suggest that you head over to his site and take a look. Alternatively, I can recommend his book entitled ‘Now You See It’ for a deeper read, since it describes these topics in more detail, along with beautiful illustrations that allow you to ‘see’ for yourself.
To summarise, what should people do – use 2D only? Here is the framework of a strategy towards a decision:
 – Look at the data. The data might be astrophysics data, in which each star’s type could be identified by colour and brightness as well as by location. If the data is best suited to 3D, such as spatial, astrophysics or medical data, then that’s the right thing to do. If the data is business data, where it is important to get the ‘main point’ across as clearly and simply as possible, then 2D is best, since it reduces the likelihood of misunderstandings in the audience. Remember that not everyone will be as blessed with good sight or high numerical ability as you are!
 – Look at the audience. 3D can be useful if the audience are familiar with the data. I had a look at Alastair’s 3D chart and I have to say that I am not sure what the chart is supposed to show, probably because I’m not clear on the data. I am not an expert in spatial data, so I don’t ‘get’ it. So I ask for Alastair’s understanding when I say that I don’t follow the spatial data in his blog, and I will be glad to defer to his judgement in this area (no pun intended). If you can’t assume that the viewers are familiar with the data, then it’s probably common sense to make it as simple as possible.
 – Look at the vendors. Some vendors, e.g. Tableau, do not offer 3D visualisations at all, and bravely take the ‘hit’ from customers, saying that they are sticking to best-practice visualisations and that is their final opinion on the matter. 
In terms of multi-dimensional data representation, there are methodologies for displaying business data that don’t require 3D, such as parallel co-ordinates, RadViz, lattice charts, SPLOMs (scatterplot matrices) and scattergrams. I have some examples on this blog and will produce more over time. Further, it is also possible to filter and ‘slice’ the data in order to focus it on the business question at hand, so that it is easier for business users to understand. 
I hope that SQL Server Denali Project Crescent will help business users to produce beautiful, effective and truthful representations of business data. I believe that business users will eventually start doing data visualisation ‘by default’ because it is built into the technology that they are using. Think of sparklines, which are now available in Excel 2010 – this was exciting stuff for me! Hopefully Project Crescent will go down this route towards excellent data visualisation, but I recognise it will take time.
To summarise, the way around the ‘3D or not to 3D’ question in business data is to offer such beautiful, effective, truthful visualisations of business users’ data that adding 3D wouldn’t add anything more to them. The focus here has been on business users, since that’s where my experience lies; there are plenty of good examples of 3D in spatial, astrophysics and medical imaging, but my focus is on business users. 
To conclude, my concern is that the message of the data is clearly put across to the maximum number of people – think John Stuart Mill again!

PowerPivot Denali – Upgrading from SQL Server 2008 R2 and KPIs

This blog is part of a series in which I will share my experiences in the move from PowerPivot in SQL Server 2008 R2 to SQL Server ‘Denali’.  As always, your comments are welcome! In this segment, I will explore the upgrade itself, and some new functionality in PowerPivot – the creation of KPIs.
The Upgrade
The upgrade from SQL Server 2008 R2 to PowerPivot ‘Denali’ couldn’t be easier. The upgrade was simply a matter of taking a copy of my Excel file with PowerPivot, and opening it in PowerPivot Denali. When the *.xlsx file is opened in Denali, the following prompt appears:

1. Initial Upgrade from previous PowerPivot

To upgrade, click ‘OK’. This was straightforward; there was a little bar at the bottom right-hand side of the screen. The only tiny criticism, I’d say, is that the progress of the upgrade wasn’t immediately clear to me, and I wasn’t sure whether it had upgraded correctly until I saw all of the PowerPivot buttons fully appear in the ribbon. If I could change one thing at this point, it would be to provide ‘in your face’ feedback that the upgrade was in progress, and then successful.
Once the upgrade is completed, PowerPivot fans are in for a real treat!  The interface looks crisp and there is new functionality to be explored. Next, we will look at the creation of KPIs, which is very simple.
KPIs in PowerPivot Denali
This section will focus on a very simple creation of a KPI using PowerPivot Denali.  The KPI will take the value of Order Margin Percent. The data source is the AdventureWorks Denali Data Warehouse, which can be downloaded from Codeplex here.
Essentially, the KPI takes the Order Margin and calculates its percentage of the whole Sales Amount. Here is a closer look at the actual measure:

2. Check RELATED formula with green ball
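
The screenshot shows the measure definition itself. As a rough sketch of the sort of formula involved – using hypothetical AdventureWorks-style table and column names, so the real measure may well differ – it could look something like this:

    Order Margin Percent :=
    ( SUM ( FactResellerSales[SalesAmount] ) - SUM ( FactResellerSales[TotalProductCost] ) )
        / SUM ( FactResellerSales[SalesAmount] )
    // margin (sales minus cost) expressed as a fraction of total sales,
    // so the result is a decimal such as 0.41 rather than 41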

Creating a KPI is very simple in PowerPivot Denali; there are two ways:
a. click on the Measure and select ‘Create KPI’ in the ribbon, or
b. right-click on the Measure and select ‘Create KPI’ in the pop-up menu.
Here, we will create a KPI whose business rule quite simply says:
If the Percentage is less than 41%, then the status is critical: (red)
If the Percentage is equal to or greater than 41% but less than 86%, then the status is warning: (yellow)
If the Percentage is equal to or greater than 86%, then the status is successful: (green)
This is implemented in the graphic below:

6. PowerPivot Denali Configure KPI Volume

Note that the ‘Absolute Value’ is set to 1, not 100, and the percentage thresholds are specified as decimals rather than as whole-number percentage values. Hopefully users won’t get confused; if whole-number percentages are entered instead of the decimal values, they might wonder why their KPI is not working.
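
To make that point concrete, here is a sketch of the equivalent banding logic written as a plain DAX expression against the hypothetical [Order Margin Percent] measure sketched above; in the KPI configuration screen you simply type the boundary values, but note that they are entered as 0.41 and 0.86 rather than 41 and 86. The measure name and the -1/0/1 band codes are illustrative only:

    Order Margin Status :=
    IF ( [Order Margin Percent] < 0.41, -1,       // critical (red)
        IF ( [Order Margin Percent] < 0.86, 0,    // warning (yellow)
            1 ) )                                 // successful (green)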

If we choose the red-yellow-green ‘traffic symbols’ then our report appears as follows. If it is hard to read, please do click on the image to go to my flickr blog.

7. PowerPivot Denali End Result

Creating KPIs in PowerPivot is extremely easy to do, and I achieved some impactful results in just a few steps. It didn’t require any typing, so if you are most comfortable interacting directly with the interface to produce the KPI, then this is the tool for you.

The other side of the coin is that, as readers of the blog will know, I’m not a fan of red-yellow-green, since colour-blind people have issues in distinguishing these colours. It is also possible that people with strong short-sighted prescriptions in their glasses can see a ‘rainbow’-like effect if they look at an image off-axis. This is known as chromatic aberration, and is the result of a prismatic separation of colours, which appears as fringes of strongly contrasting colours. As the individual moves their head, the prismatic effect can change, which can distort the image.
This is the basis of the Duochrome test, which uses chromatic aberration to identify short-sightedness. Most people are familiar with this: here is an example:

[Duochrome test example: the same row of letters – X F J S U O – shown once on a red background and once on a green background]

Here are some generalisations – there will always be specific cases that break the generalities! Generally speaking, very short-sighted people will see the red image more clearly, and if the eye is corrected properly, then both lines appear equally sharp. In short-sighted people, the axial length of the eye can be longer, which means that the light does not focus on the retina; thus short-sighted people can be more affected by the focal length of the blue light. 
Hence the red-green debate has some basis in the way in which our eyes work. I understand that PowerPivot KPIs are still at an ‘early visibility’ stage, but I have my fingers crossed that the KPIs will be able to be amended. Here is another version that I could achieve with the existing functionality. If it is difficult to see, please click on the image to go to my flickr account:

8. PowerPivot Denali End Result

In this example, I have tried to go with the ‘longer length equals higher value’ approach, and not used any colour to distinguish the KPI statuses. Ideally, I would like to make these icons go ‘left to right’ in order to facilitate comparison between the Years or Row Labels. I would also like to be able to choose red-blue colours to distinguish between statuses properly. Let’s see what happens!

In my next post, I will be covering more new PowerPivot features in Denali. In the meantime, I look forward to your comments.

Dashboard Design using Microsoft Connect item data for SQL Server Denali

I am presenting at the SQLPass ’24Hop’ ‘Women in Technology‘ event on March 15th 2011. The topic is Dashboard Design and Practice using SSRS, and this blog is focused on a small part of the overall SQLPass presentation. Here, I will talk about some of the design around a dashboard which displays information about the Microsoft Connect cases focused on SQL Server Denali. Before I dive in, this dashboard was produced as a team effort between myself, who did the data visualisation, and Nic Cain and Aaron Nelson, who bravely got me the data, sanitised it, and served it up for consumption by the dashboard, and Rob Farley, who helped put us in touch with one another. So I wanted to say ‘Thank you’ to the guys for their help, and if you like it, then please tweet them to say ‘Thank You’ too 🙂 You’ll find them at Aaron Nelson (Twitter), Nic Cain (Twitter) and Rob Farley (Twitter). 


Before we begin, here is the dashboard:


Connect_Items_Dashboard

Well, what is a dashboard? At first, it simply looks like a set of reports nailed together on a page. However, this misses an important point about dashboards, which is that they give the viewer something ‘over and above’ what the individual reports give to the data consumer. A dashboard can mean different things to different people. There are a number of different types of dashboard, which are listed here:



  • Strategic Dashboard – an overview of how well the business is performing.
  • Faceted Analytical Display – a multi-chart analytical display (Stephen Few’s terminology). This will be discussed in more depth next.
  • Monitoring Dashboard – this displays reactive information for review only; this data is often short-term, perhaps a day old or less.


Each dashboard type has got the following elements in common:

  • Dashboards are intended to provide ‘actionability’ in addition to insight; they help the data consumer to gain insight into the presented data and to act upon it.  
  • The reports on the page support a ‘theme’, which is the fundamental business question that is answered by the dashboard.  In other words, what is it that the business needs to know, and what is it that they need to act upon? 
  • Further, the dashboard should rest on a fundamental data model, with data that is common to all of the reports; the reports should not be completely disparate. If they are, then the data’s message may become diluted as distractions are added.  
In order to explore the idea of the Faceted Analytical Display, I have used data from Microsoft Connect items, which are focused on SQL Server Denali. This dashboard shows us different perspectives on the numbers, types and statuses of Connect items opened for SQL Server Denali.  In order to understand more, it is possible to select relevant years on the right hand side, in order to show how the data has changed over time.  If you click on the image below, it will take you to the Tableau Public website so that you can have a play for yourself!

Thus, this dashboard type is, in Stephen Few’s terminology, a “faceted analytical display”. Few defines this as a set of interactive charts (primarily graphs and tables) that simultaneously reside on a single screen, each of which presents a somewhat different view of a common dataset, and which is used to analyse that information. I recommend that you head over to his site to read more about the definitional issues around dashboards, along with practical advice regarding their construction. 

This dashboard isn’t a straightforward ‘Monitoring’ dashboard, because it does allow some analysis. It is also possible to ‘brush’ the data, which means that it is possible to highlight some bars and dashboard elements at the expense of other elements.  There are other considerations in the creation of the dashboard:

Colour – I used a colour-blind-friendly palette, so there are no reds or greens. Orange and blue are ‘safe’, perceptually distinct colours. At the foot of the dashboard, the same colours were assigned to Connect call status. So, ‘Fixed’ has the same colour for both ‘Closed’ and ‘Resolved’ Connect calls, and the same applies to the other status types.

Bar charts – for representing quantity, for ease of reading left-to-right, and for facilitating comparison within dashboard elements. 

Continuous data – the number of Connect items opened at any point is given as a continuous line chart. This line chart is interesting, since it shows that the number of Connect items has increased dramatically since the start of 2011. It’s great that everyone is getting involved by raising Connect items!

I will be interested in your feedback; please leave a comment below!
Jen x








Project Crescent in Denali: BISM Summary

There has been a lot of buzz from SQLPass and in the SQL Server community about the Business Intelligence Semantic Model (BISM), which will be used to ‘power’ access to the data for the Microsoft Business Intelligence applications such as Excel, Reporting Services (SSRS) and Sharepoint. It is also intended that Project Crescent, the new self-service ad-hoc reporting tool available in SQL Server Denali, will be powered by the BISM.

Following on from some recent blogs, I was pleased to receive some direct questions from some business-oriented people, who wanted to know more about the ‘how’ of the BISM. It’s clear to me that business users are interested in how it will impact them.  The focus of this blog is to take the information from people like Chris Webb, Teo Lachev, Marco Russo and TK Anand, who have written clear and trusted accounts of the SQL Server Denali information thus far, and use it as a foundation to answer the questions from business-oriented users that I’ve received so far. So, to business!


How can I access the BISM?

The Data Model Layer – this is what users connect to. This is underpinned by two layers:

The Business Logic Layer – encapsulates the business rules, which is supported by:

The Data Access Layer – the point at which the data is integrated from various sources.

TK Anand has produced a nice diagram of the inter-relationships, and you can head over to his site to have a look.



How do I create a Business Intelligence Semantic Model?

This is done via a development environment which is essentially a new version of BIDS, called Project Juneau. 

It is also possible to produce a BISM model using Excel PowerPivot, which will help you to construct the relationships and elements contained in the model, such as the business calculations. This is done using Data Analysis Expressions (DAX), which help you to build everything from simple calculations through to more complex business calculations such as Pareto computations, ranking, and time-intelligence calculations. If you would like to know DAX in depth, then I suggest that you have a look at the book entitled Microsoft PowerPivot for Excel 2010 by Marco Russo and Alberto Ferrari. This book is accessible in its explanations of the DAX constructions. Thank you to Sean Boon for his commentary on the involvement of PowerPivot in creating BISM models.
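
As a flavour of the kind of time-intelligence calculation DAX makes possible, here is a minimal year-to-date sketch; it assumes a model containing a date table and uses hypothetical AdventureWorks-style table and column names, so treat it as an illustration rather than a recipe:

    Sales Amount YTD :=
    TOTALYTD ( SUM ( FactResellerSales[SalesAmount] ), DimDate[FullDateAlternateKey] )
    // accumulates SalesAmount from the start of the year up to the latest date
    // in the current filter context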


How up-to-date is the data? Is it cached or accessed in real-time?

The Data Model Layer, accessed as the fundamental part of the BISM, can be cached or accessed in real-time. The main take away point is as follows:

Cached method: the upshot is that it is very, very fast to access the cached data. At the SQLPASS event, the demo showed instant querying of a 2 billion row fact table on a reasonable server. Specifically, the speed comes from using the Vertipaq store to hold the data ‘in memory’. 

Real-time method: the queries go straight through the Business Logic Layer to fetch the data for the data consumer. 

A potential downside of the cached method is that the data needs to be loaded into the Vertipaq ‘in memory’ store before it can be accessed. It’s not clear how long this will take, so it sounds like a ‘how long is a piece of string?’ question; in other words, it depends on your data, I suppose. Other technologies, like Tableau, also use in-memory data stores and data extracts. For example, Tableau offers you more calculations, such as CountD, if you use its data extracts instead of touching the source systems, thereby encouraging you to use its own data stores. In Denali, I will be interested to see whether there are differences in the calculations offered by the cached and real-time methods. 

To summarise, a careful analysis of the requirements will help to determine the method that your business needs. In case you need more technical detail: this BISM in-memory mode is a version of SQL Server Analysis Services. If you require more details, I would head over to Chris Webb’s site.


How can I access the BISM without Sharepoint?


In SQL Server Denali, it will be possible to install a standalone instance of the in-memory BISM mode. Essentially, this is a version of Analysis Services which does not need Sharepoint. Until more details are clarified, it isn’t possible to say for certain how this version differs from the Sharepoint-specific version. No doubt that will become clearer. 

As an aside, I personally love Sharepoint and I think that users can get a great deal from it generally, and not just in the Business Intelligence sphere. I would want to include Sharepoint implementations as far as possible in any case.


What will BISM give me?


Project Crescent: The big plus is Project Crescent, which is the new ad-hoc data visualisation tool; it is planned that Crescent will only visualise data via the BISM. Although you don’t need Sharepoint to have a BISM, you do need it if you want to use Project Crescent. 

Hitting the low and high notes: If you’ve ever had to produce very detailed, granular reports from a cube, then you will know that these can take time to render. The BISM will be able to serve up the detail-level data as well as the aggregated data, thereby hitting both notes nicely!

Role-based security: this will be available, and it will be possible to secure tables, rows or columns. As an aside, it will be important to plan out the roles and security so that they map to the business requirements around who can see the data.



What will BISM not give me?

As I understand it, it will not support very advanced multi-dimensional calculations in Denali since it is not as multidimensional as its more mature Analysis Services sibling, the Unified Dimensional Model (UDM). Like most things, if it is simpler to use, it won’t be as advanced as more complex facilities. This can be an advantage since it will be easier for many relational-oriented people to understand and access, especially for straightforward quick reports.

I hope that helps to answer the various questions I have received; if not, please don’t hesitate to get in touch again!

Project Crescent in Denali: A Kuhnian paradigm shift for business users?

What is Project Crescent? The Microsoft SQL Server team blog describes Project Crescent as a ‘stunning new data visualisation experience’ aimed at business users, by leveraging the ‘self-service business intelligence’ features available in PowerPivot. The idea is to allow business users to serve themselves to the data by interacting, exploring and having fun with it. The concept at the heart of Project Crescent is that “Data is where the business lives!” (Ted Kummert), so business users have access to the data directly.  

For many users, this new methodology of data visualisation could be a fundamental change in their way of looking at data, a real Kuhnian paradigm shift; instead of relying on basic reports, they gain a self-service way of understanding their data without a reliance on IT, and without invoking a waterfall methodology to get the data that they require in order to make strategic decisions.

What does this data visualisation actually mean for business users, however? Haven’t business users already got their data, in the form of Excel, Reporting Services, and other front-end reporting tools? The answer to this question really depends on the particular reporting and data analysis process deployed by the business users. Let’s use an analogy to explain this. In the ‘Discworld’ series of books by Terry Pratchett, one book called ‘Mort’ contains a joke about the creation of the Discworld being similar to creating a pizza. In other words, the Creator only intended to create night and day, but got carried away by adding in the sea, birds, animals and so on; thus, the final outcome was far beyond the initial plan. The book continues that the process was similar to making a pizza, whereby the initial outcome was only intended to be ‘cheese and tomato’ but the creator ends up impulsively adding in all sorts of additional toppings. Thus, the final result is something over and above the original intention. Similarly, reporting and data analysis can be analogous to this process, whereby the original planned outcome is surpassed by the addition of new findings and extrapolations that were not originally anticipated.

Put another way, there are two main ways of interacting with data via reporting; one is structured reporting, and the other is unstructured data analysis. In the first ‘structured’ route, the report is used to answer business questions such as ‘what were my sales last quarter?’ or ‘how many complaint calls did I receive?’ Here, the report takes the business user down a particular route in order to answer a specific question. This process is the most commonly used in reporting, and forms the basis of many strategic decisions. If this was Discworld, this is your base ‘cheese and tomato’ pizza!

On the other hand, unstructured data analysis allows the business user to take a look and explore the data without a preconceived business question in their heads. This allows the data to tell its own story, using empirical evidence based on the data, rather than using pre-conceived ideas to generate the data.  In our ‘Discworld’ analogy, this would be the final ‘toppings and all’ pizza, that contained so much more than the original intention.

So, Project Crescent is great news for business users for a number of reasons:

 – users will be able to use ‘self-service’ to create their own reports, with no reliance on IT staff
 – users will be able to do ad-hoc analysis on their data without being taken down a particular road by a structured report
 – the traditional ‘waterfall’ methodology of producing reports can be more easily replaced with an agile business intelligence methodology, since prototypes can be built quickly and then revised if they do not answer the business question.

At the time of writing, it is understood that the data visualisation aspect of Project Crescent will involve advanced charting, grids and tables. Users will be able to ‘mash up’ their data in order to visualise the patterns and outliers that are hidden in it. Although it is difficult to quantify for a business case, it is interesting to note the contributions that visualisation has made to understanding data – or even occluding it, in some cases. One example is the discovery of the structure of DNA: Rosalind Franklin’s X-ray photographs of DNA pointed towards the double helix, which Crick and Watson went on to identify and describe. This finding has contributed enormously to our understanding of science. On the other hand, Edward Tufte has proposed poor data visualisation as a contributor to the decision-making processes that led to the Challenger disaster.

So far, it sounds like a complete ‘win’ for business users. However, it may be a Kuhnian ‘paradigm shift’ in a negative way for some users, in particular for those who rely on intuition rather than empirical, data-aware attitudes to make strategic decisions. In other words, now that the ‘self-service’ Business Intelligence facilities of PowerPivot and Project Crescent are available, business users may find that they need to become more data-oriented when making assertions about the business. This ‘data-focused’ attitude will be more difficult for business users who use ‘gut feel’, or intuition, to make their assertions about the business. This is particularly the case where business users have been with a company for a long time, and have a great deal of domain knowledge. 

It is also important to understand that the ‘base’ reporting function is still crucial, and no business can function without basic reporting functionality. Thus, Reporting Services, whether facilitated through Sharepoint or ‘native’, along with other reporting tools, will still be an essential part of the business intelligence owned by enterprises. If this were Discworld, this would be our ‘cheese and tomato’ pizza. Put another way, there would be no pizza if it weren’t for the base!

Terry Pratchett commented a while ago that ‘he would be more enthusiastic about encouraging thinking outside the box when there’s evidence of any thinking going on inside it’. This is very true of business intelligence systems, as well as the processes of creative writing. The underlying data needs to be robust, integrated and correct. If not, then it will be more difficult for business users to make use of their data, regardless of the reporting tool that is being used. In other words, the thinking ‘inside’ the box needs to be put in place before the ‘out of the box’ data analysis and visualisation can take place.

Project Crescent will be released as a part of SQL Server Denali, and will be mobilised by Sharepoint and the Business Intelligence Semantic Model (BISM). A final release date for SQL Server Denali is still being negotiated, but I hope to see it in 2011. To summarise, Project Crescent offers a new way of visualising and interacting with data, with an emphasis on self-service – as long as there is thinking inside the box, of course, regardless of whether you live in Discworld or not!