Project Crescent in Denali: BISM Summary

There has been a lot of buzz from SQLPASS and in the SQL Server community about the Business Intelligence Semantic Model (BISM), which will be used to ‘power’ access to data for the Microsoft Business Intelligence applications such as Excel, Reporting Services (SSRS) and SharePoint. It is also intended that Project Crescent, the new self-service ad-hoc reporting tool in SQL Server Denali, will be powered by the BISM.

Following on from some recent blog posts, I was pleased to receive direct questions from business-oriented people who wanted to know more about the ‘how’ of the BISM. It’s clear to me that business users are interested in how it will impact them. The focus of this blog is to take the information from people like Chris Webb, Teo Lachev, Marco Russo and TK Anand, who have written clear and trusted accounts of the SQL Server Denali information so far, and use it as a foundation to answer the questions I’ve received from business-oriented users. So, to business!


How can I access the BISM?

The Data Model Layer – this is what users connect to. This is underpinned by two layers:

The Business Logic Layer – encapsulates the business rules, which is supported by:

The Data Access Layer  –  the point at which the data is integrated from various sources.

TK Anand has produced a nice diagram of the inter-relationships, and you can head over to his site to have a look.



How do I create a Business Intelligence Semantic Model?

This is done via a development environment, essentially a new version of BI Development Studio (BIDS), known as Project Juneau.

It is also possible to produce a BISM model using Excel PowerPivot, which will help you to construct the relationships and elements contained in the model, such as the business calculations. These calculations are written using Data Analysis Expressions (DAX), which supports everything from simple calculations through to more complex business calculations such as Pareto computations, ranking, and time intelligence. If you would like to know DAX in depth, then I suggest that you have a look at the book Microsoft PowerPivot for Excel 2010 by Marco Russo and Alberto Ferrari, which is accessible in its explanations of the DAX constructions. Thank you to Sean Boon for his commentary on the involvement of PowerPivot in creating BISM models.
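
To give a flavour of what DAX looks like, here is a minimal sketch of the kind of measures described above. The measure, table and column names (Sales, SalesAmount, Dates, Products) are invented purely for illustration rather than taken from any shipped sample, and the exact formulas would need adapting to your own model:

    A simple aggregation measure:
        Total Sales := SUM(Sales[SalesAmount])

    A time intelligence measure (year-to-date sales over a related Dates table):
        Sales YTD := TOTALYTD([Total Sales], Dates[Date])

    A ranking measure (rank each product by its total sales):
        Product Rank := RANKX(ALL(Products[ProductName]), [Total Sales])

Simple aggregations like the first are where most models start; the time intelligence and ranking functions are where DAX starts to earn its keep, and the Russo/Ferrari book covers these in detail.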


How up-to-date is the data? Is it cached or accessed in real-time?

The Data Model Layer, accessed as the fundamental part of the BISM, can be cached or accessed in real time. The main take-away points are as follows:

Cached method: access to the cached data is very, very fast. At the SQLPASS event, the demo showed near-instant querying of a two-billion-row fact table on a reasonably specified server. Specifically, the speed comes from the VertiPaq store, which holds the data ‘in memory’.

Real-time method: the queries go straight through the Business Logic Layer to retrieve the data from the underlying sources for the person navigating it.

A potential downside of the cached method is that the data needs to be loaded into the VertiPaq ‘in memory’ store before it can be accessed. It’s not clear how long this will take, so it sounds like a ‘how long is a piece of string?’ question; in other words, it depends on your data, I suppose. Other technologies, like Tableau, also use in-memory data stores and data extracts. For example, Tableau offers you more calculations, such as CountD, if you use its data extracts instead of touching the source systems, thereby encouraging you to use its own data stores. In Denali, I will be interested to see whether there are differences between the calculations offered by the cached and real-time methods.

To summarise, a careful analysis of the requirements will help to determine the approach that your business needs. In case you want more technical detail, this in-memory BISM mode is a version of SQL Server Analysis Services; for more, I would head over to Chris Webb’s site.


How can I access the BISM without SharePoint?


In SQL Server Denali, it will be possible to install a standalone instance of the in-memory BISM mode. Essentially, this is a version of Analysis Services which does not need SharePoint. Until more details are clarified, it isn’t possible to say for certain how this version differs from the SharePoint-specific version. No doubt that will become clearer.

As an aside, I personally love SharePoint and I think that users can get a great deal from it generally, not just in the Business Intelligence sphere. I would want to include SharePoint implementations as far as possible in any case.


What will BISM give me?


Project Crescent: The big plus is Project Crescent, the new ad-hoc data visualisation tool, which is planned to visualise data only via the BISM. Although you don’t need SharePoint to have a BISM, you do need it if you want to use Project Crescent.

Hitting the low and high notes: if you’ve ever had to produce very detailed, granular reports from a cube, then you will know that these can take time to render. The BISM will be able to serve up the detail-level data as well as the aggregated data, thereby hitting both notes nicely!

Role-based security: this will be available, making it possible to secure tables, rows or columns. As an aside, it will be important to plan out the roles and security so that they map to business requirements around who can see the data; a rough sketch of what a row filter might look like follows below.
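
To make that a little more concrete, here is a rough sketch of what a row filter might look like, assuming the row-level security ends up being expressed as DAX boolean expressions against tables in the model; the Sales and Employees tables and their columns are invented purely for illustration:

    A static filter (members of a ‘UK Sales’ role see only UK rows):
        =Sales[Region] = "United Kingdom"

    A dynamic filter (restrict rows to the Windows login of the connected user):
        =Employees[LoginName] = USERNAME()

Either way, the security model only works if the roles themselves reflect the business’s view of who should see what, which is why the planning step above matters.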



What will BISM not give me?

As I understand it, the BISM will not support very advanced multidimensional calculations in Denali, since it is not as multidimensional as its more mature Analysis Services sibling, the Unified Dimensional Model (UDM). As with most things, if it is simpler to use, it won’t be as advanced as the more complex facilities. This can be an advantage, since it will be easier for many relationally-oriented people to understand and access, especially for straightforward quick reports.

I hope that helps to answer the various questions I have received; if not, please don’t hesitate to get in touch again!

Project Crescent in Denali: A Kuhnian paradigm shift for business users?

What is Project Crescent? The Microsoft SQL Server team blog describes Project Crescent as a ‘stunning new data visualisation experience’ aimed at business users, leveraging the ‘self-service business intelligence’ features available in PowerPivot. The idea is to allow business users to serve themselves the data by interacting with it, exploring it and having fun with it. The concept at the heart of Project Crescent is that “Data is where the business lives!” (Ted Kummert), so business users have access to the data directly.

For many users, this new approach to data visualisation could be a fundamental change in their way of looking at data, a real Kuhnian paradigm shift: instead of relying on basic reports, they gain a self-service way of understanding their data without a reliance on IT, and without invoking a waterfall methodology to get the data that they require in order to make strategic decisions.

What does this data visualisation actually mean for business users, however? Haven’t business users already got their data, in the form of Excel, Reporting Services, and other front-end reporting tools? The answer really depends on the particular reporting and data analysis process deployed by the business users. Let’s use an analogy to explain this. In the ‘Discworld’ series of books by Terry Pratchett, one book, ‘Mort’, contains a joke about the creation of the Discworld being similar to creating a pizza. The Creator only intended to create night and day, but got carried away by adding in the sea, birds, animals and so on, so the final outcome was far beyond the initial plan. The book likens this process to making a pizza: the initial intention was only ‘cheese and tomato’, but the Creator ends up impulsively adding in all sorts of additional toppings, and the final result is something over and above the original intention. Reporting and data analysis can be analogous to this process, whereby the original planned outcome is surpassed by the addition of new findings and extrapolations that were not originally anticipated.

Put another way, there are two main ways of interacting with data via reporting; one is structured reporting, and the other is unstructured data analysis. In the first ‘structured’ route, the report is used to answer business questions such as ‘what were my sales last quarter?’ or ‘how many complaint calls did I receive?’ Here, the report takes the business user down a particular route in order to answer a specific question. This is the most common reporting process, and forms the basis of many strategic decisions. If this were Discworld, this would be your base ‘cheese and tomato’ pizza!

On the other hand, unstructured data analysis allows the business user to take a look and explore the data without a preconceived business question in their heads. This allows the data to tell its own story, using empirical evidence based on the data, rather than using pre-conceived ideas to generate the data.  In our ‘Discworld’ analogy, this would be the final ‘toppings and all’ pizza, that contained so much more than the original intention.

So, Project Crescent is great news for business users for a number of reasons:

 – users will be able to use ‘self-service’ to create their own reports, with no reliance on IT staff
 – users will be able to do ad-hoc analysis on their data without being taken down a particular road by a structured report
 – the traditional ‘waterfall’ methodology of producing reports can be more easily replaced with an agile business intelligence methodology, since prototypes can be built quickly and then revised if they do not answer the business question.

At the time of writing, it is understood that the data visualisation aspect of Project Crescent will involve advanced charting, grids and tables. Users will be able to ‘mash up’ their data in order to visualise the patterns and outliers hidden in the data. Although it is difficult to quantify for a business case, it is interesting to note the contributions that visualisation has made to understanding data – or even to occluding it, in some cases. One example is in the discovery of the structure of DNA: Rosalind Franklin’s photographs revealed the double helix, which Crick and Watson went on to examine and interpret, a finding that has contributed enormously to our understanding of science. On the other hand, Edward Tufte has proposed poor data visualisation as a contributor to the decision-making processes that led to the Challenger disaster.

So far, it sounds like a complete ‘win’ for the business users. However, it may be a Kuhnian ‘paradigm shift’ in a negative way for some users, in particular for those people who rely on intuition rather than empirical, data-aware attitudes to make strategy decisions. In other words, now that the ‘self-service’ Business Intelligence facilities of PowerPivot and Project Crescent are available, business users may find that they need to become more data-oriented when making assertions about the business. This ‘data-focused’ attitude will be more difficult for business users who use a ‘gut feel’, or intuition, to make their assertions about the business. This is particularly the case where business users have been with a company for a long time, and have a great deal of domain knowledge. 

It is also important to understand that the ‘base’ reporting function is still crucial, and no business can function without basic reporting functionality. Thus, Reporting Services, whether facilitated through SharePoint or ‘native’, along with other reporting tools, will still be an essential part of the business intelligence owned by enterprises. If this were Discworld, this would be our ‘cheese and tomato’ pizza. Put another way, there would be no pizza if it weren’t for the base!

Terry Pratchett commented a while ago that he ‘would be more enthusiastic about encouraging thinking outside the box when there’s evidence of any thinking going on inside it’. This is very true of business intelligence systems, as well as of the creative writing process. The underlying data needs to be robust, integrated and correct. If it isn’t, then it will be more difficult for business users to make use of their data, regardless of the reporting tool being used. In other words, the thinking ‘inside’ the box needs to be put in place before the ‘out of the box’ data analysis and visualisation can take place.

Project Crescent will be released as a part of SQL Server Denali, and will be mobilised by SharePoint and the Business Intelligence Semantic Model (BISM). A final release date for SQL Server Denali has not yet been confirmed, but I hope to see it in 2011. To summarise, Project Crescent offers a new way of visualising and interacting with data, with an emphasis on self-service – as long as there is thinking inside the box, of course, regardless of whether you live in Discworld or not!

Project Crescent: When is it best applied?

From what I’ve read and understood, Project Crescent is best applied to ad-hoc, undirected analysis work rather than to directed reports that display current operational data activity. This blog aims to understand more about Project Crescent, and when it is best applied to business questions.

From the front-end reporting perspective, Project Crescent is a major new Business Intelligence addition to Microsoft SQL Server Denali. It is primarily designed to help business users across an organisation to ‘storyboard’ their data by facilitating unstructured analyses, and to share it in the Microsoft Office applications that we are familiar with already. Project Crescent has its genesis in the SSRS team, and is focused on allowing business users to analyse, mine and manipulate their data ‘on the fly’. The solution will be browser-based, using VertiPaq and Silverlight to produce rapid BI results by allowing users to interact with their data. Further, this analysis will be powered by the Business Intelligence Semantic Model (BISM), and will only be available in SharePoint mode. These criteria would need to be met in order to use Project Crescent, so this may not work in every environment.


How does this perspective differ from reporting? Well, standard reporting takes business users down a particular structured route; for example, the parameters are pre-prepared in the report in drop-down format, and the dataset already determines the data that can be shown. This offers limited options for business users to interact with the data. On the other hand, Project Crescent will allow unstructured analysis of the data, which does not lead business users down a particular route. Rather than producing reports, the unstructured analysis offers no real separation between ‘design’ and ‘preview’. Instead, users ‘play’ with their data as they go along, ‘thinking’ as fast as they ‘see’.

The ‘unstructured’ type of data analysis works well with most data sources. For straightforward operational needs, however, the structured reporting facilities will answer the immediate, operational business question. This is particularly the case when considering the display of data from an Operational Data Store (ODS). What does the ODS give you? The ODS basically helps support people who conduct business at the operational level, who need the information to react quickly. Essentially, it sits between the source systems and the data warehouse downstream, and the data usually has a short shelf-life. It is an optional aspect of the business intelligence architecture, but worth considering since many organisations have at least one ODS. There are many different definitions for the ODS, but here are some common characteristics:

– it takes in data from disparate sources, and cleanses and integrates it
– it is designed to relieve pressure from source systems
– it stores data that isn’t yet loaded into the data warehouse, but is still required now

One example of an ODS deployment is in contact centre management. The contact centre manager will need to know if there is a high abandoned call rate at the contact centre; as a result, they may need to temporarily direct calls to another team in order to meet the unusually high demand. To do this, however, they need to be aware that the abandoned call rate has reached a particular threshold. This reporting requirement is operational rather than strategic; a strategic decision may involve using a number of metrics to identify whether some longer-term re-organisation is required, but the ODS helps the business to run optimally here and now.
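
Purely as an illustration of the figure the manager is watching – and assuming an invented Calls table with a CallOutcome column, exposed through a PowerPivot/BISM model over the operational data rather than queried directly in the ODS – the threshold calculation might be expressed as a pair of DAX measures along these lines:

    Abandoned Calls := CALCULATE(COUNTROWS(Calls), Calls[CallOutcome] = "Abandoned")

    Abandoned Call Rate := [Abandoned Calls] / COUNTROWS(Calls)

The same figure could equally be produced by a SQL query or an SSRS KPI against the ODS itself; the point is that the requirement is a simple, current, operational number rather than a piece of strategic analysis.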

This leads us to another question: how can we best display the data in the ODS? Data visualisation of ODS information has very specific requirements, since it needs to show, at a glance, how the business is operating. There are a number of different data visualisations which can assist in showing ‘at-a-glance’ information very effectively, and plenty of examples that can help answer the ‘here and now’ business question. For example, for information on bullet charts to display target information, please see an earlier blog. It is also straightforward to produce KPIs in Reporting Services; if you’re interested, please see this post by Jessica Moss. In SQL Server Denali, these visualisations can be produced using SSRS, SharePoint, Excel, and PowerPivot, not to mention a whole range of third-party applications such as Tableau. There are more details about the Microsoft-specific technologies in the SQL Server Denali roadmap, which can be found on TK Anand’s blog.

To summarise, from what I’ve read so far, Project Crescent is aimed more at ad-hoc analysis work rather than displaying current operational data activity. From the roadmap, I gather that Microsoft SQL Server Denali will have something to please the business users who require ‘unstructured’ data analyses, in addition to the high standard of functionality already available for those who require structured data analyses in response to a changing business environment. I look forward to its release!

Project Crescent in Denali: great news for business users, but what about Tableau?

I’m often asked, as I was yesterday by Rob Farley: do you think that Microsoft would consider purchasing Tableau? For those of you who don’t know, Tableau is ‘rapid-fire’ self-service business intelligence, aimed at the business user masses; simple data visualisation, made easy with connections to the Microsoft DataMarket, Teradata, SQL Server, Oracle, Excel and so on. Microsoft are already Tableau users, and cite Tableau as a success story as an Office Business Application. In turn, Tableau focused on being able to integrate with Office early on in its development.
With BI moving into the realms of self-service, Microsoft are releasing new self-service software in the next release of SQL Server, codenamed Denali. Known as Project Crescent, it is browser-based data visualisation. The keywords with Project Crescent are simplicity and self-reliance for business users: away from IT, connecting to source data via SharePoint and the BISM, which is a prepared semantic model. Project Crescent supports a variety of data visualisations, in addition to tables and grids.
To be honest, I have wondered whether Microsoft might be interested in Tableau, which is currently, at heart, an R&D organisation. However, that was until I watched Amir Netz’s video on Project Crescent on YouTube, and read Paul Turley’s blog on Microsoft’s moves into data visualisation. I watched this with mixed emotions, as a Tableau fan and a Microsoft fan; put another way, my Everett ‘many worlds’ were going to collide at some point, and this may well be it.
As someone who’s a BI obsessive, I’m delighted that Microsoft are going to please business users with this move. It’s smart, since business users tend to be the ones with the budgets and sign-off to purchase software, not the guys who write MDX in the basement. Further, since it is SharePoint-based, it fits nicely with the roadmap. I’m looking forward to using it, and will gravitate towards it when I get hold of it.
As a Tableau fan, however, I’m utterly and completely torn. Some of you may know that I run the UK Tableau User Group along with Andy Cotgreave and Flying Binary, whose enthusiasm and support for Tableau have really helped it to take off in the UK. The main reason I love Tableau is that I get great satisfaction from watching business users get a grasp of their data and take the pressure off the IT guys; and most of all, it’s fun. Now that Project Crescent is the new kid on the data visualisation block, what’s the future for Tableau? Here are some key differentiators, from my perspective:
  • Project Crescent is SharePoint- and BISM-based; Tableau can connect to pretty much anything, so if you are not lucky enough to have SharePoint, then Tableau is for you. 
  • Price – who knows?
  • Best practices in data visualisation. Tableau adheres to a ‘knowledge base’ of data visualisations to best support the data, in line with the latest science of data visualisation. Excel/SSRS let you present the data how you want. I suspect Project Crescent will give the users what they want; to present the data how they want it, not perhaps how it should be done.
  • Tableau need to connect to the BISM. Fast.

The Tableau ‘sell’ is also partly community buzz, so that needs to be encouraged and nurtured; their target audience is also the Project Crescent audience, and Tableau have something of a head start. From my perspective, I’ve met some great people through Tableau, which makes the ‘late nights/early mornings’ time spent taking care of UKTUG worthwhile. 
As for the future? I really do believe that Denali will be a huge success, and I’m behind Project Crescent since it gives business users what they want: to help themselves. In spite of this, in my opinion, the future is still bright for Tableau, who have just released Version 6; it is perhaps eclipsed by the fanfare surrounding Denali, but there was still a huge turnout at the various Tours around Europe. 
From my perspective, I hope to deliver solutions in both technologies when Project Crescent is released, using my Tableau experience to spur forward in Project Crescent, and feedback into the Tableausphere again.  Given the Silverlight development that’s gone into Project Crescent, however, I don’t think that Microsoft will be forking out for Tableau anytime soon.

Does Microsoft SQL Server Denali put us in denial about the ODS?

Microsoft have provided a first taste of the latest version of SQL Server, codenamed “Denali”. Denali focuses on mission-critical systems, scalability, multi-site clustering and high availability, with data integration and data management tools to reliably deliver robust data. Denali also offers advanced data visualisation, supported by column-based acceleration of query performance. Given these technological advancements in database management from Microsoft (as well as from other vendors), what’s the future of the ODS? 
Historically, mainframes could not offer quick reporting, and the data warehouse was subject-oriented. Data warehouses also caused issues for database management, since they were required to balance small queries with large queries involving millions of records. Thus, another reporting mechanism was required, situated between the source transactional systems and the final data warehouse. This reporting needed to be abstracted away from the silo-based source systems, so integration was required; thus, the ODS was born. 
There are different methodologies surrounding the architecture of an ODS. Inmon defines the ODS as an integrated source of data which offers an ‘as-is’ status of the business, holding only the most recent transactions. ‘Recent’ does not necessarily mean data that is only a few days old; the ODS may be feeding a summarised data warehouse, and may need a full month’s data in order to produce the monthly summary items that the warehouse requires. Kimball, on the other hand, defines the ODS as the structure that stores detailed transaction data which is not present in the data warehouse. However, as technological advances are made in database engines such as SQL Server, the transaction detail can be stored in the data warehouse itself and interrogated there by business users. This means that the place of the ODS can look uncertain.
In my experience, operational source systems are becoming more sensitive to the need for reporting. One example is the Cisco Contact Centre call management database, which is based on SQL Server and feeds into a dedicated reporting system. If operational data sources are becoming more accommodating to serving reporting architectures, and technological advancements have been made in database management, what is the future for the ODS, particularly with the advent of more streamlined ETL applications such as SSIS offering lower-latency data delivery to users?
So where does the ODS fit in? Before the detailed transactional data can be placed into the data warehouse, it needs to be rationalised and integrated. Whilst stored in the ODS, the current operational data can be cleansed, integrated from its source systems, and steps can be taken towards redundancy resolution. Further, data in the ODS can be investigated for compliance with business rules.
However, an ODS should not, as Jill Dyche puts it, ‘compost’ data; in other words, it should not simply be a store into which the data sources have been copied with no concerted effort at integrating the data. In that scenario, you could have ten distinct ‘customer’ tables, from different sources, copied directly into the data store with minimal – if any – integration taking place. This is more akin to a staging area than a dedicated ODS. This is also where the ODS starts to look like a data warehouse, since the ODS too has multiple complex data sources and is dedicated to a number of subject areas. 
Instead, the ODS should be viewed as an interim logical step towards the data warehouse. At first glance, it may seem that the only difference between the ODS and the data warehouse is that the ODS does not store history, whereas the data warehouse does store history. This is not necessarily the case. The key to the ODS is current operational data. The ODS could potentially provide rapid access to current operational data, whereas the enterprise data warehouse may not get refreshed until overnight. How much ‘history’ is stored depends on user requirements, although I suspect that most business users would say ‘we need the data right now, and we want to keep it forever’ if they were consulted! This means that data could be interrogated for analysis and reporting purposes whilst the business operations are ongoing, and before the data is transferred to the data warehouse for longer-term reporting, data mining, or perhaps archiving.
The ODS can potentially be used to farm data back to source systems. However, it is optimised by having a Master Data Management (MDM) system as a robust library of reference data. Although Master Data Management has been around for some time, it has become more relevant with its specific inclusion in SQL Server 2008 R2. MDM is concerned with data quality and consistency, and the reconciliation of any data issues. Specific MDM implementations can depend on the environment. For example, the ODS and data warehouse can be supported by a Customer Data Integration module, or CDI, which is a ‘golden’ source of the enterprise customer data. The CDI can be viewed as a library or ‘hub’ for facilitating the consolidation and provision of good-quality customer data across the enterprise, coming from various operational sources. An ODS and a CDI are complementary to one another; the CDI can have dedicated algorithms to perfect the customer data, the results of which can be served to the ODS. The CDI is not a facility for serving up reporting; that is the job of the ODS and the data warehouse. Further, the ODS should be viewed as a facility for integrating different types of current operational data from across the enterprise, whereas the CDI could be viewed as a nimble younger sibling of MDM, dedicated to customer information only.
With MDM, and potentially a CDI, in place, the ODS can offer data up more quickly, since it has a good basis from which to do so and the integration can happen faster. If the CDI and MDM are properly architected, their good-quality, integrated data can result in low-latency data being served to the ODS, accelerating the process of provisioning data to users. 
To summarise, what’s the future of the ODS? With the advent of Denali, it is hoped that an optimised user experience is available to all, from DBA right through to operational users, supported by MDM and the data warehouse, with the ODS in a clearly-defined role. The key to an ODS is integration: a well-integrated, current operational data store, where data is cleansed, governed and made compliant before it enters a data warehouse. If it isn’t integrated, its shelf-life is most likely minimal, since the business users won’t be served properly and may start exporting data into Excel for their own slicing and dicing. The ODS needs to be carefully considered, along with user requirements; if not, it is in danger of becoming a ‘political football’, with the business questions and needs ultimately going unanswered.

SSRS & Excel: Visualising Large Datasets

What does Microsoft Reporting Services give you that Excel doesn’t? Excel offers accessibility of data, and is, in my experience, the most popular BI tool that I encounter. On the other hand, Reporting Services is less accessible for people who don’t have strong technical skills, but it offers a great deal of fine-grained control over the reporting output.

Let’s take an example, created in SSRS, which shows a line graph with 500+ rows. In case you are interested, this line graph shows the results of an analysis of the breakdown of samples of olive oil, the thesis from the team at Iowa State University being that olive oil differs in its constituents depending on the geographical part of Italy where the olives come from. 

[Image: SSRS parallel co-ordinates chart]

We can see here, from this sample of nearly 5000 rows, that the olives from the South Apulia area are higher in oleic acid; the rows are distinguished by colour, so that ‘South – South Apulia’ rows, for example, are marked in brown. SSRS was just made for dealing with large data sets, and turning them into visualisations.

Here is a subset of the total data, shown in Excel 2010:

[Image: Excel parallel co-ordinates chart]

Excel 2010, in its plainest “non-PowerPivot” form, struggled to show the 500+ different lines on the line graph. We know that PowerPivot is able to deal with millions of rows of data, but it wasn’t used here; instead, I stayed with the scenario of a user who only has Excel installed.

Thus, it was more difficult to get Excel to display hundreds and hundreds of lines of data in a ‘stress test’ scenario. 

The graph looks different since it is simply a subset of the total 500+ rows of data in the original sample. 

We still get an idea of the overall pattern, but the detailed, clean look of the SSRS report is better for displaying the whole story, as well as all of the data. The finer control afforded by SSRS helped to produce a clean, visual report which displayed all of the data from the larger dataset.

But that’s OK: SSRS and PowerPivot, mobilised by SQL Server and SharePoint, are there to help users mobilise large datasets and make sense of their data. Excel, on its own, offers great accessibility; and who doesn’t love Excel?


SSRS and Excel: Business Intelligence is more than just producing reports!

One of the key issues in Business Intelligence is the display of data concisely, appropriately and correctly. In particular, Reporting Services (SSRS) and Excel, in the wrong hands, can lead to incorrect data visualisations, since it’s very easy to produce charts, graphs and so on without thinking about the data in question and the business question that you’re trying to answer. A feature of both SSRS and Excel is that they allow users a generous amount of control over the report itself. In particular, SSRS provides so many controls and options in the ‘Properties’ window of each element of the report that it might be overwhelming for novice users. However, Excel’s ease of use and SSRS’s fine-grained control over reports can lead to the production of visual displays which are not suited to the data in question, or which are, in some cases, just wrong! Thus, reporting does not always constitute business intelligence; if it does not help to move the business forward, then it can’t be said to be real business intelligence.

Here is an example from my presentation at SQLBits 7: a Reporting Services chart showing the top 5 most frequently appearing Doctor Who villains since the series’ inception in 1963:

[Image: Doctor Who villains line graph]

Those of you who have had lots of caffeine will notice that the line graph is not a suitable method of displaying this data. Line Graphs are better for displaying continuous data, not distinct categorical data. If you are a fan of the Doctor Who series, you will know that the Daleks and the Cybermen are separate types of villains. If you are a real Doctor Who obsessive, then you will be wondering whether the count of Ice Warriors excludes the number of episodes where the Ice Warriors appeared in a benevolent capacity; this graph is specifically aimed at displaying a count of appearances rather than an analysis of the degree of ‘villain-ness’.

Since the villains form distinct categories, the same data would be better placed as a bar chart in an SSRS report, which you can see below:

[Image: Doctor Who villains bar chart]

This SSRS column chart report is better because it emphasises the distinct categories being measured. Further, the ‘left to right’ reading convention is easier for us here in the West. It also facilitates cognitive integration of the chart itself, by allowing sorting and easy comparison; this isn’t so straightforward in the line graph example given earlier. So, even though line graphs and column charts are very simple displays, it is easy to use them incorrectly.

To summarise, it is very easy to produce visualisations which are not representative of the data sample. Of course, users have the option to amend the underlying data structure they are working with, so that they can obtain the particular visualisation that they would like to see. However, SSRS and Excel, in the wrong hands, can lead to reports that simply do not answer the business question, or do not help business users to drive the business forward; and that’s the heart of business intelligence.
