Intelligent Laziness in Analysis Services 2008: Working Smart rather than Working Harder!

Further to my earlier theme of Intelligent Laziness, here is some key advice before setting out on the journey of creating Analysis Services cubes. Hopefully this advice will help you to ‘Work Smart’ rather than ‘Work Hard’!

Take the users along with you on the journey
If the end users don’t understand the cube, then they won’t like it; end result, they won’t use the cube. You could work iteratively so that you have a ‘working’ cube for users, and a development cube, so you can comfortably develop in a way that is insulated from users.

Plan, Plan, Plan!
It is important to plan early on in the process about your key success criteria for delivery. It’s easy to be swayed by enthusiastic users to include work that isn’t essential to the project.

Choose the correct version of Analysis Services
It sounds simple, but check you have the correct version of SQL Server that you need. For example, in SQL Server 2008 enterprise edition is aimed at large cubes which require partitioning. Motto: if in doubt, choose Enterprise.

End users querying the cube
How are the end users going to query the cube? There are lots of products which can allow users to interact with the cube, for example, Reporting Services is the obvious solution for MDX-savvy users. If the users are Excel power-users, then they might find XlCubed useful addition. If the users want data visualisation par excellence, then Tableau might be worth a look.

Letting Users know what’s in the Cube
It can be difficult to know how to best document the cubes. Try and reduce the amount of time to respond to user enquiries by relying on a mapping spreadsheet, which traces the lineage of the data from source to final cube measure.

Data Sources
It’s always worth checking how complete the data sources are. The SSAS cube depends on a well-formed data warehouse, and if this isn’t in place, then it is difficult to generate a cube from a source which it wasn’t designed to handle.

Dates
Users need dates; lots of them, and lots of different types e.g. Fiscal calendars, public holiday calendars, fiscal calendars for their business customers and so on. It is essential to work out what dates they need, and, importantly, who is responsible for generating the calendars.

Calculations
The users can be involved here; ask them to give you their most popular calculations. It is useful to use a mapping spreadsheet to record this information.

Cube Hierarchies
The users can also create the hierarchy structure within the cube, by using sample data as a base. Some hierarchies are straightforward, such as geography. Product hierarchies, for example, may not be very straightforward, and it is easy to make assumptions which are not correct.

Security
It’s better to get the security resources on board earlier than later on; it is essential to know who can access what calculations, and it’s important to know where the restrictions should be placed, sometimes for legal reasons.

Thanks for reading and hopefully this will save you some time and effort!

Using and Populating Tableau for UK Geographical Mappings

Tableau can use the longitude and latitude data as a basis for geographical mapping. Currently, it isn’t very straightforward to see how Tableau takes SQL Server 2008 spatial data types, because Tableau didn’t immediately recognise these types in some data I was looking at recently. However, for many people, using latitude and longitude is a natural method of identifying location, and this blog is dedicated to explain how to start to do this in Tableau, using SQL Server 2008 without the geographical data types. I’m working on some workarounds to use the new data types because I think it would be very useful, and the Tableau mapping functionality is very simple to use, and fast to produce quick results.

Here, we use the UK postcode to get the latitude and longitude. In case you haven’t thought about it, the UK postcode is broken up into two parts: an outward code, and an inward code. The outward code is the first part of the postcode before the space. The second half, or the inward code, is made up of a series of one number and two letters, which can be any letters of the alphabet except CIKMOV in case they are confused with numbers or other letters when handwritten.

In case you are interested, you also occasionally get ‘vanity codes’ as well, so that postcodes refer to an organisation. So, SW1A 0PW (House of Lords, Palace of Westminster); SW1A 2HQ (HM Treasury HeadQuarters); W1F 9DJ (Absolute Radio, Disc Jockey) and so on.

Note that, to get effective results in your Tableau worksheet, you have to convert Longitude and Latitude to Dimensions (right-click in Tableau, choose ‘convert to dimension’) in order to get accurate results. If you don’t do this, then you’ll be seeing these items as measures. When Tableau does its ‘best guess’ at displaying the data, Tableau will probably try to average these figures. This isn’t really what you want, since you can’t be sure that an average longitude or latitude will give you what you need.

The first step is to get a hold of some geographical mapping. For the UK, I downloaded a file containing 25000 postcodes with longitude and latitude information from the following location:
http://download.geonames.org/export/zip/ You can download the file called GB.Zip, and extract it somewhere to your hard drive. You’ll then need to import the data into your database. Here’s a quick script to create a table for you, create indexes, and then import the data:

CREATE TABLE Stage_PostCodesTemp(
[Country] [VARCHAR](2) NULL,
[PostCode] [VARCHAR](5) NULL,
[City] [VARCHAR](200) NULL,
[FullStateName] [VARCHAR](50) NULL,
[AbbreviatedStateCode] [VARCHAR](2) NULL,
[County] [VARCHAR](50) NULL,
[Unused1] [VARCHAR](5) NULL,
[Unused2] [VARCHAR](1) NULL,
[Latitude] [DECIMAL](8,5) NULL,
[Longitude] [DECIMAL](8,5) NULL,
[Unused3] [VARCHAR](1) NULL
)
GO
CREATE TABLE UK_PostCodes(
[PostCodeID] INT IDENTITY(1,1),
[Country] [VARCHAR](2) NULL,
[PostCode] [VARCHAR](5) NOT NULL,
[City] [VARCHAR](200) NULL,
[FullStateName] [VARCHAR](50) NULL,
[AbbreviatedStateCode] [VARCHAR](2) NULL,
[County] [VARCHAR](50) NULL,
[Latitude] [DECIMAL](8,5) NOT NULL,
[Longitude] [DECIMAL](8,5) NOT NULL,
[UKGeogColumn1] [GEOGRAPHY] NULL,
[UKGeogColumn2] [VARCHAR](100) NULL
)

–In future blogs, we’ll need the SQL Server 2008 data types.
–Spatial data types won’t work without a primary key,
–and a spatial index.
–So we add one here, for now.

ALTER TABLE UK_PostCodes ADD
CONSTRAINT [PK_UK_PostCodes] PRIMARY KEY CLUSTERED
(
PostCodeID
) WITH FILLFACTOR = 100

— create spatial index
CREATE SPATIAL INDEX SIndx_SpatialTable_geography_col1
ON UK_PostCodes(UKGeogColumn1);

— import the data
DECLARE @populate_postcodes VARCHAR(1000)
SET @populate_postcodes= ‘BULK INSERT Stage_PostCodesTemp
FROM ”\GB.txt”
WITH (FIELDTERMINATOR=”\t”, ROWTERMINATOR = ”’+CHAR(10)+”’)’
EXEC(@populate_postcodes)
We’ll need to clean up the data so that we don’t have any duplicates. To do this, you can use the following insert command to extract the data from your ‘GB’ table, remove duplicates, and insert it into your ‘UK_PostCodes’ table:

INSERT UK_PostCodes(Country, PostCode, City, FullStateName, AbbreviatedStateCode, County,Latitude,Longitude)
SELECT DISTINCT Country, PostCode, City, FullStateName, AbbreviatedStateCode, County, Latitude, Longitude
FROM Stage_PostCodesTemp
where [Longitude] is not null

So, now we have the data in the UK Postcodes table. Now, you will need to match data in this table with your existing data; perhaps a geography dimension, or something similar?
You’ll need to work out a way of marrying your own data to the postcodes provided in this table. Note that the table contains the ‘outcodes’ of the postcodes, so these are obviously multiples. I’ll leave that over to you 🙂

Add to Technorati Favorites