What’s next after Azure DataMarket?

Microsoft are retiring Azure DataMarket due to a lack of sustained customer interest. Why is that, and where do we go from here?

High Hopes



I hoped the Azure DataMarket might become a cloud-based Master Data Solution for people building data warehouses. So, for example, people would download a Geography dimension, or a Date Dimension, which was pretty robust and clean. Users could then change it to meet their needs, and it would form a ‘Master’ within their organisation. Your single version of the truth.



If your data is in silos, so is your analytics

Organisations don’t seem to consider Master Data Management very much. What is Master Data Management, anyway? Master Data Management (MDM) provides a trusted view of critical business entities (customers, suppliers, partners, products, materials, accounts and so on) whose data is stored, and probably duplicated, in siloed applications.
MDM solutions provide a simple and trusted view of your data, so that you can think horizontally about it, across your business. This is in contrast with thinking vertically about your data, in silos which usually match lines of business (LOBs). Thinking both horizontally and vertically helps achieve customer-centric objectives and business results.
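To make the idea of a single master record a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the two silos (crm_records and billing_records), their fields, and the rule of matching on a normalised email address are all made up, and real MDM tools use far richer matching and survivorship rules.

```python
# Hypothetical records for the same customers, held in two siloed applications.
crm_records = [
    {"email": "Ann@Example.com", "name": "Ann Smith", "phone": None},
    {"email": "bob@example.com", "name": "Bob Jones", "phone": "0131 555 0100"},
]
billing_records = [
    {"email": "ann@example.com ", "name": "A. Smith", "phone": "020 7946 0000"},
]


def build_master(*silos):
    """Merge records from each silo into one master record per customer.

    Records are matched on a normalised email address. For each field the
    first non-empty value seen wins, so earlier silos take priority.
    """
    master = {}
    for silo in silos:
        for record in silo:
            key = record["email"].strip().lower()
            merged = master.setdefault(key, {"email": key})
            for field, value in record.items():
                if field != "email" and value and not merged.get(field):
                    merged[field] = value
    return master


master = build_master(crm_records, billing_records)
```

Here Ann appears in both silos, but the master holds her only once, with her name taken from the first silo and her phone number filled in from the second.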

Wrong Problem, Wrong Audience



Why did Azure DataMarket fail? It seemed to be aimed primarily at IT people, not business people. Now, IT folks are usually pretty good at searching. A contact of mine bemoaned a Kibana implementation I’d done for his organisation, stating that I’d taken grep away from him: he’d worked so hard to teach everyone grep for the last two years, and now it was all visual fluffy stuff. (Suck it up, princess. Kibana rocks and everyone is using it.)

Business people don’t rock grep, regex or any other grep-like tool. They want easy-to-use, easy-to-find solutions.

A Solution Looking for a Problem

Let’s hope Microsoft will do a DataMarket again, this time aimed at the right people: business people who need to find things quickly. I don’t want something that is a solution looking for a problem.

The right problem: make search easy for the folks who need it most. Not for the IT folks, but for the less well-defined Information Worker who actually struggles with data. Let them access it very simply, and soon you’ll have Azure data being mashed with other data sources, which is a step in the journey of getting users familiar with Azure.

And if we can get folks to think about proper Master Data Management based in the cloud, that’s easy to navigate, then that’s got to be a good thing.

Importing Google Spreadsheets into Windows Azure Data Explorer

Hi! This blog will take you through the steps of importing a Google spreadsheet into Microsoft Azure Data Explorer. You could then play with this data by using data from the Windows Data Market. I think I love the Data Explorer so much because it allows a nice, easy format for mixing up data from different sources. There are two steps: ensuring that the Google spreadsheet is published, and then importing it into Data Explorer.

The Google spreadsheet came from the fantastic Guardian Datablog, and it focuses on the New Year Honours list for 2012. In case you’re not British or from the Commonwealth, and wondering what I’m blathering on about, the New Year Honours list is a quaint British tradition which recognises the outstanding achievements of people who serve their communities. The original Guardian commentary can be found here.

The Google spreadsheet obviously belongs to them, so I needed to take a copy of it, and publish it to my own Google account. To do this, you click on ‘File’ and then ‘Publish to the web’. It is very straightforward to do this, but if you need an image, click here.

You then need to make sure that you publish the spreadsheet in CSV format. This is quicker and easier for importing. You can see an example of this below, or if you need the original image, you can find it here:

[Image: Google ‘Publish to the web’]

The other item to note is that you should just select one sheet, and not ‘All Sheets’. In doing so, you’re making the data easier to import. Here, the sheet is called ‘Full List’.

Once you’ve selected the sheet, you should copy the link that appears in the box. You’ll need this in order to import the data. I didn’t import this file as ‘web content’ – instead, I did ‘Import File’ and then copied the link into the ‘Open’ Dialog box. This imported the file as text. You then get the following options:

[Image: ‘Open as CSV’ options]
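If you’d like to peek at the published file outside Data Explorer first, something like the following Python sketch would do it. This is not what Data Explorer does internally; it is just a quick way to pull the published link down as text and look at the first few rows. The function names are my own, and the url argument is whatever link Google gives you in the ‘Publish to the web’ box.

```python
import csv
import io
import urllib.request


def fetch_published_csv(url, timeout=30):
    """Download a published sheet and return its contents as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def preview(text, limit=5):
    """Return the first few parsed rows so the layout can be checked by eye."""
    reader = csv.reader(io.StringIO(text))
    return [row for _, row in zip(range(limit), reader)]
```

You would call preview(fetch_published_csv(link)) with your own published link and eyeball the result before importing.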

Upon importing the data as a table, you might find that you get the following error message:

‘The CSV input has rows with varying numbers of columns, and the first row does not have enough columns to specify the input width for all rows. Specify a value for the ColumnCount option to prescribe the number of columns to include in the output.’ Here is an example:

[Image: ‘Open as CSV’ error]
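The same kind of problem is easy to reproduce outside Data Explorer. The Python sketch below counts how many rows have each number of columns, and then pads or truncates every row to a fixed width. That is similar in spirit to the ColumnCount option mentioned in the error message, but it is only my approximation, not Data Explorer’s implementation. The sample text and the function names are made up for illustration.

```python
import csv
import io
from collections import Counter


def column_count_report(text):
    """Count how many rows have each number of columns."""
    reader = csv.reader(io.StringIO(text))
    return Counter(len(row) for row in reader)


def read_with_column_count(text, column_count):
    """Parse CSV text, padding or truncating every row to column_count cells."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        padded = row[:column_count] + [""] * (column_count - len(row))
        rows.append(padded)
    return rows


# Made-up sample: a one-cell heading line above a three-column table.
sample = "Honours list 2012\nName,Title,For Services to\nA. Person,MBE,Charity\n"
report = column_count_report(sample)
rows = read_with_column_count(sample, 3)
```

If the report shows more than one row width, as it does for this sample, then a narrow first row is the likely culprit.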

If this is the case, then it is perhaps easier to import the data as text, so that you can see more clearly whether there are any issues. To do this, just open the ‘Text’ box at the top left hand side, or right-click and select ‘Open as Text’. Here is an example:

[Image: ‘Open as Text’]

Once you’ve converted it to text, the Data Explorer screen will appear as follows. You can click on the image or click here for a larger image:

[Image: result when opened as text]

Now the state of the data is a bit clearer, since we can see the text. The next step is to try to re-import it as a table. You can do this by clicking ‘Table’ at the top left hand side. You’ll now get the following screen:

[Image: comma delimiter settings and skip first line]
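For comparison, here is roughly what ‘comma delimiter, skip first line’ looks like as a Python sketch. The read_table function and its parameters are my own names, not Data Explorer settings, and the sample text in the usage note is made up.

```python
import csv
import io


def read_table(text, delimiter=",", skip_lines=1):
    """Parse delimited text into dict rows, ignoring leading non-data lines."""
    buffer = io.StringIO(text, newline="")
    for _ in range(skip_lines):
        buffer.readline()  # discard a heading line that is not part of the table
    # The first remaining line is treated as the column headers.
    return list(csv.DictReader(buffer, delimiter=delimiter))
```

For example, read_table("Honours list 2012\nName,Title\nA. Person,MBE\n") skips the heading line and returns a single row with ‘Name’ and ‘Title’ keys.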

And another thing…

If you’re sharp-eyed, you’ll have noticed an anomaly on line 10. Why does Herbert Douglas have the title ‘Charity’? Shouldn’t this be in the ‘For Services to’ column? If you look back at the first picture of this blog, you’ll see that the same thing happens in the Google spreadsheet. Since the import matches the source, this isn’t an issue with the data import, but with the raw data. You can see a snapshot of the raw data below and if you need a larger image, click here.

[Image: Google ‘Publish to the web’, raw data]
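Spotting this sort of thing by eye only works for small files. A simple check like the Python sketch below can flag values that don’t belong in a column. The column names, the list of allowed titles and the second sample row are all assumptions for illustration; only the Herbert Douglas row reflects the anomaly described above.

```python
def find_unexpected_values(rows, column, allowed):
    """Return (row_number, value) pairs whose value is not in the allowed set."""
    return [
        (number, row.get(column, ""))
        for number, row in enumerate(rows, start=1)
        if row.get(column, "") not in allowed
    ]


# Hypothetical set of valid titles; extend it to suit the real list.
allowed_titles = {"MBE", "OBE", "CBE"}

sample_rows = [
    {"Name": "Example Person", "Title": "MBE"},
    {"Name": "Herbert Douglas", "Title": "Charity"},
]
suspects = find_unexpected_values(sample_rows, "Title", allowed_titles)
```

Running the same check against both the source spreadsheet and the imported table is one way to confirm whether an oddity came from the import or was already in the raw data.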

Don’t be fooled by the nice clean appearance of the Data Explorer – there are lots of customisation and nice things that can be done with the Data Explorer, so let’s look forward to more fun in future blogs!

Don’t you just love data? I do! I hope that this helps.

Windows Azure Marketplace – what data sources would you like to see?

During my presentations at SQLBits, SQLRelay and other UK User Group meetings, I have been dismayed by the lack of awareness of the Windows Azure Marketplace. This blog aims to explore some of the reasons that this may be happening, and I’d also like to canvass you, dear reader, so you can highlight the data sources that you would like to have in the Datamarket.
First of all, the Windows Azure Datamarket is not to be confused with the similar-sounding Datamarket, which is a company based in Iceland. The Windows Azure Datamarket is a broad-reaching collection of subscription-based data services, including applications and a variety of data for consumers and businesses to utilise. It is available in 26 countries, as at the time of writing in October 2011. It is a marketplace in the sense that it is possible to purchase and sell data and applications. The types of data available include financial, property, geographical and retail data, and even fun sports data. The data from the Windows Azure Marketplace can be consumed by Excel, Tableau and Visual Studio.
One intention of the Windows Azure Marketplace is that it will support business analysts everywhere, in their quest for clean, up-to-date data. I believe it is potentially a very powerful source of data for enterprises. For example, by provisioning clean, “looked after”, up-to-date datasets, it can reduce the amount of effort in looking after external data. In other words, companies who already ‘clean up’ external data sets might look to the Windows Azure Marketplace in order to see if there are existing datasets that could be rented. It’s the old problem of ‘outsource or internal spend’ – but at least it is good to have options to explore.
So, given the potential of the Windows Azure Marketplace as a data store, why the lack of awareness or uptake? On my recent travels to various User Groups, SQLBits and so on, hardly anybody had heard of it, never mind actually used it in production. I am guessing that one reason for this is that UK-focused data sources aren’t plentiful. My research showed that there were a number of UK data sources available. These included:
In other words, not very many sources! My search was hampered by the fact that the search string must contain at least three characters. Therefore, if you are searching for ‘UK’ then you are stuffed! I am guessing that the uptake isn’t very strong because the UK-focused data still needs to grow, and I expect that this will happen over time. Since there is an Excel add-in for the Marketplace, the route to uptake of this service is clear. It will take time, but it is potentially a very powerful tool for analysts and researchers.
Hence this blog: I am wondering what UK data sources you would like to see? Here is my list of free data sources that I’d love to see on the Marketplace as a one-stop-shop for data requirements:
The Guardian Datastore – basically anything that they produce. Love it!
UK Census data – since the next Census is out soon in the UK, it would be particularly relevant to have this information
The Data Archive – Social Sciences and Humanities data for the UK. Not as esoteric as they might sound since they also discuss the future of data sources. This is a reflective data store, and I’d recommend that you take a look at it.
Health and Safety Executive Data – Risk Control, Public health and comparison with other European countries
Heidi – I have never been able to access this, but it is available to Education planners. 
The Treasury also offer UK data on finance and key financial indicators
The Bank of England offers a wealth of financial data, focused on the UK
Office for National Statistics – data on agriculture, children, economy, government, travel… you name it!
If you can think of any other data sources you would like to see on the Windows Azure Datamarket, then please leave a comment. I’d love to hear from you and you’d also satisfy my never-ending thirst for more data sources!