PASSBAC keynote: The Microsoft data story, and the next chapters

I attended the keynote at the PASS Business Analytics Conference in Chicago, April 2013. For those of you who missed it, here is some of the content of the keynote.
The takeaway point is that Business Intelligence must be simple. It is important to make it fun, and we are drowning in data. Not being able to read and understand data puts you at a disadvantage in today’s world. We have to move beyond what we now think of as Business Intelligence. We have to get inside our data.

The keynote was opened by Bill Graziano, who underlined the need amongst Business Analytics professionals for knowledge and support. PASS is helping to form a community of Data Professionals. If you’re a Business Analytics professional, you can help to shape this community and be a part of it. If this interests you, you can take a look at joining a Virtual Chapter and receive monthly webcasts, for example. This can help you to stay connected once PASSBAC is finished.
Next, we had Dell appearing.

Dell has over 15 years of IM software experience, following their purchase of Quest, for example. Dell constantly monitors its own brand in social media. For example, they have 6 years of experience in watching their brand online, and in engaging with customers from a support and brand engagement perspective. They made the following observations about the social media market:

Dell’s Observations
Data Type Proliferation
Vendor Proliferation
Data-Location Proliferation
IT and LOB challenges
Snap into existing environments

Given these observations, they then moved to address these points in the market:

Analysis of social media needed to be:
Data-Type Agnostic
Vendor Agnostic
Data-Location Agnostic
Capability needs to be at the tools layer

The takeaway point from the Dell part of the keynote is: make the hard things simple, to allow for more collaboration, exploration, analysis and communication.
The final part of the keynote was presented by Amir Netz, who is a Distinguished Technical Fellow at Microsoft. The few times I’ve been lucky enough to speak with Amir in person, I’ve found him to be a very approachable and fun guy, and this came across clearly in the keynote, which was probably the most engaging I’ve seen (and I see a lot!). Netz was accompanied by Kamal Hathi, who knows his stuff inside out and is a ‘go to’ expert for Analysis Services. I was really excited about this keynote since, whilst individually they are both excellent speakers, the idea of a joint presentation sounded fun and informative. They started off by emphasising how ‘simple’ attracts people. The strategy is to go back to Excel, thereby capturing the simplicity once again.
How can we make the spreadsheet really interesting again? Well, we can add in lots of unstructured data! To do this, we can use Hadoop, which is built around a distributed file system and is essentially a shoebox of unstructured data. A lot of data, all different kinds! Using Hadoop, you can transform the unstructured data easily. The idea is to apply structure on extraction, rather than to define a structured data model before loading. In other words, you’re not imposing a structure on the data as the ETL loads it; you’re structuring the data once it is in Hadoop.
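To make the ‘structure on extraction’ idea a little more concrete, here is a minimal sketch in plain Python. It is not Hadoop code and it is not what was shown in the keynote; the sample records, the field names and the helper functions are all made up for illustration. The point is only that the raw lines are kept exactly as they arrived, and a structure is applied at the moment they are read.

```python
import json

# Hypothetical raw records, stored exactly as they arrived.
# No schema was enforced when they were loaded.
RAW_LINES = [
    '{"artist": "Artist A", "weeks": "12"}',
    'Artist B|7',
    '{"artist": "Artist C", "weeks": 3, "label": "Some Label"}',
    'not a usable record',
]


def extract(line):
    """Apply structure at read time: return (artist, weeks), or None if unreadable."""
    line = line.strip()
    if line.startswith("{"):
        # JSON-shaped record: pick out only the fields we care about.
        record = json.loads(line)
        return record["artist"], int(record["weeks"])
    if "|" in line:
        # Pipe-delimited record: artist on the left, weeks on the right.
        artist, weeks = line.split("|", 1)
        return artist.strip(), int(weeks)
    return None


def read_with_schema(lines):
    """Return only the rows that could be given the (artist, weeks) structure."""
    return [row for row in (extract(line) for line in lines) if row is not None]


if __name__ == "__main__":
    for artist, weeks in read_with_schema(RAW_LINES):
        print(artist, weeks)
```

If a different question comes along later, a different `extract` function can be written against the same raw lines, which is the appeal of leaving the data unstructured until it is read.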

The team then did a great demo of Power View using a dataset of music and songs. We learned that Mariah Carey has had more weeks in the charts than luminaries such as Elvis, the Beatles and U2.
This showed the power of interacting with the data. In Amir’s example, the kids had fun learning about their music idols. This was easily demonstrated in the keynote audience, who were cheerfully shouting out band names. We heard all sorts of names being shouted out: Willie Nelson, Johnny Cash, Madonna and even Milli Vanilli!

Power View is all about sharing information, and having fun with the data. 
Business Intelligence is elective – nobody forces an organisation to use Business Intelligence. It is about time management, and how much time people spend on Business Intelligence.
However, making business intelligence fun can help you to get the ROI, because it means that people will use it and learn from it. Fun is important, and success is infectious.
It has to be more than fun, but it is a good starting place.  We can use it to start more investigations, and then lead to deeper questions.
The team then did a deeper exploration using Power View for sentiment analysis with Twitter data. Sentiment was used as a means of predicting outcomes. If you’d like to know more about this, I’ve written a two part MSDN article on the topic.

The most exciting part was the announcement of Codename Geoflow, which allows you to add location-sensitive content to your data. In other words, it allows you to create 3D data visualisations based on maps. You have to see it to believe it, and if you’re looking for #Geoflow information, here you are … #PASSBAC #SQLBits #SQLPass #SQLFAQ
Amir did a great demo to show the changes in the ‘music chart songs’ data over time and over place. It is a wonderful story, brought to life by #Geoflow. It also looked great on the huge 81-inch touchscreen, and it’s a great way to drive visualisations of data. At the PASS BA Conference, we will be lucky enough to have the Microsoft Experience lounge, where we can go and try all of this gadgetry out! Like Amir says, it has to be fun too.

We don’t just think about business. Business Intelligence could also be called basic intelligence, but to achieve it, we need to get inside our data and let people work with it in familiar tools. 
This is the Microsoft story, and I’m excited to see the next chapter for our business users.

Importing Google Spreadsheets into Windows Azure Data Explorer

Hi! This blog post will take you through the steps of importing a Google spreadsheet into Microsoft Azure Data Explorer. You could then play with this data alongside data from the Windows Data Market. I think I love the Data Explorer so much because it offers a nice, easy format for mixing up data from different sources. This activity takes the form of two steps: ensuring that the Google spreadsheet is published, and then importing it into Data Explorer.

The Google spreadsheet came from the fantastic Guardian Datablog, and it focuses on the New Years Honours list for 2012. In case you’re not British or from the Commonwealth, and wondering what I’m blathering on about, the New Years Honours List is a quaint British tradition which recognises the outstanding achievements of people who serve their communities. The original Guardian commentary can be found here.

The Google spreadsheet obviously belongs to them, so I needed to take a copy of it, and publish it from my own Google account. To do this, you click on ‘File’ and then ‘Publish to the web’. It is very straightforward to do this, but if you need an image, click here.

You then need to make sure that you publish the spreadsheet in CSV format. This is quicker and easier for importing. You can see an example of this below, or if you need the original image, you can find it here:

ii Google Publish to the web

The other item to note is that you should just select one sheet, and not ‘All Sheets’. In doing so, you’re making the data easier to import. Here, the sheet is called ‘Full List’.

Once you’ve selected the sheet, you should copy the link that appears in the box. You’ll need this in order to import the data. I didn’t import this file as ‘web content’; instead, I chose ‘Import File’ and then pasted the link into the ‘Open’ dialog box. This imported the file as text. You then get the following options:

a Open as CSV

Upon importing the data as a table, you might find that you get the following error message:

The CSV input has rows with varying numbers of columns, and the first row does not have enough columns to specify the input width for all rows. Specify a value for the ColumnCount option to prescribe the number of columns to include in the output. Here is an image of the error:

b Open as CSV error
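If you would like to check the file outside Data Explorer before importing it, the condition that the error message describes (rows with varying numbers of columns, and a first row that is too short) can be spotted with a few lines of Python. This is only a sketch using the standard `csv` module; the sample text and the function names are made up for illustration, and it says nothing about how Data Explorer itself parses the file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample: a one-column title line, followed by three-column rows.
SAMPLE = (
    "Honours list\n"
    "Name,Award,For services to\n"
    "A. Person,MBE,Charity\n"
)


def column_count_report(csv_text):
    """Return a Counter mapping 'number of columns' to 'number of rows with that width'."""
    reader = csv.reader(io.StringIO(csv_text))
    return Counter(len(row) for row in reader)


def widest_row(csv_text):
    """Return the largest column count found in any row (0 for empty input)."""
    counts = column_count_report(csv_text)
    return max(counts) if counts else 0


if __name__ == "__main__":
    print(dict(column_count_report(SAMPLE)))
    print("Widest row:", widest_row(SAMPLE))
```

If the report shows more than one width, the rows are ragged, and the widest row gives you a candidate value for the ColumnCount option that the error message mentions.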

If this is the case, then it is perhaps easier to import the data as text, so that you can examine it more clearly for issues. To do this, just click ‘Text’ at the top left hand side, or right-click and select ‘Open as Text’. Here is an example:

c open as Text

Once you’ve converted it to text, the Data Explorer screen will appear as follows. You can click on the image or click here for a larger image:

d result when opened to text

Now that we can see the text, it is easier to see the state of the data. The next step is to try to re-import it as a table. You can do this by clicking ‘Table’ at the top left hand side. You’ll now get the following screen:

e Comma delimiter settings and skip first line
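For comparison, the same two settings (a comma delimiter, and skipping the first line) can be sketched in Python with the standard `csv` module. This is an illustration of the idea rather than a description of what Data Explorer does internally; the function name `read_table`, its parameters and the sample text are all hypothetical.

```python
import csv
import io
import itertools


def read_table(csv_text, skip_lines=1, column_count=None):
    """Parse comma-delimited text, skip the leading records, and make every row the same width."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=",")
    # Skip the first record(s), such as a title line that is not part of the table.
    rows = list(itertools.islice(reader, skip_lines, None))
    if column_count is None:
        # Default to the widest remaining row.
        column_count = max((len(row) for row in rows), default=0)
    # Pad short rows with empty strings and cut long rows down to column_count.
    return [(row + [""] * column_count)[:column_count] for row in rows]


if __name__ == "__main__":
    text = "Title only\nName,Award,For services to\nA. Person,MBE\n"
    for row in read_table(text, skip_lines=1, column_count=3):
        print(row)
```

Note that fixing the column count cuts off any extra values in rows that are too long, so it is worth looking at the raw text first to decide whether that is what you want.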

And another thing…

If you’re sharp eyed, you’ll have noticed an anomaly on line 10. Why does Herbert Douglas have the title ‘Charity’? Shouldn’t this be in the ‘For Services to’ column? If you look back at the first picture in this blog post, you’ll see that the same thing happens in the Google spreadsheet. Since the import matches the source, this isn’t an issue with the data import, but with the raw data. You can see a snapshot of the raw data below and if you need a larger image, click here.

i Google publish to the web

Don’t be fooled by the nice clean appearance of the Data Explorer. There are lots of customisations and other nice things that can be done with it, so let’s look forward to more fun in future blogs!

Don’t you just love data? I do! I hope that this helps.