Five reasons to be excited about Microsoft Data Insights Summit!


I’m delighted to be speaking at Microsoft Data Insights Summit! I’m pumped about my session, which focuses on Power BI for the CEO. I’m also super happy to be attending the Microsoft Data Insights Summit for five top reasons (and others, but five is a nice number!). I’m excited about all of the Excel, Power BI, DAX and Data Science goodies. Here are some sample session titles:

Live Data Streaming in Power BI

Data Science for Analysts

What’s new in Excel

Embed R in Power BI

Spreadsheet Management and Compliance (It is a topic that keeps me up at night!)

Book an in-person appointment with a Microsoft expert with the online Schedule Builder. Bring your hard – or easy – questions! In itself, this is a real chance to speak to Microsoft directly and get expert, in-depth help from the team who make the software that you love.

Steven Levitt of Freakonomics is speaking, and I’m delighted to be hearing him again. I heard him present recently and he was very funny whilst also being insightful. I think you’ll enjoy his session.


I’m excited that James Phillips is delivering a keynote! I have had the pleasure of meeting him a few times and I am really excited about where James and the Power BI team have taken Power BI. I’m sure that there will be good things as they steam ahead, so James’ keynote is unmissable!

Alberto Cairo is presenting a keynote! Someone who always makes me sit up a bit straighter when they tweet is Alberto Cairo, and I’m delighted he’s attending. I hope I can get to meet him in person. Whether Alberto is tweeting about data visualisation, design or the world in general, it’s always insightful. I have his latest book and I hope I can ask him to sign it.


Tons of other great speakers! Now someone I haven’t seen for ages – too long in fact – is Rob Collie. Rob is President of PowerPivotPro and you simply have to hear him speak on the topic. He’s direct in explaining how things work, and you will learn from him. I’m glad to see Marco Russo is speaking and I love his sessions. In fact, at TechEd North America, I only got to see one session because I was so busy with presenting, booth duty and so on, but I made sure it was Marco Russo and Alberto Ferrari’s session. Chris Webb is also presenting and his sessions are always amazing. I have to credit Chris in part for where I am today, because his blog kept me sane and his generosity during sessions meant that I never felt stupid asking him questions. I’m learning too – always.

Ok, that’s five things but there are plenty more. Why not see for yourself?

Join me at the conference, June 12–13, 2017 in Seattle, WA — and be sure to sign up for your 1:1 session with a Microsoft expert.

PASS BA Speaker focus: What did Steve Jobs have to say about Daniel Fylstra?

PASS BA Conference is delighted to have Daniel Fylstra speaking at our conference. In fact, he was the first speaker we signed up.

Here’s what Steve Jobs had to say:

“There have been two real explosions that have propelled the industry forward. The first one really happened in 1977, and it was the spreadsheet. I remember when Dan Fylstra, who ran the company that marketed the first spreadsheet, walked into my office at Apple one day and pulled out this disk from his vest pocket and said “I have this incredible new program — I call it a Visual Calculator,” and it became VisiCalc. And that’s what really drove — propelled — the Apple II to the success it achieved.”

I think it’s great that we have such a visionary attending our PASS BA Conference. I’m looking forward to meeting Daniel, and picking up his insights from 30 plus years in the industry. Jobs mentions an event that took place when I was only six years old, and look at how the industry has grown since then. We have Excel’s 30th Birthday this year.

We can say that the team of Dan Bricklin, Bob Frankston and Daniel Fylstra are great innovators who have changed the industry fundamentally. Although they didn’t invent the spreadsheet, it was Daniel Fylstra who suggested it would be a viable product if it could run on an Apple II computer. VisiCalc was born.

Ever used the Microsoft Excel Solver? Well, here is Daniel Fylstra’s paper on the Microsoft Excel Solver, and its design and use. If you’ve ever used it, you should tip your hat to the great team who put it together. Daniel is one of those people. More recently, Fylstra has been working on integrated simulation, data mining and optimization in Excel. Here’s a recent paper from the ACM.

The insight here is that there are great innovators and thinkers that we can learn from. They seek better ways of doing things, and they bring insights from other disciplines. We are inspired by people who see the bigger picture, and I’m personally looking forward to meeting Daniel and thanking him for all he has done for this industry.

Make sure you sign up for the Conference, and take the opportunity to meet a legend in the industry.

Day 6: The Data Analysts Toolkit: Why are Excel and R useful together, and how do we connect them?
Why is analytics interesting? Well, companies are starting to view it as profitable. For example, McKinsey showed analytics was worth 100Bn today, and estimated to be over 320Bn by 2020.
When I speak to customers, this is the ‘end goal’ – they want to use their data in order to analyse and predict what their customers are saying to them. However, it seems that folks can be a bit vague on what predictive modelling actually is.

I think that this is why Power BI and Excel are a good mix. Together, they make concepts like Predictive Modelling accessible, after a bit of a learning curve. Excel is accessible and user-friendly, and we can enhance our stats delivery using R as well as Excel.

One area of interest is Predictive Modelling. This is the process of using a statistical model to predict the value of a target variable. What does this actually mean? Predictive modelling is where we work to predict the values in new data, rather than trying to explain an existing data set. To do this, we work with variables. By their nature, these vary; if they didn’t, they would be called constants.

One pioneer was Francis Galton, who was a bit of an Indiana Jones in his day.  Although he wrote in the 19th century, his work is considered good and clear enough to read today. Therefore, this research has a long lineage, although it seems to be a new thing. We will start with the simplest: linear regression.

Linear regression compares two variables x and y to answer the question, “How does y change with x?” For predictive modelling, we start out with what are known as ‘predictor variables’; in terms of this question, this would be x. The result is called the target variable. In this question, this would be y. Why would we do this?

  • Machine Learning
  • Statistics
  • Programming with Software
  • Programming with Data 
  • Fun!

Why would businesses work with it at all?

  • to discover new knowledge and patterns in the data
  • to improve business results 
  • to deliver better customised services

If we have only one predictor variable, and the response and the predictor variable have a linear relationship, the data can be analyzed with a simple linear model. When there is more than one predictor variable, we would use multiple regression. In this case, our question would be: “How does y change with multiple x?”

In fitting statistical models in which some variables are used to predict others, what we want to find is that the x and y variables do not vary independently of each other, but that they tend to vary together. We hope to find that y is varying as a straight-line function of x.
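
To make "y varying as a straight-line function of x" a little more concrete, here is a minimal sketch of an ordinary least squares fit. It is written in Python purely for illustration (this series itself uses R), and the function names `fit_line` and `predict`, along with the sample numbers, are made up for the example rather than taken from any real data set.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and sum of products of the x and y deviations
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x does not vary, so no line can be fitted")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x_new, slope, intercept):
    """Predict the target variable y for a new value of the predictor x."""
    return intercept + slope * x_new


# Hypothetical data: x is the predictor variable, y is the target variable
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # exactly y = 1 + 2x, so there is no 'noise'

slope, intercept = fit_line(xs, ys)
print(slope, intercept)               # 2.0 1.0
print(predict(6, slope, intercept))   # 13.0
```

The slope answers "How does y change with x?": here, y goes up by 2 for every extra unit of x. Real data would have noise, so the points would scatter around the line rather than sit exactly on it.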

If we were to visualise the data, we would hope to find a pleasing line chart which shows y and x  relating to each other in a straight line, with a minimal amount of ‘noise’ in the chart. Visualising the data means that the relationship is very clear; analysing the data means that the data itself is robust and it has been checked.

I think that’s why, in practice, Power BI, Excel and R work well together. R has got some great visualisations, but people are very comfortable with Excel for visualisations. All that loading packages stuff you have to do in R… it doesn’t work for everyone. So we use R and Excel, at a high level, as follows:

  • We cleanse and prepare data with Excel or Power Query
  • We use RODBC to load data into R
  • We analyse and verify the data in R
  • We build models in R
  • We load the data back into Excel using RODBC
  • We visualise the data for results

Excel is, after all, one of the world’s most successful software applications ever, with reputedly over one billion users. Using them both together means that you get the best of both worlds: R for analysis and model building; Excel as the ‘default’ for munging data around and visualising it. I’m sure that one of the most popular buttons on software such as Tableau, QlikView et al is the ‘Export to Excel’ or ‘Export to CSV’ functionality. I’d be interested to know what people think about that!

Building linear regression models in R is very simple; in our next session, we will look at how to do that, and then how to visualise it in Excel. Doing all this is easier than you think, and I will show you how.

The Data Analysts Toolkit Day 5: How do R and Power BI fit together?

How do R and Power BI fit together?

Technically, it is about munging data around between R and Excel and the Power BI components. 
You can use RODBC to move data between R and SQL Server, or between R and Excel. Alternatively, you can import the data in.
Why else might you use R?

  • Pivot Tables are not always enough
  • Scaling Data (ScaleR)
  • R is very good at static data visualisation but Power BI and Excel are very good at dynamic data visualisation
  • You want to double check your results or do further analysis
They complement one another; they do not replace one another.
You may have heard my story about one organisation calculating the median incorrectly. 
The truth is, people don’t often check their data. I help design data warehouses all the time, and I don’t always hear people talk about reconciliation. I do hear about people racking up SSAS and diving straight in.
Just because something works technically, does not mean it is correct.
Upworthy and Facebook use R. A lot. So why not you? It is achievable.
Why R, and not some other package?
  • R is the most widely used data analysis software – used by 2M+ data scientists, statisticians and analysts
  • Most powerful statistical programming language
  • Used with RStudio, it can help your productivity
  • Create beautiful and unique data visualisations – as seen in New York Times, Twitter and Flowing Data
  • Thriving open-source community – leading edge of analytics research
  • Fills the talent gap – new graduates prefer R.
  • It’s fun!
Excel is used by an estimated 1.3 billion people on the planet. That sounds really impressive, until you think that many people are often using it wrong!
R just helps you to do that double check, the sanity check, to see if your data is correct.
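
As a small illustration of that kind of sanity check, the sketch below recomputes a median independently and compares it with a reported figure. It is in Python rather than R purely for illustration. The `naive_median` function is a made-up example of one way a median can go wrong (taking the middle item without sorting first); it is not the actual mistake from the story above, and the order values are invented.

```python
from statistics import median


def naive_median(values):
    """A deliberately flawed median: takes the middle item WITHOUT sorting first."""
    return values[len(values) // 2]


def check_median(values, reported):
    """Return True if the reported median matches an independent recalculation."""
    return median(values) == reported


# Hypothetical order values, in the order they arrived (not sorted)
orders = [120, 15, 300, 42, 8]

reported = naive_median(orders)  # just the item sitting in the middle position
correct = median(orders)         # the middle item once the values are sorted

print(reported, correct, check_median(orders, reported))  # 300 42 False
```

The flawed version runs without any error, which is the point: it works technically, but only a second, independent calculation shows that the answer is wrong.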

Democratization of Data: From Ideas to Decisions with Power BI

“Don’t worry about people stealing an idea. If it’s original, you will have to ram it down their throats.” Howard Aiken, Founder of Harvard’s Computing Science Program.

Data is moving so fast these days, and there is a shift whereby people are paying for value, not technology. This is where cloud computing comes in: it is very empowering, because anyone with an internet connection can access it. With Power BI in the cloud, small businesses are liberated with the ability to use the same tools and techniques to explore ideas as larger organisations.

In this session, we will look at understanding the Power BI components and tools available in the cloud, including the Power BI Admin Center, Power Query, Power Pivot, Power View and Power Map. We will look at how using them will accelerate ideas and help to clarify decisions and, related to this, discuss the roles within IT and the business in relation to these tools. We will also look at business puzzles versus business mysteries, a definition evoked by Malcolm Gladwell (Blink, Outliers), in relation to Power BI.

“Out there in some garage is an entrepreneur who’s forging a bullet with your company’s name on it,” said Gary Hamel, a management guru. With Power BI, let’s see how you can translate your ideas into a message that people can see, using the cloud as an empowerment tool.


How do you choose the right data visualisation in Power BI to show your data?

How do you choose the right visualisation to show your data? Usually the customer wants one thing, the business user wants something else, the business sponsor wants something flashy... and it’s hard to tease out the requirements. That’s before you’ve even opened up a tool such as Power View, Excel, Tableau or whatever your preferred data visualisation software is.

In other words, there are simply too many charts to choose from, and too many requirements to meet. Where do you start?

I found this fantastic diagram which can help you to choose the right visualisation. I’m often surprised to see that people haven’t seen this before. Note: this diagram was done by Andrew Abela of Extreme Presentation. The source is here, and his email address is on the slide, so be sure to thank him if you’ve found it useful. If you can’t see it very well, click here to go to the source.


Chart Choosers should not replace common sense, however, and Naomi Robbins has written a nice piece here which is aimed at the wary. However, diagrams like Abela’s can really help a novice to get started, and for that, I’d like to thank him for his work.

How does it relate to Microsoft’s Power BI? If you look at the visualisations that are available in Power View, you can see that most of the visualisations in the diagram are available in Power BI. The ones that are excluded are the 3D graphs, circular area charts, variable width charts and the waterfall chart.

Why no 3D? I personally hope that Microsoft will leave 3D out of Power BI tools, unless of course it is in Power Map. With 3D on a chart, it is harder to identify the endpoints, and it can take us longer to read the values. It might also mean that points are occluded. If you’re interested and want to see examples, here is one by the Consultant Journal team or you can go ahead and read Stephen Few’s work. If you haven’t read anything by Stephen Few, get yourself over to his site right now. You won’t regret it. Why is Power Map different? 3D maps provide context, and they are the exception where I will use 3D for a data visualisation showing business data. I’m obviously excluding other types of non-business data here, such as medical imaging and so on.

Why no circular area or variable width charts? I am not a fan of variable width or circular area charts because we aren’t very good at evaluating area when we look at charts and graphs, and Robert Kosara has an old-but-good post on this topic here.

This blog is mainly for me to remember stuff but I hope it helps someone out there too.

Best Wishes,

Eating the Elephant: Totally free videos showing an introduction to Visualising Big Data for Business Intelligence Professionals

Continuing my ‘eating the Elephant, one bite at a time’ series, which focuses on Microsoft Business Intelligence and Hadoop, I’ve put together a series of totally free videos, to help people who are interested in visualising Big Data using familiar tools in Microsoft. The purpose is to take data from various data sources, including SQL Server, HDInsight (Microsoft’s distro of Hadoop) and Excel, and visualise the data via PowerPivot and Excel.

Self-Service BI – and Big Data – a Business Intelligence person’s dream! Well, it is for me!

As I say in these videos, Excel is the world’s favourite software for Business Intelligence, and it must surely rank as one of the most popular software applications of all time. Excel is used (and abused!) more than any other software I’ve seen.

I hope that you will enjoy the videos and I look forward to your feedback. You can access them on YouTube here

Please note: in the process of practising for my Big Data precon at SQLPass Summit in Charlotte on 15th October, I reused the material from a fantastic blog post by Cindy Gross of the SQLCat team and I’d like to thank Cindy and her team for writing this material.

The blog post is here and I recommend that you go through it. I just videoed it, but the material belongs to them, and I’d like to make sure that they get the credit for the blog post, which is why I’m calling it out.

Please note that this isn’t material from my actual precon – it’s simply a way for me to work through preparing for the precon I’m presenting jointly with Allan Mitchell (SQL Server MVP). I have simply put it in video format in order to practise my delivery, and then it struck me that people might find this useful. If so, look out for more videos in future!

I hope it helps.
Kind Regards,