As a consultant, I get parachuted into difficult problems every day. Often, I figure it out because I have to, and I want to. Usually, nobody else can do it other than me – they are all keeping the fires lit. I get to do the thorny problems that get left burning quietly. I love the challenge of these successes!

How do you get started? The online and offline courses, books, MOOCs, papers, blogs and the forums help, of course. I regularly use several resources for learning but my number one source of learning is:

Doing the ‘do’ – working on practical projects, professional or private

Nothing beats hands-on experience. 

How do you get on the project ladder? Without experience, you can’t get onto a project, and without a project, you can’t get experience. So you end up in a difficult chicken-and-egg situation.

Volunteer your time in the workplace – or out of it. It could be a professional project or your ‘data science citizen’ project that you care about. Your boss wants her data? Define the business need, and identify what she actually wants. If it helps, prototype to elicit the real need. Volunteer to try and get the data for her. Take a sample and just get started with descriptive statistics. Look at the simple things first.
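If "take a sample and look at the simple things first" feels abstract, here is a minimal sketch of what that first pass might look like in Python. The order values are made-up stand-in data, and `describe` is a hypothetical helper name, not something from AzureML or any particular library.

```python
import random
import statistics


def describe(values):
    """Return simple descriptive statistics for a list of numbers."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        # The sample standard deviation needs at least two points.
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }


# Hypothetical data: 1000 order values standing in for the data your boss wants.
random.seed(42)
order_values = [round(random.uniform(10, 500), 2) for _ in range(1000)]

# Take a sample first rather than working on everything at once.
sample = random.sample(order_values, 100)
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Even a summary this small tends to raise useful questions: is the minimum plausible, is the mean far from the median, and is that what the business expected?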

Not sure of the business question? Try the AzureML Cheat Sheet for help.


Working with data means that you will be challenged by real situations, and you will read and learn more because you have to in order to deliver.

In my latest AzureML course with Opsgility, I take this practical, business-centred approach for AzureML. I show you how to take data, difficult business questions and practical problems, and I show you how to create a successful outcome; even if that outcome is a failed model, it still makes you revise the fundamental business question. It’s a safe environment to get experience.

So, if this is you – what’s the sequence? There are a few sequences or frameworks to try:

  • TDSP (Microsoft)
  • KDD

The ‘headline’ of each framework is given below, as a reference point, so you can see for yourself that they are very different. The main thing is to simply get started.

Team Data Science Process (Microsoft)











It’s important not to get too wrapped up in comparing models; this could lead to analysis paralysis, and that’s not going to help.

I’d suggest you start with the TDSP because of the fine resources, and take it from there.

I’d be interested in your approaches, so please do leave comments below.

Good luck!

See you at Techorama?


Why should you go to Techorama?

Techorama is a yearly international technology conference which takes place at Metropolis Antwerp. We welcome about 1500 attendees, a healthy mix of developers, IT Professionals, Data Professionals and SharePoint professionals.

I’m delighted to announce I’m speaking, and I’d like to take this opportunity to thank the Techorama team for all of their hard work and effort in putting on a great show.


First off, there will be a keynote by Scott Guthrie, EVP of Cloud + Enterprise, Microsoft Corporation – now this is BIG NEWS.

Scott Guthrie, EVP of Cloud + Enterprise, Microsoft Corporation will be keynoting at Techorama 2017 (May 23). In his keynote, “Azure, The Intelligent Cloud”, Scott will open the event with a strategic vision on the Microsoft cloud.

Scott Guthrie will also give another breakout session on May 23 which will be a Q & A session. Come with your questions!




The event itself will offer top-notch content for Developers, IT Professionals, Data Professionals and SharePoint Professionals: 11 parallel breakout sessions with top speakers from all over the world, experts in their field, plus meaningful networking opportunities with partners and like-minded people.

There will also be a unique conference experience in a movie theatre with lots of surprises!

What will I be talking about? You can find out more here at my dedicated Techorama page.

Data Visualisation Lies and How to Spot them

During the acrimonious US election, both sides used a combination of cherry-picked polls and misleading data visualization to paint different pictures with data. In this session, we will use a range of Microsoft Power BI and SSRS technologies in order to examine how people can mislead with data and how to fix it. We will also look at best practices with data visualisation. We will examine the data with Microsoft SSRS and Power BI so that you can see the differences and similarities in these reporting tools when selecting your own Data Visualisation toolkit.

Whether you are a Trump supporter, a Clinton supporter or you don’t really care, join this session to spot data lies better in order to make up your own mind.

We hope to welcome you at Techorama 2017!


Data Preparation in AzureML – Where and how?

One question that keeps popping up in my customer AzureML projects is ‘How do I conduct data preparation on my data?’ For example, how can we join the data, clean it, and shape it so that it is ready for analytics? Messy data is a problem for every organisation. If you don’t think it is an issue for your organisation, perhaps you haven’t looked hard enough.

To answer the question properly, we need to stand back a little, and see the problem as a part of a larger technology canvas. From the enterprise architecture perspective, it is best to do data preparation as close to the source as possible. The reason for this is that the cleaned data would act as a good, consistent source for other systems, and you would only have to do it once. You have cleaned data that you can re-use, rather than re-do for every place where you need to use the data.

Let’s say you have a data source, and you want to expose the data in different technologies, such as Power BI, Excel and Tableau. Many organisations have a ‘cottage industry’ style of enterprise architecture, where they have different departments using different technologies. It is difficult to align data and analytics across the business, since the interpretation of the data may be implemented in a manner that is technology-specific rather than business-focused. If you take a ‘cottage industry’ approach, you would have to repeat your data preparation steps across different technologies.


When we come to AzureML, the data preparation perspective isn’t forgotten, but it isn’t a strong data preparation tool like Paxata or Datameer, for example. It’s the democratization of data for the masses, yes, and I see the value it brings to businesses. It’s meant for machine learning and data science, so you should expect to use it for those purposes. It’s not a standalone data preparation tool, although it does help you partway.

The data preparation facilities in AzureML can be found here. If you have to clean up the data in AzureML, my futurology ‘dream’ scenario for AzureML is that Microsoft have weighty data preparation as a task, like other tasks in AzureML. You could click on the task, and then have roll-your-own data preparation pop up in the browser (all browser based) provided by Microsoft or perhaps have Paxata or Datameer pop out as a service, hosted in Azure as part of your Azure portal services. Then, you would go back to AzureML, all in the browser. In the meantime, you would be better off trying to follow the principle of cleaning it up close to the source.

Don’t be downhearted if AzureML isn’t giving you the data preparation that you need. Look back to the underlying data, and see what you can do. The answer might be as simple as writing a view in SQL Server. AzureML is for operations and machine learning further downstream. If you are having serious data preparation issues, then perhaps you are not ready for the modelling phase of CRISP-DM so you may want to take some time to think about those issues.
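To make the ‘view close to the source’ idea concrete, here is a small sketch. It uses Python’s built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere; the table, columns and cleaning rules are invented for illustration, and the exact SQL will differ on SQL Server (for example, older versions need LTRIM and RTRIM rather than TRIM).

```python
import sqlite3

# In-memory database standing in for the source system (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_customers (id INTEGER, name TEXT, country TEXT);
    INSERT INTO raw_customers VALUES
        (1, '  Alice ', 'uk'),
        (2, 'Bob', 'UK '),
        (3, NULL, 'be'),
        (4, 'Carol', NULL);

    -- The cleaning rules live in one place, next to the data.
    CREATE VIEW clean_customers AS
    SELECT id,
           TRIM(name) AS name,
           UPPER(TRIM(COALESCE(country, 'unknown'))) AS country
    FROM raw_customers
    WHERE name IS NOT NULL;
""")

# Every downstream tool reads the same cleaned shape from the view.
rows = conn.execute(
    "SELECT id, name, country FROM clean_customers ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

The point is not the specific rules but where they live: Power BI, Excel, Tableau and AzureML could all read from the view, so the clean-up is written once rather than repeated in each tool.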

PASS Business Analytics Day, Jan 11, Chicago


PASS’ first Business Analytics Day will be held in Chicago on January 11, 2017. You can choose one of two full-day, in-depth sessions for $595: In-Database Analytics with R and SQL Server 2016 and Mastering Power BI Solutions.

These are unique learning opportunities to get more advanced in R or data visualization with Power BI. And as with other PASS events, the goal is to allow you to walk away with real-world analytics knowledge that you can use immediately!

PASS Business Analytics Day

You have two great choices: In-Database Analytics with R and SQL Server 2016 and Mastering Power BI Solutions.

In-Database Analytics with R and SQL Server 2016

With Microsoft SQL Server 2016, data scientists can run in-database analytics using R. This is a “best of both worlds” scenario: delegate database management to SQL Server whilst you create analytics and visualisations in R and Power BI. In this session, we will cover the overall architecture of SQL R Services and go over some best practices. We will look at best practices in analytics and visualisations with a focus on R, and then we delve more in-depth into some practical common use-cases.

David Smith, R Community Lead at Revolution Analytics, a Microsoft Company
Seth Mottaghinejad, Data Scientist, Microsoft

Mastering Power BI Solutions

In this Power BI hands-on Workshop, you will master the “power” of Power BI. Learn to use self-service and enterprise-scale Power BI capabilities; gain valuable skills to integrate, wrangle, shape and visualize data for analysis. Beginning and intermediate level users will learn to address data and reporting challenges with advanced design techniques.

Paul Turley, Mentor with SolidQ, BI Architect, and Microsoft Data Platform MVP

Date: January 11, 2017

Location: Microsoft Technology Center, #200 – 200 East Randolph Drive, Chicago, IL.

We hope you’ll join us!

Guess who is appearing in Joseph Sirosh’s PASS Keynote?

This girl! I am super excited and please allow me to have one little SQUUEEEEEEE! before I tell you what’s happening. Now, this is a lifetime achievement for me, and I cannot begin to tell you how absolutely and deeply honoured I am. I am still in shock!

I am working really hard on my demo and….. I am not going to tell you what it is. You’ll have to watch it. Ok, enough about me and all I’ll say is two things: it’s something that’s never been done at PASS Summit before and secondly, watch the keynote because there may be some discussion about….. I can’t tell you what… only that, it’s a must-watch, must-see, must do keynote event.

We are in a new world of Data and Joseph Sirosh and the team are leading the way. Watching the keynote will mean that you get the news as it happens, and it will help you to keep up with the changes. I do have some news about Dr David DeWitt’s Day Two keynote… so keep watching this space. Today I’d like to talk about the Day One keynote with the brilliant Joseph Sirosh, CVP of Microsoft’s Data Group.

Now, if you haven’t seen Joseph Sirosh present before, then you should. I’ve put some of his earlier sessions here and I recommend that you watch them.

Ignite Conference Session

MLDS Atlanta 2016 Keynote

I hear you asking… what am I doing in it? I’m keeping it a surprise! Well, if you read my earlier blog, you’ll know I transitioned from Artificial Intelligence into Business Intelligence and now I do a hybrid of AI and BI. As a Business Intelligence professional, my customers will ask me for advice when they can’t get the data that they want. Over the past few years, the ‘answer’ to their question has gone far, far beyond the usual on-premise SQL Server, Analysis Services, SSRS combo.

We are now in a new world of data. Join in the fun!

Customers sense that there is a new world of data. The ‘answer’ to the question ‘Can you please help me with my data?’ is complex and varied, and it’s very much aimed at cost sensitivities, too. Often, customers struggle with data because they now have a Big Data problem, or a storage problem, or a data visualisation access problem. Azure is very neat because it can cope with all of these issues. Now, my projects are Business Intelligence and Business Analytics projects… but they are also ‘move data to the cloud’ projects in disguise, and that’s in response to the customer need. So if you are a Business Intelligence professional, get enthusiastic about the cloud because it really empowers you with a new generation of exciting things you can do to please your users and data consumers.

As a BI or an analytics professional, cloud makes data more interesting and exciting. It means you can have a lot more data, in more shapes and sizes, and access it in different ways. It also means that you can focus on what you are good at, and make your data estate even more interesting by augmenting it with cool features in Azure. For example, you could add in more exciting things such as the Apache Tika library as a worker role in Azure to crack through PDFs and do interesting things with the data in there. If you bring it into SSIS, then you can spin it up and tear it down again when you don’t need it.

I’d go as far as to say that, if you are in Business Intelligence at the moment, you will need to learn about cloud sooner or later. Eventually, you’re going to run into Big Data issues. Alternatively, your end consumers are going to want their data on a mobile device, and you will want easy solutions to deliver it to them. Customers are interested in analytics and the new world of data and you will need to hop on the Azure bus to be a part of it.

The truth is, Joseph Sirosh’s keynotes always contain amazing demos. (No pressure, Jen, no pressure….. ) Now, it’s important to note that these demos are not ‘smoke and mirrors’….

The future is here, now. You can have this technology too.

It doesn’t take much to get started, and it’s not too far removed from what you have in your organisation. AzureML and Power BI have literally hundreds of examples. I learned AzureML looking at the following book by Wee-Hyong Tok and others, so why not download a free book sample?

How do you proceed? Well, why not try a little homespun POC with some of your own data to learn about it, and then show your boss? I don’t know about you but I learn by breaking things, and I break things all the time when I’m learning. You could download some Power BI workbooks, use the sample data and then try to recreate them, for example. Or, why not look at the community R Gallery and try to play with the scripts? You broke something? No problem! Just download a fresh copy and try again. You’ll get further next time.

I hope to see you at the PASS keynote! To register, click here: 

WPC Day One: Translating Digital Transformation into Solutions

I blogged over at my ‘official’ company blog about strategic considerations regarding Digital Transformation. There is a lot of messaging directed at sales, partners and CEO level conversations. For the techies, however, how does the strategy translate into a technical implementation that you can actually deliver, to facilitate Digital Transformation within the organisation? In other words, how do you make solutions that are sustainable and relevant?

Microsoft can help with modern, cloud-based tools and a cloud platform. Partners have the ability to use tools such as Office365, Power BI, Microsoft Flow and AzureML to reduce the integration cost and friction to deliver technical solutions. These partners can speak directly to the digital transformation, and lead it. These tools can form composable units or modules, which can be fitted together to meet business needs directly, thereby facilitating digital transformation.

What are these tools? During the WPC keynote, Ecolabs showed off their solution, which involved Power BI and Microsoft Flow. Here is the example Microsoft Power BI Solution:
WPC Day 1 Slides
Microsoft Flow is a new tool, which was used to create some of the workflows to align the productivity processes with the resulting dashboard.

What is Microsoft Flow? Well, it’s a great little app and I think you should take a look. Microsoft Flow allows you to create automated workflows between your business or consumer applications and services, connecting them so that you can get notifications, synchronize files, collect data, and take other actions that might be useful to your business.

Why is that useful for a Business Intelligence implementation? Well, it can help to track where your data is going. As someone who often goes into organisations where people have ‘lost’ data or it is hiding somewhere that the business people can’t get it, I see Microsoft Flow as a way forward for Digital Transformation in the business by facilitating the flow of data around the organisation.

You can even create workflows on your mobile device. Here is the Ecolabs example from WPC:
WPC Day 1 Slides
Basically, a Flow connects your web services, files, and cloud-based data to save time and effort for everyone, every day.

It’s good to see that Microsoft are a much more open organisation these days; I think that Microsoft Flow is evidence of the open attitude towards other companies, organisations and methodologies that are outside of the Microsoft corporate boundary. In particular, I am a huge fan of Wunderlist and they mentioned it yesterday during the Day One keynote. I know that Wunderlist have been bought by Microsoft and I hope that Wunderlist will appear in Office365 soon, such as in Outlook.

How does Flow work? Well, you start with a template, which gives you a great head start. Why not give it a blast? If it means you get to use Wunderlist as well for all of your lists, and start to love it, then you can thank me!


You could even use Microsoft Flow for new GitHub issues, and send a notification to Slack. Or perhaps you could use Flow so that you retain Dropbox as your file storage system, integrated with Office365. The examples are endless, I think.
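Flow itself is configured through templates in the browser rather than code, but the underlying idea of ‘when this happens, do that’ is easy to sketch. The following is a toy illustration only: the `Flow` class, the `issue_opened` event type and `notify_chat` are all hypothetical names, and nothing here calls the real Microsoft Flow, GitHub or Slack services.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Flow:
    """A toy workflow: when an event of the trigger type arrives, run each action."""
    trigger: str
    actions: list[Callable[[dict], str]] = field(default_factory=list)

    def handle(self, event: dict) -> list[str]:
        # Ignore events that this workflow is not listening for.
        if event.get("type") != self.trigger:
            return []
        return [action(event) for action in self.actions]


def notify_chat(event: dict) -> str:
    # Stand-in for posting to a chat channel; it only builds the message.
    return f"New issue #{event['number']}: {event['title']}"


issue_flow = Flow(trigger="issue_opened", actions=[notify_chat])
results = issue_flow.handle(
    {"type": "issue_opened", "number": 7, "title": "Dashboard is blank"}
)
print(results)
```

A template in Flow plays roughly the role of the pre-built `issue_flow` here: the trigger and the actions are already wired together, and you supply your own accounts and details.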

All this shows that the cloud is a great enabler, and a platform, which partners and companies can use in order to make their organisations more productive and collaborative. These are simple examples, and I’m sure that you can think of more! The integrations all happen in the cloud, and it is one way that the cloud can be used as a tool for Digital Transformation.

5 Things I need you to do if you want me to nominate you for an MVP Award


It’s great to see so many people want to participate in the MVP Program. I find that I’m being asked fairly frequently at the moment – say, a couple of times a week – by community individuals if I will nominate them.

Here are some disclaimers:

  • I have no influence over the MVP Program at all
  • I consider myself lucky to be part of the Program. It is a gift, not an entitlement, and it can be gone at any time.
  • The people who nominated me were not my friends, apart from one person (thank you Andrew!). These were generous people who gave their time to nominate me, and it turns out I was nominated by a lot of people, over a period of time, before I got the Award.

I tend to be happy to nominate people if they ask me; after all, it’s not my decision and it may be good for the Program as well as the individual. From my experience, it wasn’t my community ‘friends’ who nominated me, it was people who didn’t know me very well but they could see that I was making a positive difference in the community. I see the MVP Award as a ‘golden ticket’ to do even more positive things for the community; it is about being other-centered, and not self-centered, I think.

I don’t see that I am an expert now that I’ve been given an Award. Throughout his book Outliers, Gladwell repeatedly mentions the “10,000-Hour Rule”, claiming that the key to achieving world class expertise in any skill is, to a large extent, a matter of practicing the correct way for a total of around 10,000 hours. What he doesn’t say is the next step: the world is moving so fast, you have to keep working all the time to stay on top. So that means that other things sometimes have to be let go.

My brother, a wise man, once told me that ‘it’s lonely at the top’ when I complained about the number of ‘real’ friends I’d lost, particularly over the past two years. Although I don’t see myself at the top of anything (unless it is a complete mess) I see that, sometimes, other people do; and that’s why they ask for the nomination. If I can inspire someone to do good things for the community, then that’s a good thing for me. In fact, leaders should leave a plan and a structure behind them in their trail; good leaders look at what they leave behind them, as well as looking far forward into the future.

I do nominate people myself, and sometimes I’m lucky that they get the Award after one or more nominations e.g. Stephanie Locke, Mark Wilcock, Ryan Adams and Mark Broadbent, but sometimes I nominate and it doesn’t happen for the nominee. I do try to nominate people who I can see are in my ‘trail’ and hopefully, if anything, my life will serve as a cautionary tale and a ‘teachable moment’ for others.

So, what do I need you to do for me? Tell me, in your own words:

  1. Your community activities. Please list them out for me. Don’t assume that I know. I don’t remember what I did, last week. I certainly will have very little clue what you did, even if you were with me.
  2. What you think you’d contribute to community life for Microsoft, their product groups, and the people who work at Microsoft. They are people too and I love most of the ones that I come across. Be generous with your time with Microsoft people too; don’t assume that, because they work for a massive company, they aren’t under pressure or really busy. Trust me. They are. Don’t criticise without offering to help first.
  3. The area of expertise you think you bring to the MVP Program. I know we are all Data Platform these days, but it makes things simple.
  4. What would you like to do for the Program?
  5. Tell me more about you. Help me to find a thread that makes you unique, and stand out a little.

I know it seems a lot. I’m busy and I need help filling out the form, and I want to do a good job for you. If you can’t be bothered to give me these things, well, you can’t really expect me to spend hours collating all of this information for you! I can tweak it so it’s good English (for example) but you will help me a lot if you can be your own voice. I don’t want to miss something out because I forgot to put something in.

What you could do in return: say thanks to me, ask other people to nominate you too, and, most of all, nominate people yourself. Be generous with your time.

Help me to help you.