10 Golden Rules of MVP Summit

After seven years as an MVP, I’m still starry-eyed because I’m privileged to be part of the program. Thank you Microsoft for having me to the MVP party, and for providing me with this great award. I’ll continue to strive to be worthy of your trust and acknowledgement.

If you’re attending MVP Summit for the first or even the nth time, then there are a few golden rules that might help you to make the most of it. Here are my golden rules. Feel free to add some more in the comments.

Note that this isn’t an official, Microsoft-endorsed list. It’s just my thoughts.

  1. Keep the NDA and adhere to any Microsoft Code of Conduct that’s in place. Essentially, I boil this down to the following ethic: be decent.
  2. Be nice to Microsoft team members. This is huge for me. I know you’re excited to be an MVP but that doesn’t mean you have the right to give grief to any Microsoft team members. Be polite and respectful when you ask questions and provide feedback. Leave the ego at the front door; it’s not required. We all want to give Microsoft solutions even more sparkle because we can see the difference that they make to people’s lives every day, and we can help bring that colour to conversations. Microsoft Program Managers have most likely heard your feedback already, from the thousands and thousands of customers who provide feedback. You’re probably not providing ‘news’ but rather confirming something they’ve heard before. So please make sure to thank your Microsoft community leads, all of the Product Groups, and all of the Microsoft staff who have taken time to present, answer your questions, take your feedback, and generally participate. I mean everyone, right down to the people you may not ‘see’: for example, the Microsoft team member who serves your coffee in the morning or takes away your cups. That person got up early and travelled to work to stand and make your coffee all day. Thank them. Ask them how they’re doing. They are part of the event too.
  3. Be nice to the people who look after the shuttles and take you safely back to your hotels. These people stand outside for hours of the day in the Seattle rain, making sure that you get to where you need to be. If your shuttle is a bit delayed, it’s not the end of the world. It will arrive, and you’ll be on your way. But that person doesn’t need grief from you, or the next ‘you’ who comes after you. They’re doing their job and an appreciative ‘Thank You’ costs you nothing.
  4. Go to many events and make friends. Personally, I don’t like crowds but I make a real effort here. I usually find a ‘friendly face’, which brings me to my next point.
  5. Be a ‘friendly face’ for someone. Be that person that reaches out. Be the person that you’d have liked to have met, on your first visit to MVP Summit.
  6. Be gentle when others ask questions. Not everyone has English as a first language, and the diversity aspect is one of the main drivers for my attendance. People may go to a talk that is way out of their normal environment, but they deserve the same chance to learn.
  7. Be constructive in conversations. Microsoft folks work superlatively hard and you can be part of the customer voice if you’re providing feedback. It’s an incredible privilege.
  8. Remember that your individual scenario may not extrapolate to the whole population. That feature you really, really want? There are probably other people who really, really want another feature. That shouldn’t stop you from asking for a great product feature, but it’s important to acknowledge that the product teams have the ‘long view’: they make the decisions and they have visibility of things that you don’t see.
  9. Offer to help Microsoft folks. If a team member can’t or won’t answer your question, remember it is a gift that we are even there in the first place. Just offer to help.
  10. Diversity is important. It’s a diverse program and I hope you’ll make the most of the opportunity to make friends from all over the globe, with different perspectives and cultures.

I’ll be attending so I wanted to add one more thing:

I’ll go with you.

Yes, this. If you need it, I’m there for you. #IllGoWithYou. My Twitter handle is jenstirrup if you want to get in touch.


We are part of a privileged group, but it’s important to recognise that it is a gift from Microsoft. I look forward to seeing you there.

Finally –

Enjoy yourself! In the words of Bill and Ted, Be Excellent to each other. And yourself.


Dynamic Data Masking in Azure SQL Data Warehouse

I’m leading a project which is using Azure SQL Data Warehouse, and I’m pretty excited to be involved. I love watching the data take shape, and, for the customer requirements, Azure SQL Data Warehouse is perfect.

Note that my customer details are confidential and that’s why I never give details away such as the customer name and so on. I gain – and retain – my customers based on trust, and, by giving me their data, they are entrusting me with detailed information about their business.

One question they raised was in respect of dynamic data masking, which is present in Azure SQL Database. How does it manifest itself in Azure SQL Data Warehouse? What are the options regarding the management of personally identifiable information?


As we move ever closer to the implementation of GDPR, more and more people will be asking these questions. With that in mind, I did some research and found there are a number of options, which are listed here. Thank you to the Microsoft people who helped me to come up with some options.

1. Create an Azure SQL Database spoke as part of a hub and spoke architecture.

The Azure SQL Database spoke can create external tables over Azure SQL Data Warehouse tables in order to move data into the spoke. One note of warning: it isn’t possible to use DDM over an external table, so the data would have to be moved into Azure SQL Database first.
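
As a flavour of what dynamic data masking looks like once the data has been moved into the Azure SQL Database spoke, here is a minimal sketch using Python and the pyodbc driver; the server, database, table, columns and masking functions shown are placeholders rather than anything from a real customer environment.

```python
# Hedged sketch: applying Dynamic Data Masking to a table that has been copied
# into the Azure SQL Database spoke. All connection details and object names
# below are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myspokeserver.database.windows.net;DATABASE=SpokeDb;"
    "UID=masking_admin;PWD=<secret>"
)

# DDM is declared per column; 'email()' and 'partial()' are built-in masking functions.
mask_statements = [
    "ALTER TABLE dbo.Customer ALTER COLUMN EmailAddress "
    "ADD MASKED WITH (FUNCTION = 'email()');",
    "ALTER TABLE dbo.Customer ALTER COLUMN PhoneNumber "
    "ADD MASKED WITH (FUNCTION = 'partial(0,\"XXX-XXX-\",4)');",
]

cursor = conn.cursor()
for statement in mask_statements:
    cursor.execute(statement)
conn.commit()
cursor.close()
conn.close()
```

Users without the UNMASK permission then see masked values when they query the spoke, while the underlying data is left unchanged.
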
2. Embed masking logic in views and restrict access.

This is achievable but it is a manual process.
3. Mask the data through the ETL processes, creating a second, masked column.

This depends on the need to query the data. Here, you may need to limit access through stored procs.
On balance, the simplest method overall is to use views to restrict access to certain columns. That said, I am holding a workshop with the customer in the near future in order to see their preferred options. However, I thought that this might help someone else in the meantime. I hope that you find something that will help you to manage your particular scenario.
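
For anyone who wants to see roughly what the view-based approach looks like, here is a minimal sketch, again using Python and pyodbc, against a hypothetical Customer table in Azure SQL Data Warehouse. The masking expressions, object names and role name are illustrative only, not the customer’s schema.

```python
# Hedged sketch of option 2: a view that applies manual masking logic over an
# Azure SQL Data Warehouse table, with access restricted to the view.
# Server, database, table, column and role names are all hypothetical.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=mydwserver.database.windows.net;DATABASE=MyDataWarehouse;"
    "UID=dw_admin;PWD=<secret>"
)

create_view = """
CREATE VIEW dbo.vCustomerMasked AS
SELECT CustomerKey,
       LEFT(EmailAddress, 2) + '*****'         AS EmailAddress,      -- manual mask
       'XXX-XX-' + RIGHT(NationalIdNumber, 4)  AS NationalIdNumber   -- manual mask
FROM   dbo.Customer;
"""

cursor = conn.cursor()
cursor.execute(create_view)
# Reporting users can read the masked view but not the underlying table.
cursor.execute("GRANT SELECT ON OBJECT::dbo.vCustomerMasked TO reporting_users;")
cursor.execute("DENY SELECT ON OBJECT::dbo.Customer TO reporting_users;")
conn.commit()
cursor.close()
conn.close()
```

The trade-off, as noted above, is that this is a manual process: every sensitive column needs its own masking expression, and permissions have to be managed alongside the views.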

How do you know if your org is ready for Data Science? Starting your journey with Azure Databricks

Mentioning data science at your company may give you an air of expertise, but actually implementing data science, artificial intelligence or deep learning across the enterprise is a business-wide transformation activity. It impacts your data and analytics infrastructure, engineering and business interactions, and even your organizational culture. In this post, we will look at a few high-level things to watch out for before you get started, along with a suggestion that you can try Azure Databricks as a great starting point for your cloud and data science journey.

Note: not all companies are ready for data science. Many of them are still struggling with Excel. If that sounds like your organization, this article is meant for you.

So how can you move forward?

1. Have a data amnesty

If you’re still struggling with Excel, then data science and AI can seem pretty far off. Have a data amnesty – ask everyone to identify their key data sources so you can back them up, protect them, and share them better where appropriate.

2. Determine the Data (Im)maturity in your organization.

Take a peep at the following table: where is your organization located?

(Table: Democratization of Data)

Note: this idea was inspired by Bernard Liautaud

Ideally, you’re headed towards a data democracy, where IT are happy in their guardianship role, and the users have the data they need. If this equilibrium isn’t in place, then it could potentially derail your budding data science project. Working on these issues can help your success to be sustainable in the longer term.

3. All that glitters isn’t data gold

This is the ‘shiny gadget’ syndrome. Don’t get distracted by the shiny stuff. You need your data vegetables before you can have your data candy.

Focus on the business problem you’re trying to solve, not the technology. You will need to think about the success criteria.

You should be using the technology to improve a business process, with clear goals and measurable success. Otherwise the work can be disorganized, with the technology providing only a veil of organization.


4. Fail to plan, plan to fail

If you fail… that’s ok. You learned ten things you didn’t know before. Next time, plan better, scope better, do better.

How to get started?

Starting in the cloud is a great way to begin, because you’re not purchasing a lot of technology and hardware that you don’t need. Abraham Maslow was once quoted as saying, “If you only have a hammer, you tend to see every problem as a nail.” Those words are truer than ever, as an increasingly complex and interconnected world makes it harder to select the right tools for the data estate. With that in mind, the remainder of this blog talks about Azure Databricks as a first step in data science for the new organization, in order to reduce risk, initial outlay and costs.



What is Microsoft Azure Databricks?

Azure Databricks was designed in collaboration with Microsoft and the creators of Apache Spark. It is designed for easy data science: one-click set-up, streamlined workflows and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts. Each of these roles has a different style that describes how users want to interact with, present, and share information, bearing in mind the varying skillsets of both business users and IT.

So what is Apache Spark? According to Databricks, Apache Spark is the largest open source project in data processing. From the enterprise perspective, Apache Spark has seen rapid adoption by enterprises across a wide range of industries.

So what does Apache Spark give you? Apache Spark is a fast, in-memory data processing engine. For the serious data science organisation, it allows developers to use expressive development APIs to work with data. For information and data workers, it provides the ability to execute streaming analytics, longer-term machine learning or SQL workloads – fast. Implemented in Azure, it means that business users can use Power BI to understand their data better.

Apache Spark consists of Spark Core and a set of libraries. The core is the distributed execution engine and the Java, Scala, and Python APIs offer a platform for distributed ETL application development. Spark lets you quickly write applications in Java, Scala, or Python. It comes with a built-in set of over 80 high-level operators. And you can use it interactively to query data within the shell.
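
As a flavour of how concise those operators are, here is a minimal PySpark sketch. In an Azure Databricks notebook the `spark` session already exists; outside Databricks, this assumes a local installation such as `pip install pyspark`, and the dataset and column names are purely illustrative.

```python
from pyspark.sql import SparkSession

# In Azure Databricks the `spark` session is provided; this line is only needed locally.
spark = SparkSession.builder.appName("spark-operators-demo").getOrCreate()

# A tiny illustrative dataset.
sales = spark.createDataFrame(
    [("North", 120.00), ("South", 75.50), ("North", 60.25)],
    ["region", "amount"],
)

# A handful of Spark's high-level operators chained together.
totals = (
    sales.filter(sales.amount > 50)          # transformation
         .groupBy("region")                  # wide transformation (shuffle)
         .sum("amount")                      # aggregation
         .withColumnRenamed("sum(amount)", "total_amount")
)

totals.show()   # action: triggers the distributed execution and prints the result
```
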

The Apache Spark functionality is incorporated into Azure Databricks.

In addition to Map and Reduce operations, it supports SQL queries, streaming data, machine learning and graph data processing. Developers can use these capabilities stand-alone or combine them to run in a single data pipeline use case.
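
As an illustration of mixing those libraries in one job, the sketch below combines Structured Streaming with the same DataFrame/SQL-style aggregation used for batch work. It uses Spark’s built-in `rate` test source, so there is nothing external to set up; in a real pipeline the source and sink would be replaced with your own.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import window

spark = SparkSession.builder.appName("combined-pipeline-demo").getOrCreate()

# Streaming source: the built-in 'rate' source emits (timestamp, value) rows,
# which is handy for demonstrations without any external infrastructure.
events = spark.readStream.format("rate").option("rowsPerSecond", 5).load()

# The same DataFrame/SQL operators used for batch queries work on the stream.
counts = events.groupBy(window("timestamp", "10 seconds")).count()

query = (
    counts.writeStream
          .outputMode("complete")
          .format("console")          # print each micro-batch to stdout
          .start()
)
query.awaitTermination(30)            # let it run for roughly 30 seconds
query.stop()
```
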

It supports in-memory processing to boost the performance of big data analytics applications, and it works with other Azure data stores such as Azure SQL Data Warehouse, Azure Cosmos DB, Azure Data Lake Store, Azure Blob storage, and Azure Event Hubs.
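
For example, reading files that live in Azure Blob storage from a Databricks notebook can look roughly like the sketch below; the storage account, container, secret scope and path are all hypothetical placeholders, and `spark` and `dbutils` are the objects that Databricks provides in every notebook.

```python
# Hedged sketch: reading CSV files from Azure Blob storage inside an Azure
# Databricks notebook. Account, container, secret scope and path are placeholders.
storage_account = "mystorageaccount"
container = "raw-data"

# The account key is kept in a Databricks secret scope rather than in the notebook.
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.blob.core.windows.net",
    dbutils.secrets.get(scope="my-secret-scope", key="storage-account-key"),
)

sales_df = (
    spark.read
         .option("header", "true")
         .option("inferSchema", "true")
         .csv(f"wasbs://{container}@{storage_account}.blob.core.windows.net/sales/*.csv")
)

sales_df.show(5)   # quick sanity check of the first few rows
```
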

What is so special about Apache Spark, anyway?

For the enterprise and data architects, it can give you the opportunity to have everything in one place: streaming, ML libraries, sophisticated analytics, and data visualization. It means that you can consolidate under one technological umbrella, while keeping your data in other data sources such as Azure SQL Data Warehouse, Azure Cosmos DB, Azure Data Lake Store, Azure Blob storage, and Azure Event Hubs.

As an architect, I aim to reduce points of failure and points of complexity, so it is the neatness of the final streamlined technology solution that is appealing.

It is also fast, and people want their data fast. Spark enables applications in Hadoop clusters to run up to 100x faster in memory, and 10x faster even when running on disk. Spark makes this possible by reducing the number of reads and writes to disk: it stores intermediate processing data in memory, using the concept of the Resilient Distributed Dataset (RDD), which allows it to keep data in memory transparently and persist it to disk only when it’s needed. This removes most of the disk reads and writes, which are the main time-consuming factors in data processing.
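
Here is a small sketch of that caching behaviour, with illustrative names only: the intermediate result is kept in memory (spilling to disk if it does not fit), so repeated actions do not recompute it.

```python
from pyspark.sql import SparkSession
from pyspark import StorageLevel

spark = SparkSession.builder.appName("caching-demo").getOrCreate()

numbers = spark.sparkContext.parallelize(range(1_000_000))   # an RDD
squares = numbers.map(lambda n: n * n)                       # lazy transformation

# Keep the intermediate result in memory, spilling to disk only if it does not fit,
# so that repeated actions do not recompute it from scratch.
squares.persist(StorageLevel.MEMORY_AND_DISK)

print(squares.sum())    # first action: computes the RDD and caches the partitions
print(squares.count())  # second action: served from the cached partitions

squares.unpersist()     # release the cached data when it is no longer needed
```
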


Data Visualization for the Business User with Azure Databricks as a basis

Azure Databricks brings documents for data engineering and data science that multiple people can edit in real time. It also enables dashboards with Power BI for accurate, efficient and accessible data visualization across the business.

Azure Databricks is backed by Azure Database and other technologies that enable highly concurrent access, fast performance and geo-replication, along with Azure security mechanisms.


Implementing enterprise-wide transformation with data science, artificial intelligence or deep learning is a business-wide transformation activity. In this post, there is the suggestion that you can try Azure Databricks as a great starting point for your cloud and data science journey, with some advice on getting a good grounding before you start.