Data Preparation in AzureML – Where and how?

One question that keeps popping up in my customer AzureML projects is ‘How do I conduct data preparation on my data?’ For example, how can we join the data, clean it, and shape it so that it is ready for analytics? Messy data is a problem for every organisation. If you don’t think it is an issue for your organisation, perhaps you haven’t looked hard enough.

To answer the question properly, we need to stand back a little, and see the problem as a part of a larger technology canvas. From the enterprise architecture perspective, it is best to do data preparation as close to the source as possible. The reason for this is that the cleaned data then acts as a good, consistent source for other systems, and you only have to do the work once. You have cleaned data that you can re-use, rather than re-do for every place where you need to use the data.

Let’s say you have a data source, and you want to expose the data in different technologies, such as Power BI, Excel and Tableau. Many organisations have a ‘cottage industry’ style of enterprise architecture, where they have different departments using different technologies. It is difficult to align data and analytics across the business, since the interpretation of the data may be implemented in a manner that is technology-specific rather than business-focused. If you take a ‘cottage industry’ approach, you would have to repeat your data preparation steps across different technologies.

When we come to AzureML, the data preparation perspective isn’t forgotten, but AzureML isn’t a strong data preparation tool like Paxata or Datameer, for example. It’s the democratization of data for the masses, yes, and I see the value it brings to businesses. It’s meant for machine learning and data science, so you should expect to use it for those purposes. It’s not a standalone data preparation tool, although it does help you partway.

The data preparation facilities in AzureML can be found here. If you have to clean up the data in AzureML, my futurology ‘dream’ scenario is that Microsoft would offer heavyweight data preparation as a task, like other tasks in AzureML. You could click on the task, and then have roll-your-own data preparation pop up in the browser (all browser based) provided by Microsoft, or perhaps have Paxata or Datameer pop out as a service, hosted in Azure as part of your Azure portal services. Then, you would go back to AzureML, all in the browser. In the meantime, you would be better off following the principle of cleaning the data up close to the source.

Don’t be downhearted if AzureML isn’t giving you the data preparation that you need. Look back to the underlying data, and see what you can do. The answer might be as simple as writing a view in SQL Server. AzureML is for operations and machine learning further downstream. If you are having serious data preparation issues, then perhaps you are not ready for the modelling phase of CRISP-DM, so you may want to take some time to think about those issues.
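To make the ‘view close to the source’ idea a little more concrete, here is a minimal sketch. It uses Python’s built-in sqlite3 module purely as a stand-in for SQL Server, and the tables (customers, orders), their columns and the view name clean_orders are all invented for the illustration; your own schema and cleaning rules will differ.

```python
import sqlite3


def build_clean_view(conn: sqlite3.Connection) -> None:
    """Create hypothetical messy tables and one view that joins and cleans them."""
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount TEXT);

        -- Deliberately messy sample rows: stray spaces, mixed case, NULLs.
        INSERT INTO customers VALUES (1, '  Alice ', 'uk'), (2, 'Bob', 'UK'), (3, 'Carol', NULL);
        INSERT INTO orders VALUES (10, 1, '12.50'), (11, 2, NULL), (12, 3, '7'), (13, 1, ' 4.5 ');

        -- The view is the single, reusable definition of 'clean'.
        CREATE VIEW clean_orders AS
        SELECT o.order_id,
               TRIM(c.name) AS customer_name,
               UPPER(COALESCE(c.country, 'UNKNOWN')) AS country,
               CAST(TRIM(o.amount) AS REAL) AS amount
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        WHERE o.amount IS NOT NULL;
        """
    )


def fetch_clean_orders(conn: sqlite3.Connection) -> list:
    """Read from the view, as any downstream tool would."""
    query = (
        "SELECT order_id, customer_name, country, amount "
        "FROM clean_orders ORDER BY order_id"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_clean_view(connection)
    for row in fetch_clean_orders(connection):
        print(row)
```

The point of the sketch is that the joining, trimming and type conversion live in one place. Anything downstream, whether that is AzureML, Power BI, Excel or Tableau, can read from the view instead of repeating the same steps in its own way.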

Keeping the golf score card after 20 years in IT; reflections on International Women’s Day

On International Women’s Day, I think about my journey and how I got here today. Other women may have similar experiences. Unlike Jenni Murray, I believe that you can be trans, proud and a real woman. Just saying.

Over the years, I have had challenges as a female IT consultant. Here are a few choice examples taken from my 20 years working in IT:

– a business contact once rubbed his hand up my leg when he thought I was asleep on a plane next to him. I jumped out of my skin. I realise now that he was trying to figure out if I wore tights or suspenders, and he was looking for the ‘line’. The skinny; I wear black 40 denier tights because I have varicose veins on my legs like a roadmap. They are comfortable and I like them. Oh, and don’t wake me up when I’m sleeping… although I know that he didn’t mean to…..
– Whilst at a conference, some business contacts kept a golf score card and challenged other colleagues to try to get me into bed, using the card to keep track of points when someone spoke to me, or double points if I accepted a drink (for example). I found the golf score card with the names and scores on it when one of them dropped it on the floor. I was extremely humiliated, and realised why folks had been so friendly and welcoming to me. They grabbed the card back, but I wish I still had it – to remind me.
– I’ve had my work actively sabotaged by someone who told my boss that he could not fathom the idea of senior female tech lead and genuinely believed I got the role for being female and to tick boxes;
– I’ve been told to my face that I am ‘not close enough to the kitchen sink’. Unfortunately for them, I was made their technical lead one month later on merit, and they had to put my presence in their pipe and smoke it. I was gracious about it, since I needed them to deliver well for me and the results would prove my worth. They delivered well, and I delivered the whole project on time, on budget, and to spec.
– I’ve had my email mailbox deleted on one site because I was the only female out of 200 plus men, and I ‘destroyed the all male equilibrium’ of the IT department. That was escalated to C-level, and nobody spoke to me after that because I’d complained. You can’t win, can you? I needed to deliver the project, and needed email. I did deliver, on time, on budget, to spec with a kindly Project Manager forwarding me emails to another account so I had everything I needed, and ensured I wasn’t cut off email trails.
– I am usually the victim of someone discrediting me as being too ‘emotional’ and/or ‘not technical’.
– I’ve had men refuse to share an office with me in case I am ‘unclean’
– Discussions of female sanitary items; do I prefer ‘wings’ or not?

It’s the small things; for example, not responding to your email on a thread, but to the second-last email on a thread so that your contribution is cut out. Yes, I see you… but so does everyone else. Not clever and easily provable.
Sometimes it is not overt; it can simply be that I’m mansplained, or interrupted constantly. It’s a case of people simply never being capable of believing that women can do anything technical, and they will glibly reconcile it in other ways, e.g. I am a ‘statistical oddity’. I like that, actually.

This doesn’t include the hugs where the hands just go a little too low, or the colleague who leans towards me a little too close, or who looks at a part of you for a little too long. You don’t have to be attractive or pretty to experience that.

Here are some takeaway actions for you:

Shout louder to get your voice heard. Your voice is a good one. If people are tone deaf at the start, you haven’t lost anything anyway!
Throw your light out farther, and help others do the same. All of the setbacks have made me simply want to throw my light out farther. So, this blog post here was the result of a meeting that day, where I was being discredited subtly.
Be helpful; you’ll be nicer to work beside, you’ll get more projects and more ‘wins’ in the long term. In the situations above, there was usually someone good enough to help. Be that person.

Be communicative; be the person who forwards email trails to anyone who has been actively cut out of it (male or female!)

Be considerate: the person who considers promoting the quiet female on the team. She’s probably good, you know.

Anywhere can be Trump’s Locker Room. Trump’s talk of pussy-grabbing and locker room? He was wrong to say that, and wrong to say it was ‘locker room’. It can be anywhere. Offices, work parties, conferences, anywhere. Be the person who steps in and takes away the ‘golf score card’ when it’s being used to keep track of who can get her into bed. Tell them to ‘grow up’. That’s not just locker room talk; that’s everywhere. Be the person who helps to stop it before it starts.

If someone is being victim-blamed as ‘emotional’, the accusation is usually treated the same as a real scandal and people don’t question it. If someone is emotional, is that because they are being bullied, so it becomes a self-fulfilling prophecy? Watch out also for people pressing other people’s buttons, and getting others to do the footwork for them by isolating them. Take a look at these guidelines here and if you are unlucky, you’ll come across someone who does a lot of these behaviours; if so, stay away from them but try to limit their impact on others, too.
On the plus side, the people who let me play in their box are usually wonderful people; they accept me and value me for who I am, and what I bring to the table. I think that this is why my projects are successful; many people discriminate right at the start and I don’t get beyond the starting block so they get partialled out. So it’s only the nice people who see past the five feet two frame and look at how I can help them out.

For the people who do give me a chance, they bless me in all sorts of ways. These people are men and women, cis gendered and transgendered. I’d like to thank all of my customers for being genuine and wonderful.

To summarise, I have less of a ‘range’ to play in, and I have to fight more and longer to get heard, and I have to be ten times as good to get to the same place.

Be brilliant – try your best. Own what you do and love. Then, you’ll be brilliant all by yourself.

 

Upcoming Events

UK Azure Group, SQL Midlands edition

  • Thursday, February 9, 2017
  • 6:30pm to 7:30pm
  • Aston Manor Academy

 

To Register: https://www.eventbrite.co.uk/o/sql-midlands-6264475503

Implementing NHS Azure Hybrid Architectures

The NHS is undergoing a time of unprecedented change, as well as increasing financial pressure under a public microscope. In order to meet these challenging requirements, NHS South London and Maudsley is undergoing a Digital Transformation program which is fundamentally altering delivery of its healthcare services. The Digital Transformation is crucial to the success of the Trust, and it affects everything from the physical layer right up to self-service reporting. It is also an important balancing act between highly sensitive patient privacy and a world that expects data on-demand in mobile and external environments.

In this technical session, join us to learn from the expert team who architected, designed, and delivered the hybrid Azure cloud and SQL Server solution for NHS South London and Maudsley Trust. Learn about the technical constraints and challenges and how we overcame those challenges, particularly through a healthcare lens of highly-sensitive patient privacy issues in a world of data. You will also learn about the technical benefits that were gleaned from this hybrid implementation. In order to bring the achievements to life, you will see real-life insights into healthcare in a Power BI demo, in use by hospital team members.

Using Azure, SQL Server and Power BI means that the NHS is empowered to create enriched opportunities for research to improve patient outcomes, both now and in the future, as well as directly improved patient outcomes now. Join us for this technically-oriented session to see how Azure, SQL Server and Power BI joined forces to fundamentally deliver improved patient healthcare, research and insights in London.

To Register, visit http://www.sqlmidlands.com/events/48-9th-feb-2017-nhs-in-azure-data-factory-custom-activity.html

 

Power BI for the CEO

  • Thu, Apr 6, 2017 9:00am to Sun, Jun 4, 2017 12:59am
  • The International Centre

Date: Saturday 8th April, time to be determined

Location: The International Centre, St Quentin Gate, Telford, Shropshire, TF3 4JH

Register: http://sqlbits.com/information/registration.aspx

Digital Transformation is much more than just sticking a few Virtual Machines in the cloud; it is real, transformative, long-term change that benefits and impacts the whole organisation.
Digital Transformation is a hot topic with CEOs and the C-level suite, renewing their interest in data and what it can do to empower the organisation.
With the right metrics and data visualisation, Power BI can help to bring clarity and predictability to the CEO to make strategic decisions, understand how their customers behave, and measure what really matters to the organization. This session is aimed at helping you to please your CEO with insightful dashboards in Power BI that are relevant to the CxO in your organisation, or your customers’ organisations.
Using data visualisation principles in Power BI, we will demonstrate how you can help the CEO by giving her the metrics she needs to develop a guiding philosophy based on data-driven leadership. Join this session to get practical advice on how you can help drive your organisation’s short and long term future, using data and Power BI.
As an MBA student and external consultant who delivers solutions worldwide, Jen has experience in advising CEO and C-level executives in terms of strategic and technical direction.
Join this session to learn how to speak their language in order to meet their needs, and impress your CEO with proving it, using Power BI.

Data Visualisation Lies and How to Spot them Techorama

  • Mon, May 22, 2017 9:00am to Wed, May 24, 2017 5:00pm
  • Kinepolis

Register: Buy Tickets

 

During the acrimonious US election, both sides used a combination of cherry-picked polls and misleading data visualization to paint different pictures with data. In this session, we will use a range of Microsoft Power BI and SSRS technologies in order to examine how people can mislead with data and how to fix it. We will also look at best practices with data visualisation. We will examine the data with Microsoft SSRS and Power BI so that you can see the differences and similarities in these reporting tools when selecting your own Data Visualisation toolkit. Whether you are a Trump supporter, a Clinton supporter or you don’t really care, join this session to spot data lies better in order to make up your own mind.

Taming the Open Source Beast with Azure for Business Intelligence

  • Saturday, June 17, 2017
  • 9:00am to 5:00pm
  • Trinity College, College Green, Dublin 2,

Location: Trinity College, College Green, Dublin 2, Dublin, County Dublin, Dublin 2, Ireland

Register: https://www.sqlsaturday.com/620/registernow.aspx

Today, CIOs and other business decision-makers are increasingly recognizing the value of open source software and Azure cloud computing for the enterprise, as a way of driving down costs whilst delivering enterprise capabilities.

For the Business Intelligence professional, how can you introduce Open Source into the Enterprise in a robust way, whilst also creating an architecture that accommodates cloud, on-premise and hybrid architectures?

We will examine strategies for using open source technologies to improve existing common Business Intelligence issues, using Azure as our backdrop. These include:

– incorporating Apache projects, such as Apache Tika, for your BI solution
– using Redis Cache in Azure as an engine within your SSIS toolkit

Join this session to learn more about open source in Azure for Business Intelligence. Open Source does not mean on premise.
Demos will provide practical takeaways in your Business Intelligence Enterprise architecture.

IT Pros Roundup: Windows as a Service

Not sure what Windows as a service is? With Windows 10, Microsoft moved to deliver Windows as a service which introduces a new way for how it’s built, deployed and serviced.

Start by viewing this 5-minute video demo where Microsoft demystify the core components of the Windows as a service model. I’ve put the video here:

Terminology you should know

  • Feature updates add new features to Windows 10, delivered in an agile manner
  • Quality updates are released monthly and are cumulative.
  • Servicing branches allow organizations to choose when to deploy new features.
  • Deployment rings are groups of devices used to initially pilot, and then to broadly deploy, each feature update in an organization.

Then review this quick guide to the most important concepts and delve into detailed guidance to help you manage Windows 10 updates in your organization.

Microsoft Vendor process – some issues and how I resolved them

As you will know, I have been working extremely hard on the UK Power BI Summit. One thing to note: if you are thinking of setting up a community event, I would recommend that you engage with Microsoft as a vendor. It will put your conversations on a more equal footing. I am still not a bona fide vendor because some of the Microsoft offices still cannot ‘see’ me, and it will help you to have the vendor status.

I am not a vendor, and this became the equivalent of a ‘mute’ button when I was trying to engage people in my event: people couldn’t do anything to help, because I was not a vendor. So I was effectively ‘muted’. I wanted to try and do the best possible for my event, so I decided to go through the process to get past the ‘mute’ button that was pressing down on me.

Becoming a Microsoft vendor is not a quick or easy process and I want to help you, dear Reader, so it is easier for you than it is for me.  The reasons for the complexities in the process are as follows:

Becoming a Microsoft vendor means that you need a Microsoft account, which is fine. Now, there is a well-documented difference between a Microsoft account and an Office 365 account. There is a fantastic blog about it here. The problem occurs if you have a Microsoft account that has the same name as an Office 365 account. The vendor system will reject you if the two accounts share a name and are linked together.

When you go through the Microsoft vendor process with an email address that is identical for a Microsoft account and an Office 365 account, then it bounces you out, and your application for becoming a vendor is rejected. The whole Microsoft Office 365 / Microsoft Account issue, differences and distinctions are terribly confusing.

What do you do? You need to restart the vendor process again, but you need to use a Microsoft account that does not have an identical Microsoft Office365 account LINKED to it.

So here is what I did:

I attempted to go through the process by using an email address that was a Microsoft Office 365 account and a Microsoft account. For the purposes of illustration, let’s call it jen@jenstirrup.com which is the login for both Office365 and the Microsoft account.

After a week or so, I got an email to say that my application was rejected.

I spent another few weeks raising it as a support issue, and was eventually told that I was to restart the process, and to speak with the Microsoft sponsor to do so. Now, I didn’t want to restart the process without understanding WHY it was rejected, so I did my own research because I’m a clever girl like that, and I had a hunch that it was due to the login (no evidence, just a thought that I decided to follow up). Here is what I did:

I took my Microsoft Office 365 email address which I will mark in orange so it is less confusing!!! (jen@jenstirrup.com), and de-linked it from the Microsoft Account settings page for jen@jenstirrup.com – I will mark the Microsoft Account in blue.

I then tried to set up a separate Microsoft account with the same e-mail address jen@jenstirrup.com 

This attempt was refused, and I was told to wait 48 hours.

I waited 48 hours, and, after a few attempts at setting up a new Microsoft Account for jen@jenstirrup.com with the same name as the Microsoft Office 365 account (jen@jenstirrup.com), the new Microsoft Account jen@jenstirrup.com was set up. I will set this to be the colour green. There was no magic here; I just kept trying until it worked. This meant I had three accounts now:

  • jen@jenstirrup.com – my Office 365 account
  • jen@jenstirrup.com – the original Microsoft Account which then reverted to an outlook address
  • jen@jenstirrup.com is my new Microsoft account which I set up for the Vendor process, which wasn’t linked to anything.

With my new Microsoft account,  jen@jenstirrup.com, I could then go and restart the Microsoft vendor process again. The confusing thing is that they all have the same name and I have tried to clarify it by giving them different colours to show that they are different accounts, just called the same thing. 

After my new  jen@jenstirrup.com was set up, I then restarted the Microsoft Vendor process again, over at payment central.

Once again – the process failed.

During the Microsoft vendor process, the system asks for your IBAN number, and then it does a lookup to get your details. You will have an IBAN number, but your bank does not always display it for you. Therefore, make sure that you have it.

Unfortunately for me, when I was going through the application service, Microsoft’s IBAN lookup / verification process failed. This produced a ream of error messages about an unrecognised IBAN. I have been through lots of similar vendor processes for other large companies, and I knew for a fact that my IBAN number was correct and confirmed. I didn’t type it in wrong, in case you think I am that dumb 🙂
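If you want to rule out a simple typo before blaming the lookup service, you can check the IBAN’s built-in checksum yourself. The sketch below applies the standard mod-97 check that all IBANs carry. It is only an illustration: the function name is my own, it does not check country-specific lengths, it cannot tell you whether the account actually exists, and it is not a description of what Microsoft’s system does behind the scenes.

```python
def iban_checksum_ok(iban: str) -> bool:
    """Return True if the IBAN passes the standard mod-97 checksum."""
    # Remove spaces and normalise case, e.g. 'gb82 west ...' -> 'GB82WEST...'.
    compact = "".join(iban.split()).upper()

    # Needs a country code, two check digits and at least one more character,
    # and may only contain ASCII letters and digits.
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False

    # Move the country code and check digits to the end.
    rearranged = compact[4:] + compact[:4]

    # Replace letters with numbers: A=10, B=11, ... Z=35 (digits stay as they are).
    digits = "".join(str(int(ch, 36)) for ch in rearranged)

    # A well-formed IBAN leaves a remainder of 1 when divided by 97.
    return int(digits) % 97 == 1


if __name__ == "__main__":
    # A widely published example IBAN, not a real account.
    print(iban_checksum_ok("GB82 WEST 1234 5698 7654 32"))
```

If your own IBAN passes this check and the vendor system still rejects it, the problem is more likely to be on the lookup side than in your typing, which matches what I saw.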

I waited until the next day and tried again, and this time, the process seemed to work.

I then checked with the Microsoft vendor people, and I was told, over the course of a week, that my application hadn’t gone through, and they could not see me set up in the system. To be fair, I had not had a ‘Welcome’ email to indicate that I was a vendor yet.

I double checked the vendor system, which showed success – but nobody at the Microsoft vendor offices, in the US or in Dublin, could find me. Super confused!!

I logged back into the Microsoft payment system, and I noted that I now had a number next to my company name. Rather than ask  the team to search by name, I gave them the number. After another few days delay, someone in the US has come back to tell me that they can see me in their system, and they sent me a screenshot of my details.

I noted that the ‘search term’ for my company name was not the same as my actual company name; it was a contracted version. So, instead of ‘Data Relish’ it comes up as ‘DATA RE’. This would explain why the US and Dublin teams could not find me, and I was lucky to spot the number next to my name.

I have sent the screenshot to Microsoft UK who say that they can’t find me, and it may take another few days for the details to come through to them. So, although I still have a wait, I feel I am getting somewhere.

Lessons learned:

The Microsoft Office365 / Microsoft Account setup is confusing. It is affecting all sorts of systems, including the vendor account system.  I really hope it gets sorted.

I now have two Microsoft accounts: jen@jenstirrup.com (the original) has now reverted to an outlook.com address. I have my new shiny Microsoft account   jen@jenstirrup.com  which is completely on its own, and it is not connected to anything – it remains my vendor account address. My office365 address remains unchanged.

Although it’s too late for my current event, I think it is easier to engage as a vendor. I understand that other community bodies do this, and I was not on the same footing as them. I am not sure if I will ever arrange another event again but at least I will have this vendor footing now.

Recommendations:

  • If you are going through the Microsoft vendor process, set up a new Microsoft account dedicated to the vendor process and being a vendor.
  • I should have set up an account such as jen_vendor@jenstirrup.com and this would have been ok.
  • Be prepared for unexpected issues, like the IBAN problem. This means you should get your application done as early as possible.
  • Be patient. Lots of big systems, and some Azure AD issues somewhere.
  • Keep asking for updates. Everyone else is probably confused, too!

I hope that helps and that you, dear Reader, have a successful experience in the vendor setup process. I’m blogging this in order to save you from some of the issues that I had.

Data Acumen – Analytics Literacy in a Data-Driven world

I am sharing this blog by Richard Lee. The video of his interview is below. I agree that data literacy has to be part of any leadership conversation.

We talk about data in terms of technology, and this is usually how customers approach me. I agree with Richard’s insight that it has to be recast as a business problem solving venture with an emphasis on data acumen as well as business acumen.

In this era of ‘alternate facts’ and ‘post-truth’, data acumen will become a necessary part of business acumen.

Enough of my thoughts; I will let Richard speak, and you can read his insightful blog post below.

Preface: I did not write a formal posting on the Data for Policy confab this past September, but wanted to at least share the materials that I presented and discussed during the conference. Abstract: The notion of Data-driven Policy making and its associated Governance, is often challenged by the fact that the vast majority of Politicians, […]

via “You can’t have Data-driven Policy if your Leaders are Analytics Illiterate”, infomgmtexec

Why UK Power BI Summit? Derive business value from your data

I’ve created UK Power BI Summit in response to an industry need for Power BI to have its own event, and I hope to produce a repeatable model for other Power BI groups globally. I am working with Microsoft in Redmond at the moment, in the hope that I can spread the word globally about the power of enabling businesses through data, via easily-accessible tools.

What’s the rationale? Personally, the next step in my career is to continue my trajectory from the data center towards boardroom level leadership and consultancy, in order to help organisations become 21st century, data-driven organisations. Data is at the foundation of businesses. Data, in turn, leads to insights and better decisions that improve the business. Ideally, businesses should have data as part of their DNA. This does not mean that there is not a place for context or for ‘gut instinct’. Data gives businesses new insights, and, in turn, it gives them new options.

My favourite bookshop in the world: The Strand Bookstore, New York, on the corner of Broadway and E 12th Street.

With my business and technical skills in mind, I am doing my MBA at this stage in my career to focus on building businesses as data-driven organisations. The MBA will help me to combine my technical and business expertise within an established framework that will help me to be more effective in a leadership role. I believe that the MBA will help me to articulate and achieve a strategic viewpoint, which, in turn, will help businesses to use their data more effectively.

I am not alone in this data-driven journey. My industry experience tells me that many organisations suffer from one thing: hype about the possibilities and opportunities in data, and, particularly Big Data, but they don’t know how to get started in terms of technology, people, and enabling business processes that would consume these services.

Organisations can find it difficult to know where to start, or even how to start. Very often, businesses simply store all of their data, rather than think proactively about the data that they have, and how they could use it. As businesses continue to get excited about the opportunities of Big Data, they will also need Data Thought Leadership in order to guide them effectively towards success.

Digital Transformation is a much bandied about term. It isn’t simply whacking a few Virtual Machines in Azure, moving data to the cloud and – yay – digital transformation. It’s about transforming the business through the use of technology, and it has the business at the front-and-center of the activity.

Now is the time for businesses to bring their data and their strategy together, using the latest technologies – but they can’t do that until they see their data. This is where Power BI comes in.

The Power BI event is aimed at those people in the organisation who are aware of business needs, user needs, and have winning ideas and who are willing to learn about user-oriented technology to make that happen. The event is aimed at helping these people to learn about the technology from beginner to advanced, according to their needs.

Although the event is about technology, it’s also about the business, and deriving business value from your data. It’s not a straightforward technology event. It’s about the business as well as the technology, and how it’s used. It’s about bringing you along the journey, further.

I thought that the differences between UK Power BI Summit and other events, such as PASS SQLSaturday events and SQLBits, were fairly clear, but it would seem from my email traffic that my assumption wasn’t correct.

Just to be clear:

  • I am not part of the SQLBits committee and I have nothing to do with their leadership. I don’t represent them and I’m not featured on their promotional video. I’ve been speaking there since SQLBits 7 through to SQLBits XV. You can look for my SQLBits 7 – 15 sessions here.
  • I am part of PASS and a non executive Director, and I sit on the PASS Board as an elected Director. I don’t represent PASS here. If you want a PASS-validated blog, then please head over to their site. This isn’t a PASS event.

Let’s look at the SQLBits mission statement, taken from their site:

SQL Bits was started by a group of individuals that are passionate about the SQL Server product suite. There is a breadth of knowledge in the SQL Community that will benefit everyone in the community. We want to spread that knowledge. We all work with the SQL community, some of us for many years and have all been given the MVP award by Microsoft.

Let’s look at the PASS Mission Statement, taken from their site:

PASS is an independent, not-for-profit organization run by and for the community. With a growing membership of more than 100K, PASS supports data professionals throughout the world who use the Microsoft data platform.

PASS strives to fulfill its mission by:

  • Facilitating member networking and the exchange of information through our local and virtual chapters, online events, local and regional events, and international conferences
  • Delivering high-quality, timely, technical content for in-depth learning and professional development

PASS was co-founded by CA Technologies and Microsoft Corporation in 1999 to promote and educate SQL Server users around the world. Since its founding, PASS has expanded globally and diversified its membership to embrace professionals using any Microsoft data technology.

So, the UK Power BI Summit is ultimately looking at using Power BI to transform businesses, through expertise in the technology, embedded in business-oriented discussions. The technology should support the business in its mission to adapt to the new world of data.

If you’d like to register, click below:

Eventbrite - UK Power BI Summit