Fun DataDive with DataKind UK

This weekend, I volunteered with DataKind UK on their Summer DataDive, which took place on the weekend of 28th and 29th July 2018 in the Pivotal London offices in Shoreditch. I had a fantastic, memorable weekend, mixing with around 200 other data scientists.

I’d like to thank the DataKind team for being so inspirational, giving, and kind with their time and skills. I’d like to emphasise my absolute admiration for the Data Ambassadors and the work that they do to lift everyone up.

Why did I do this? DataKind appealed to me since it meant that I could sharpen my data science skills by pitching in with experts. New learners to Data Science are welcome, and there were also newbies who had some experience of data and wanted to know more. There was room for everyone to contribute, so if you are a newbie, it would be a great way to join in the conversations and learn from experts who love what they can achieve with data. Plus, it’s a great opportunity to mix with real data scientists. This isn’t Poundland data science, and this is not pseudo Data Science. This is the real thing, and I spent two days immersed in real problems using Data Science as a solution. I learned a lot, and I contributed as well. There is a saying that you are the average of your friends, and I needed to get close to more Data Scientists so that I could build on my earlier experience in AI and bring it up-to-date.

I wanted to help a charity, by dedicating my time and skills, to support women and girls who need it. I understand that there are vulnerable men too; but this isn’t about whataboutism. Women and girls are disproportionately affected by issues such as domestic violence and being the victims of sexual crimes, and I wanted to do something practical to help.

For my specific contribution, I was working with a team of 25 other data scientists, and we worked on finding insights in data belonging to Lancashire Women’s Centre. The vision of Lancashire Women’s Centre is that all women and girls in Lancashire are valued and treated as equals. Their aim is to empower women and girls to be able to transform their lives by bringing them together to find their voice, share experiences and understanding, develop their knowledge and skills, and challenge stereotypes and misconceptions about them so that they can have choices in becoming the individuals they want to be. I share this conviction deeply and I wanted to help.

You may well be thinking that the charity helps a small number of women, but that’s not the case at all. They have a real impact in their community. The Lancashire Women’s Centre has helped over 3000 women in the last year. This includes 5807 hours of therapeutic support, accessed by 1154 women and 78 men. Following therapy: 25% were no longer taking medication, 8% felt the support had helped them find and keep a job, and 12% continued to access LWC services to support their recovery.

So what did I do? I can’t share specific details because the data is confidential, and it obviously impacts some of the UK’s most vulnerable women and girls. I will say that the tools used to work with the data were CoCalc, R, Python, Excel, Tableau and Power BI.

DataKind™ brings high-impact organizations dedicated to solving the world’s biggest challenges together with leading data scientists to improve the quality of, access to and understanding of data in the social sector. This leads to better decision-making and greater social impact. Launched in 2011, DataKind leads a community of passionate data scientists, visionary partners and mission-driven organizations with the talent, commitment and energy to use data science in the service of humanity. DataKind is headquartered in New York City and has Chapters in Bangalore, Dublin, San Francisco, Singapore, the UK and Washington DC. More information on DataKind, its programs and its partners can be found on their website: www.datakind.org

Lancashire Women’s Centre

DataKind JenStirrup and Team

I’m the one on the right, wearing orange!

I’m looking forward to the next one!

Tableau Prep, Power Query and Power BI – Good together?

A question I often hear is this: Which tool should I use, Tableau or Power BI? The truth is: They are not mutually exclusive.

Tableau is great at business mysteries: ill-defined questions where you have to surf the data for results. Power BI is particularly great at modelling and cleaning the data, with clean, crisp data visualisation and the ability to use custom and open-source data visualizations. This blog isn’t aimed at the technical user, but at the analyst who needs to get information out quickly. I will do another post, aimed at the geeks, another time.

Tableau and Power BI are paintbrushes for your data. The tools do not have to be mutually exclusive. Power BI contains some superb data preparation functions which are aimed at business users. Speaking to customers, however, it’s clear that they aren’t aware of its functionalities. So, I decided to make your life easier for you by helping you to compare the two, using the same data with the same result.

In the first video, we will look at Tableau Prep in some detail. We will use one of the Tableau datasets, Superstore, and we will work through one of Tableau’s own tutorials.
In the next segment, I repeat the exercise using Power BI and Power Query so that you can compare more easily. Ultimately, both tools achieve the same ends.

Where Tableau Prep falls down, in my opinion, is that it does not handle complex pivots very well. In the World Data Bank data, the Life Expectancy data which was made famous by Hans Rosling is available, and this data needs to be pivoted in order to be visualized effectively. Tableau needs a lot of branching pivots to get it to work. Power BI, on the other hand, pivots it within a few clicks. What do we take away from this?

  • Tableau Prep is great for simple data preparation tasks
  • Power Query, also known as Get and Transform in Excel, is great at simple and much more complex and difficult data preparation tasks.
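To make the pivoting step more concrete, here is a minimal Python sketch of the kind of reshaping the Life Expectancy data needs: turning one column per year into one row per country and year (Power Query calls this unpivoting columns). This is not how either tool does it internally; the column names, the `unpivot` helper and the sample figures are all made up for illustration.

```python
# Hypothetical "wide" sample: one row per country, one column per year.
# The figures are illustrative placeholders, not real life expectancy values.
wide_rows = [
    {"Country": "Country A", "1960": 50.0, "1961": 50.5, "1962": 51.0},
    {"Country": "Country B", "1960": 60.0, "1961": 60.4, "1962": None},
]


def unpivot(rows, id_column, key_name="Year", value_name="LifeExpectancy"):
    """Turn one column per year into one row per (country, year) pair."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            # Skip the identifier column itself and any empty cells.
            if column == id_column or value is None:
                continue
            long_rows.append(
                {id_column: row[id_column], key_name: int(column), value_name: value}
            )
    return long_rows


long_rows = unpivot(wide_rows, "Country")
for long_row in long_rows:
    print(long_row)
```

The long layout is what a charting tool usually wants: year becomes a single field you can put on an axis, rather than dozens of separate columns.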

So, which tool you use will depend on the data prep that you need to do. If it is easy, use Tableau Prep. If it is anything beyond easy, or a mix of easy and harder tasks, use Power Query.

You can use Power BI to shape the data, and then use the data in Tableau. See? You don’t have to choose. Select the best paintbrush for what you need to do, and you are not restricted to one paintbrush.

To see the videos, go here:

Tableau Prep: An Overview

 

Power Query and Power BI Together for Tableau Users:

 

 

How do you choose the right data visualisation in Power BI to show your data?

How do you choose the right visualisation to show your data? Usually the customer wants one thing, the business user wants something else, the business sponsor wants something flashy… and it’s hard to tease out the requirements, and that’s before you’ve even opened up a Power BI tool such as Power View, or Excel, Tableau or whatever your preferred data visualisation software is.

In other words, there are simply too many charts to choose from, and too many requirements to meet. Where do you start?

I found this fantastic diagram which can help you to choose the right visualisation. I’m often surprised to see that people haven’t seen this before. Note: this diagram was done by Andrew Abela of Extreme Presentation and the source is here and his email address is on the slide, so be sure to thank him if you’ve found it useful. If you can’t see it very well, click here to go to the source.

[Image: Andrew Abela’s chart chooser diagram]

Chart Choosers should not replace common sense, however, and Naomi Robbins has written a nice piece here which is aimed at the wary. That said, diagrams like Abela’s can really help a novice to get started, and for that, I’d like to thank him for his work.

How does it relate to Microsoft’s Power BI? If you look at the visualisations that are available in Power View, you can see that most of the visualisations in the diagram are available in Power BI. The ones that are excluded are the 3D graphs, circular area charts, variable width charts, and the waterfall chart.

Why no 3D? I personally hope that Microsoft will leave 3D out of Power BI tools, unless of course it is in Power Map. With 3D on a chart, it is harder to identify the endpoints, and it can take us longer to read the values. It might also mean that points are occluded. If you’re interested and want to see examples, here is one by the Consultant Journal team or you can go ahead and read Stephen Few’s work. If you haven’t read anything by Stephen Few, get yourself over to his site right now. You won’t regret it. Why is Power Map different? 3D maps provide context, and they are the exception where I will use 3D for a data visualisation showing business data. I’m obviously excluding other types of non-business data here, such as medical imaging and so on.

Why no circular area or variable width charts? I am not a fan of variable width or circular area charts because we aren’t very good at evaluating area when we look at charts and graphs, and Robert Kosara has an old-but-good post on this topic here.

This blog is mainly for me to remember stuff but I hope it helps someone out there too.

Best Wishes,
Jen

Data Visualisation: lifting the curse of Cassandra

Information is the new currency, the lifeblood of organisations. However, it has to be explored, and evangelised throughout the organisation before it can have any real impact. SQL Server 2012 now helps business users to access the data; a real paradigm shift in the ‘umbrella’ of users who touch SQL Server.
However, does that mean that the users will be believed? The ‘messenger’ of the information can have a great influence on how the information is – or is not – propagated throughout the organisation. Sometimes, people are simply not believed, or their ideas are not entertained. This may be due to the way that they put the message across, or simply due to the fact that they can’t get the message to the right people without upsetting the apple cart.
This is about the person (or group) in the organisation, who might meet one of these criteria:
  • see a ‘train crash’ going to happen in the organisation, but can’t shout loudly enough to avert it
  • see room for improvement in the business, but find it hard to get their message across
  • have a ‘gut feel’ about what customers are telling the enterprise, but find it hard to prove, demonstrate or research this ‘gut feel’

This leads to the Cassandra Complex.  Quick history lesson: according to Greek mythology, Cassandra was the beautiful daughter of King Priam and Queen Hecuba of Troy. She refused the advances of Apollo, who set a curse on her: that she would always tell the truth, but never be believed.
There are parallels with this mythological figure in the workplace, which may engender your sympathy or empathy. You might see this in yourself or in someone else. Do you see the ‘train crash’ in the organisation before it happens, but can’t get the message across? Do you see patterns in the data, and find it hard to evangelise your findings throughout the organisation?
If so, you could be the ‘Cassandra’ in your organisation, or in a customer or associated company, for example. It is tremendously frustrating to see issues in the organisation, but not get the message across. So, if you see someone banging their head against a wall, trying to show problems before they take hold: they do this because they care, but perhaps that isn’t the best way to get the message across.
A better way to get the message across is to research, demonstrate and uncover the findings in the data using data visualisation technology such as Power View. It is the new part of SQL Server 2012 which allows users to touch their data; it isn’t just about techies any more. It also helps business users to test out their theory by allowing them to explore their findings properly, before publicising them. By showing the ‘truth’ of the data, hopefully this would cure the curse of Cassandra: to be heard and also to be believed. Visualising data can bring insights, and attention, into data that can show where the problems reside in the organisation.
Data Visualisation can help to make Cassandra speak – and be believed. 
Sometimes people need to ‘see’ the problem before they understand it. Data Visualisation makes the insights accessible. It’s harder to ignore Cassandra if the message is shown inescapably to all, particularly when it’s right in our favourite Office tools such as PowerPoint or Excel.
Making data insights accessible means that data visualisations are used to make analysing data simple, assuming the data is properly organized, cleansed and sanitised. The beauty of these solutions is that it’s fast to get results, and easy to show them off. If you’re interested in looking at data visualisation technology, then the Gartner Report is a good place to start. I tend to think that the technology should support the business requirements, within the budget set by the organisation for purchasing software. That’s why it’s difficult to recommend one over the other, since the answer is usually ‘it depends’…
Whether the organisation acts on the insights is, of course, never guaranteed. As always, life isn’t that simple, but it’s a new angle that might help push problems forward and turn them into solutions.

Tabular models and Tableau

I was recently asked how to connect Tableau to a Microsoft Tabular model. The concept itself is straightforward in Tableau. In my opinion, tabular models will become more prevalent, so I will start to look at them in more detail.

It turns out that the individual who questioned me was struggling, unfortunately. The resolution was that he wasn’t including the instance name as part of the connection.
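As a small illustration of that fix: a named Analysis Services instance is addressed as the host name, a backslash, and then the instance name, whereas a default instance is addressed by host name alone. The sketch below only builds that server string; the `analysis_services_server` helper and the `BISERVER` and `TABULAR` names are hypothetical, and it is not Tableau’s own API.

```python
def analysis_services_server(host, instance=None):
    """Build the server name to type into a connection dialog.

    Hypothetical helper for illustration only.
    """
    if not host:
        raise ValueError("host is required")
    # Default instance: host name alone.
    # Named instance: host name, a backslash, then the instance name.
    return f"{host}\\{instance}" if instance else host


# Hypothetical server and instance names.
print(analysis_services_server("BISERVER"))
print(analysis_services_server("BISERVER", "TABULAR"))
```

If the tabular model lives on a named instance and only the host name is supplied, the connection will be attempted against the default instance instead, which is the kind of problem described above.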

In order to help business users to connect Tableau to the tabular model, I have included a brief video on how to achieve connectivity between Tableau and the Tabular model.

The video does include my Scottish accent, so please feel free to turn down the volume!

The link to the video is here:

I’m blogging this on my iPad so please excuse that I haven’t inserted the video itself! I will do this when I am back online properly. I am on holiday just now, but thought it worthwhile just to get the information out to help the individual ASAP

Update 8th April: video inserted 🙂 enjoy

-Jen

SQLUniversity: PowerPivot, Tableau and Jedi Knights

This blog will show an overview of how I mobilised PowerPivot using Tableau. I’ve previously given this session at SQLBits, NEBytes Microsoft Technology User Group, and SQLServerDays in Belgium but thought it would also be useful to supply the files for you. The steps are very simple since I intended to show the end-to-end solution simply as a proof of concept, as follows:

  • creation of a PowerPivot which mashed up UK Census data with geographical data
  • creation of the report in Tableau
  • deployment to Tableau Public for consumption by mobile devices such as the iPad
The example was deliberately kept simple in order to prove the concept of PowerPivot being mobilised. 
The data sample involved mashing up two sources:
  • Jedi Knight census data, which can be downloaded from here. This is a basic file, but the final PowerPivot can be downloaded from a link later on in this article
  • Geonames offer an excellent free download service, which you can access here
The Jedi Knight data, along with the geographical data, were joined using the outcode of the postcode data. If you need more definitions of the UK postcode system, I’ve previously blogged about this here.  
Essentially, a very simple RELATED formula was used in order to look up the latitude and longitude from the UKGeography table, and put it into the Jedi Knights data, and produce the necessary data in a simple Excel table. The formula looks like this:
=RELATED(UKGeography[Latitude])
=RELATED(UKGeography[Longitude])
Once these very simple formulas were put in place, it was time to load the data into Tableau.
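For readers who don’t have PowerPivot to hand, the RELATED lookup described above can be sketched in plain Python. This is only an approximation of the idea, a lookup from each Jedi Knights row to the matching geography row via the outcode; the table contents, postcodes, coordinates and the `outcode` helper are made-up placeholders, not the real data.

```python
# Hypothetical rows standing in for the two PowerPivot tables.
# Postcodes, coordinates and counts are made-up placeholders.
uk_geography = [
    {"Outcode": "XX1", "Latitude": 57.1, "Longitude": -2.2},
    {"Outcode": "ZZ9", "Latitude": 51.5, "Longitude": -0.1},
]
jedi_knights = [
    {"Postcode": "XX1 2YY", "JediCount": 12},
    {"Postcode": "zz9 9zz", "JediCount": 7},
    {"Postcode": "QQ1 1QQ", "JediCount": 3},  # no matching outcode
]


def outcode(postcode):
    """Return the outward part of a postcode (the part before the space)."""
    return postcode.strip().upper().split(" ")[0]


# Index the geography table by outcode: the "one" side of the relationship.
geo_by_outcode = {row["Outcode"]: row for row in uk_geography}

enriched = []
for row in jedi_knights:
    key = outcode(row["Postcode"])
    match = geo_by_outcode.get(key, {})
    enriched.append(
        {
            **row,
            "Outcode": key,
            "Latitude": match.get("Latitude"),    # None when nothing matches
            "Longitude": match.get("Longitude"),  # None when nothing matches
        }
    )

for row in enriched:
    print(row)
```

The dictionary keyed on outcode plays the role of the relationship between the two tables, and each looked-up value corresponds to one of the two calculated columns.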
Tableau can take both PowerPivot and Excel data – which driver to use?  I used version 6 of Tableau. Whilst this version of Tableau does see the PowerPivot correctly as an Analysis Services cube, it does not always read the date as a ‘date’ type, but instead as an attribute. There is a forum posting on the Tableau website which tells you how to fix this issue, which involves changing the date so it appears as a measure, which means it can then be used for trends and so on. 
However, I wasn’t comfortable with this solution because I like dates to be in date format. I’ve also run into this issue at a customer site, where the customer wanted to use SSAS as a source and Tableau as the presentation layer. They were data-warehouse savvy and didn’t like the ‘measures approach’ fix.
At the customer site, I got around it instead by using the Excel data source, and importing all of the PowerPivot columns into an Excel 2010 sheet. By doing it in this way, date formats were preserved. In this example, I didn’t have any date columns so it didn’t matter – but this is a useful tip for the future if you are using PowerPivot with Tableau. The final data, in an Excel PowerPivot, can be obtained in zip format here or if you can’t access it, please email me at jenstirrup [at] jenstirrup [dot] com.
Once the data was accessible by Tableau, I used the Tableau Desktop version to upload the data into Tableau’s memory. I did this so that I could eventually upload the Tableau workbook to Tableau Public. The instructions to save to Tableau Public are given here.
Once the data was in Tableau Public, I just needed to access the data using the Safari browser on the iPad. In case you are interested, the demos are publicly accessible and you can access the final result by clicking on the hyperlinks below.
I hope that’s been a useful overview of PowerPivot, and the ease with which it was mobilised. This blog forms a use case of how it might be useful to use PowerPivot, since I think that people sometimes need examples of how PowerPivot can benefit them. In this case, the clear benefit of PowerPivot is to provide an easy way of mashing up different data sources.
I look forward to your comments and thank you for sticking with me for the PowerPivot SQLUniversity discussions!

SQLUniversity: Introduction to PowerPivot and Mobile Business Intelligence

Here is the second post for SQL University, where we introduce the topic of mobilising PowerPivot data. As an introduction, I thought it might be useful to share the presentations that I did recently. Tomorrow’s blog will talk more about the details of mobilising PowerPivot data using Tableau, but I thought that these articles might provide some introductory material for people to use, and re-use, in order to expose the usefulness of PowerPivot as a data source. Here is a brief overview:

Essentially, the Ordnance Survey and Census data were mashed up together and linked together in PowerPivot. The PowerPivot was then used as a source to Tableau, which was used to visualise the data. As an introduction to the process, I generated some presentations to use when I was talking around this solution, and I have provided these presentations here.
Here is a very introductory presentation on PowerPivot in Sharepoint. This presentation was designed to be given in around 5 minutes before being supported by a series of demos, and the intended audience was people who hadn’t seen PowerPivot prior to the presentation.

The following presentation came from SQLBits, where I gave a discussion on mobilising business intelligence with PowerPivot. Tableau Software was used in order to display the data from the PowerPivot, and the next blog in the series will present more technical detail on how the solution was built.

Here is the presentation that I produced for SQL Server Days in Belgium in November 2011:

I hope that you enjoy these presentations, which I’ve provided for people to use and enjoy as they wish. Tomorrow’s SQLUniversity post will provide more detail on the mobilisation of PowerPivot with Tableau.