Be Insights Driven! Why we should not just be Data Driven, and definitely not Tool Driven. #PowerBI, #Tableau, #Qlik

I presented at a ‘BI without the BS’ event in London in November 2018. The idea was that there would be three tools (Qlik, Tableau and Power BI), three players, a judging panel, and a live audience of about 100 people. I represented Power BI, and it was very clear that only a few people had seen it before.

As a data visualization person, I’m all about making things clear for people. That was why my story was much simpler, with compelling facts displayed in Power BI; I told a story which had a call to action.

My interpretation of the rules

My interpretation of the rules was that data storytelling was one of the main judging criteria. I had not seen the event as a product demo, and that’s why I emphasized the insights that Power BI gave me over the knobs and levers that each tool offers. Product demos are easy, but not everyone can find insights in data.

Google Chief Economist Hal Varian predicted over a decade ago that data storytellers are the future. With data as the new oil, every company is seeking new ways to monetize their data. So my emphasis had been on data storytelling with a particular tool, but not trying to sell the tool itself as a demo. That’s my boxing ring to play in, and that’s the hard part of working with data.

An Equal Playing Ground for Games?

For my Power BI piece, my insights were all about the complexities in being a woman in gaming. My Power BI analysis showed that women earned just $1.8m in prize money in total, but the top male game winners earned $145m in total. See the tiny pink sliver? That’s the proportion that women earn in gaming competitions. If you’re thinking, that’s not a good dataviz to show, I can barely see the sliver – then you have missed the point. The point is, it is a tiny sliver. For colours, I have used Dark Orchid to represent female players, and Teal to represent male data throughout the visualization.

Female vs Male Prize Money in Games

So there was a huge difference. In case you need the numbers:

[Image: Power BI card showing the numbers]
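As a rough check on just how small the sliver is, the two totals quoted above can be turned into a share. This is only a sketch that uses the rounded figures from this post ($1.8m and $145m); the variable names are mine and are not taken from the Power BI report.

```python
# Rounded totals quoted in this post, in millions of US dollars.
female_total_m = 1.8
male_total_m = 145.0

combined_m = female_total_m + male_total_m
female_share = female_total_m / combined_m   # fraction of all prize money
ratio = male_total_m / female_total_m        # male dollars per female dollar

print(f"Female share of prize money: {female_share:.1%}")
print(f"Male total is about {ratio:.0f} times the female total")
```

With these rounded inputs the female share comes out at a little over one percent, which is why the sliver is so hard to see.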

So there is a huge disparity between male and female earnings. Is this because there are fewer women players? Apparently not. This chart shows that the gap between male and female players is narrowing:

Male vs Female Game Players by trend
In the data, I noted that the players were ranked from 1 to 100, for each gender. So I pitted the men and women together in terms of rank to show how much the first ranked male player earned versus the first ranked female player, and so on down the ranks. Here is the chart, in Power BI:

Prize Money by Gender

You can see that the girls’ earnings are practically flat, whereas the male earnings are vastly higher. The gender gap in pay is real, absolutely real.

I also noted that the lowest ranked male player still earned more than double what the highest earning female player did. So the data showed lots of insights; it depended on whether you were willing to see things that made you uncomfortable.
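The rank-against-rank comparison described above can be sketched in a few lines of Python. The prize figures below are invented purely for illustration (the real dataset ranks players 1 to 100 for each gender), and the dictionary names are hypothetical; this is not how the Power BI report was built.

```python
# Hypothetical prize money in US dollars, keyed by rank (1 = top earner).
# The real dataset ranks players 1 to 100 for each gender.
male_prize = {1: 3_000_000, 2: 2_500_000, 3: 2_000_000}
female_prize = {1: 200_000, 2: 120_000, 3: 90_000}

# Pair the players who share a rank and compare their earnings.
comparison = [
    (rank, male_prize[rank], female_prize[rank], male_prize[rank] / female_prize[rank])
    for rank in sorted(male_prize.keys() & female_prize.keys())
]

for rank, male, female, ratio in comparison:
    print(f"rank {rank}: male ${male:,} vs female ${female:,} ({ratio:.1f}x)")

# Does the lowest-earning man still out-earn the top woman by more than double?
lowest_male = min(male_prize.values())
highest_female = max(female_prize.values())
print(lowest_male > 2 * highest_female)
```

Pairing by rank keeps the comparison like for like: the best against the best, the second against the second, rather than comparing totals alone.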

Then, I wondered if there was a relationship between the number of tournaments played, and the prize money won. So, I used the male data here to see if a reasonable relationship could be inferred between the number of tournaments played, and the amount of prize money earned. If women are playing fewer tournaments, then naturally they will earn less. So how did that pan out?

Relationship between No of Tournaments and Game Prize Money

The interesting thing was that the total number of tournaments played (on the X axis) didn’t seem to impact the amount of prize money earned. I’d have to do more analysis: you can see a vague relationship in the hexbin chart, but with a lot of outliers. I might come back and look at that another day, using R and Python or something.
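One possible first step for that follow-up analysis is a simple correlation coefficient. The sketch below computes Pearson's r by hand with the standard library; the tournament counts and prize figures are made up for illustration, and the function name is my own. Pearson's r only measures a straight-line relationship and is sensitive to outliers, so it would be a starting point rather than a conclusion.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values: tournaments played and prize money (US dollars).
tournaments = [10, 25, 40, 60, 80, 120]
prize_usd = [900_000, 300_000, 1_500_000, 400_000, 2_000_000, 700_000]

r = pearson(tournaments, prize_usd)
print(f"Pearson r = {r:.2f}")
```

A value near zero would back up the impression from the hexbin chart; a value near 1 or -1 would suggest the chart is hiding a stronger linear pattern.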

Gaming in Real Life

So my piece was more about exploring the idea that there is no equal playing ground for women in gaming, and that’s certainly borne out by some women’s experiences in the gaming world. Harassment of women in gaming can involve sexist insults or comments, death or rape threats, demands for sexual favors in exchange for virtual or real money, or even stalking. The GamerGate scandal tells you all you need to know about it, I suppose. Alternatively, you could look at Fat, Ugly or Slutty, where women record instances of sexism. Warning: it is not a pretty read. Then there are the women who hide their gender online, which is a safety measure that many women take. The most recent threat against female gamer Anita Sarkeesian was in Logan, Utah on October 15, 2014. She was scheduled to deliver a speech on a Wednesday evening until an anonymous email message arrived a day before, stating that there would be the deadliest school shooting in American history if the event was held. So don’t kid yourself that this isn’t real; the impact is that many women feel excluded from entering competitions in gaming.

Be Insights Driven, not Data Driven, and definitely not Tool Driven

When you analyze data, you bring your own personality and insights to the analysis. I don’t believe that tools can solve problems; I believe that we have a lot of data, but no insight, information or wisdom unless we do something with the data. I don’t like the phrase data-driven; I prefer insights-driven. Qlik, Tableau or Power BI aren’t going to solve problems for you; they will just display data that hopefully brings about insights. The insights are yours and you can use each tool equally badly if you don’t have a story or a thread, or the data isn’t provoking an insight. We were all given the same data but we got very different results. That wasn’t down to the tool; it was down to the person driving the tool.

What I thought of the Tableau piece

I liked what Chris Love ( LinkedIn ¦ Twitter ) did; he clearly knows his stuff and it was nice to meet him in person. Funnily enough, we used to have the same boss when I had a boss (hello Tom Brown!).

I did find Chris Love’s visualization more interesting because he homed in on the journey of one player from the starting point to his success in winning a lot of money, and the journey was well displayed in Tableau. Chris had a good balance of context and detail, and for me, this was the data storytelling piece. Here is the image below, credit to Laura Sandford:

[Image: Chris Love’s Tableau visualization]

What I thought of the Qlik piece

Nick Blewden (LinkedIn, not on Twitter) is obviously fantastic at QlikView and he did a good job of showcasing it. To be honest, I felt out of my depth here, since it was a whizz tour of the product. As I’m not familiar with Qlik, I felt a bit bludgeoned with chart after chart and I couldn’t see a clear thread; it was information overload as I tried to pick up the Qlik lingo as well as follow the story. I understand that Nick’s segment wasn’t aimed at beginners in Qlik and that’s ok with me; he only had five minutes to showcase what he’d done and he did a great job.

I am not familiar with the Qlik product set but a lot of the audience clearly were, and I could hear lots of mutters about ‘good to see he’s showing that feature’. So my perspective here is that of someone who does not know Qlik but who has expertise in Tableau and Power BI.  I can look at Qlik another time, if I choose.

I felt I’d let Power BI down at that point because I had not gone down that route of doing a product demo and I feel really bad about that. I had gone for the analytics and insights part because I’d understood the rules that way, and the audience can see a Power BI demo anytime they like.

One reason for me to present at the event was that I’d seen it as an opportunity to learn more about Qlik from the session, but all I saw was chart after chart. For me, there were lots of business intelligence dashboards, and that’s fine; I think that it was a good product demo.

So my lasting takeaway from the Qlik segment is that the dashboard was interesting because it showed how quickly you can produce a lot of charts, but sometimes ‘less is more’. I’m a fan of Stephen Few and he talks about the importance of finding the signal in the noise, and having a ton of charts can simply mean more noise if they are not meaningful. Here is the image below, credit to Laura Sandford:

[Image: Nick Blewden’s Qlik dashboard]

What I’d like to see next

I think I’d have preferred a larger, more mixed audience. A lot of people seemed to know one another already and I only knew one person in the audience. I’m not part of that community and it was nice to meet new people at the end.

Honestly, I’m not a fan of being shouted at by men I don’t know; it is really unpleasant. I think that audience members should have the courtesy to refrain from shouting out during the performance. I was really put off by people shouting ‘Come on Qlik!’ and ‘Come on Tableau’ during the event. I didn’t hear a single voice for Power BI, not that it mattered; it really disrupted my train of thought to have people shouting while I was trying to analyse data, and I found it unsettling. Being at a live event isn’t like Gogglebox, where the presenters can’t hear you.

So what did I think about the Power BI vs Tableau vs Qlik debate?

So what were my takeaways?

My call to action: be Insights Driven, not Data Driven, and definitely not Tool Driven.

According to the Qlik website: ‘Deliver automated insight suggestions that help users see their data in new ways, auto-generating and prioritizing analytics and insights based on the overall data set and a user’s search criteria.’ Demystifying the marketing, it seems as if this means producing a ton of charts really quickly, and if that’s what you’re looking for, it certainly did that. My overriding thought was that it can produce lots of charts, but I really want to find meaning in charts, and I don’t measure meaning by having as many charts as possible. I just got lost and I actually don’t think that’s good for Qlik.

For Tableau, I’d like to see it become a real enterprise tool; it still feels like a cog in an enterprise wheel to me. I would not do any data prep in Tableau Prep although I do have experience in it; I’d want to use Power BI dataflows to clean data so that the data and the dataflows become part of the enterprise ecosystem.

I build big systems and I need to think big. When I’ve been working for customers, I’ve found it is easier to show ROI with Tableau and Power BI but it has taken longer for people to realize ROI with Qlik.

I am eternally confused by licensing and I find Tableau’s licensing simpler; Power BI and Qlik seem to be way more confusing to me. For Power BI, I always refer the customer back to their Microsoft reseller because they can figure it out for them.

My Power BI dashboard is here, for those of you who want to play with it:

https://app.powerbi.com/view?r=eyJrIjoiOTNmMzUyODAtZGFjZC00OTUxLWIxMmQtMDYzMTA5OWU1OGRkIiwidCI6ImFmMTA4OTMyLTkxNmQtNGUwNi1hZjVmLTAyMzg0NjZiZWRiMCIsImMiOjh9

Call to Action

I have put links, credits and sources here in case you want to play with the data.

Power BI Functionality

Colour Palette Used

Teal: #066082, #068, teal, hsl(196,91,26), rgb(6,96,130)
Orchid: #b12acf, #b3d, darkorchid, hsl(289,66,48), rgb(177,42,207)
Sandybrown: #fed044, #fd4, sandybrown, hsl(45,98,63), rgb(254,208,68)
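If you want to reuse the palette above in another tool, the hex codes can be converted to RGB and HSL with Python's standard library. This is a small sketch of the conversion, not part of the Power BI report; the helper names are my own, and the HSL values it prints may differ by a point from those listed because of rounding.

```python
import colorsys

def hex_to_rgb(hex_code):
    """Convert '#rrggbb' to an (r, g, b) tuple of 0-255 integers."""
    digits = hex_code.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

def rgb_to_hsl(r, g, b):
    """Convert 0-255 RGB to (hue in degrees, saturation %, lightness %)."""
    # colorsys works on 0-1 floats and returns hue, lightness, saturation.
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return h * 360, s * 100, l * 100

palette = [("Teal", "#066082"), ("Orchid", "#b12acf"), ("Sandybrown", "#fed044")]
for name, hex_code in palette:
    r, g, b = hex_to_rgb(hex_code)
    h, s, l = rgb_to_hsl(r, g, b)
    print(f"{name}: rgb({r},{g},{b}) hsl({h:.0f},{s:.0f},{l:.0f})")
```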

References

Distribution of computer and video gamers in the United States from 2006 to 2018, by gender. Source: Statista 
Why aren’t there more women in eSports?

 

Fun DataDive with DataKind UK

This weekend, I volunteered with DataKind UK on their Summer DataDive, which took place on the weekend of 28th and 29th July 2018 in the Pivotal London offices in Shoreditch. I had a fantastic, memorable weekend, mixing with around 200 other data scientists.

I’d like to thank the DataKind team for being so inspirational, giving, and kind with their time and skills. I’d like to emphasise my absolute admiration for the Data Ambassadors and the work that they do to lift everyone up.

Why did I do this? DataKind appealed to me since it meant that I could sharpen my data science skills by pitching in with experts. New learners to Data Science are welcome, and there were also newbies who had some experience of data and wanted to know more. There was room for everyone to contribute, so if you are a newbie, it would be a great way to join in the conversations and learn from experts who love what they can achieve with data. Plus, it’s a great opportunity to mix with real data scientists. This isn’t Poundland data science, and this is not pseudo Data Science. This is the real thing, and I spent two days immersed in real problems using Data Science as a solution. I learned a lot, and I contributed as well. There is a saying that you are the average of your friends, and I needed to get close to more Data Scientists so that I could build on my earlier experience in AI and bring it up-to-date.

I wanted to help a charity, by dedicating my time and skills, to support women and girls who need it. I understand that there are vulnerable men too; but this isn’t about whataboutism. Women and girls are disproportionately affected by issues such as domestic violence and being the victims of sexual crimes, and I wanted to do something practical to help.

For my specific contribution, I worked with a team of 25 other data scientists on finding insights in data belonging to Lancashire Women’s Centre. The vision of Lancashire Women’s Centre is that all women and girls in Lancashire are valued and treated as equals. Their aim is to empower women and girls to be able to transform their lives by bringing them together to find their voice, share experiences and understanding, develop their knowledge and skills, and challenge stereotypes and misconceptions about them so that they can have choices in becoming the individuals they want to be. I share this conviction deeply and I wanted to help.

You may well be thinking that the charity helps a small number of women, but that’s not the case at all. They have a real impact in their community. The Lancashire Women’s Centre has helped over 3000 women in the last year. This includes 5807 hours of therapeutic support, accessed by 1154 women and 78 men. Following therapy: 25% were no longer taking medication, 8% felt the support had helped them find and keep a job, and 12% continued to access LWC services to support their recovery.

So what did I do? I can’t share specific details because the data is confidential, and it obviously impacts some of the UK’s most vulnerable women and girls. I will say that the tools used to work with the data were CoCalc, R, Python, Excel, Tableau and Power BI.

DataKind™ brings high-impact organizations dedicated to solving the world’s biggest challenges together with leading data scientists to improve the quality of, access to and understanding of data in the social sector. This leads to better decision-making and greater social impact. Launched in 2011, DataKind leads a community of passionate data scientists, visionary partners and mission-driven organizations with the talent, commitment and energy to use data science in the service of humanity. DataKind is headquartered in New York City and has Chapters in Bangalore, Dublin, San Francisco, Singapore, the UK and Washington DC. More information on DataKind, its programs and its partners can be found on its website: www.datakind.org

Lancashire Women’s Centre

DataKind JenStirrup and Team

I’m the one on the right, wearing orange!

I’m looking forward to the next one!

Tableau Prep, Power Query and Power BI – Good together?

A question I often hear is this: Which tool should I use, Tableau or Power BI? The truth is: They are not mutually exclusive.

Tableau is great at business mysteries: ill-defined questions where you have to surf the data for results. Power BI is particularly great at modelling and cleaning the data, with clean, crisp data visualisation and the ability to use custom and open-source data visualizations. This blog isn’t aimed at the technical user, but at the analyst who needs to get information out quickly. I will do another post, aimed at the geeks, another time.

Tableau and Power BI are paintbrushes for your data. The tools do not have to be mutually exclusive. Power BI contains some superb data preparation functions which are aimed at business users. Speaking to customers, however, it’s clear that they aren’t aware of its functionalities. So, I decided to make your life easier for you by helping you to compare the two, using the same data with the same result.

In the first video, we will look at Tableau Prep in some detail. We will use one of the Tableau datasets, Superstore, and we will work through one of Tableau’s own tutorials.
In the next segment, I repeat the exercise using Power BI and Power Query so that you can compare more easily. Ultimately, both tools achieve the same ends.

Where Tableau Prep falls down, in my opinion, is that it does not handle complex pivots very well. In the World Data Bank data, the Life Expectancy data which was made famous by Hans Rosling is available, and this data needs to be pivoted in order to be visualized effectively. Tableau needs a lot of branching pivots to get it to work. Power BI, on the other hand, pivots it within a few clicks. What do we take away from this?

  • Tableau Prep is great for simple data preparation tasks
  • Power Query, also known as Get and Transform in Excel, is great at simple and much more complex and difficult data preparation tasks.

So, which tool you use will depend on the data prep that you need to do. If it is easy, use Tableau Prep. If it is anything above easy, or includes easy, use Power Query.
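To make the pivoting step concrete for readers who have not met it, here is a minimal Python sketch of the reshape involved. It assumes the life expectancy data arrives with one column per year, which is how World Bank style downloads are commonly laid out; the country names and values are made up, and the `unpivot` helper is my own illustration rather than anything from Tableau Prep or Power Query. In Power Query this kind of reshape is done with Unpivot Columns.

```python
# Hypothetical wide-format rows, one column per year.
# The values are made up for illustration.
wide_rows = [
    {"Country": "Country A", "2000": 70.1, "2001": 70.4, "2002": 70.8},
    {"Country": "Country B", "2000": 64.3, "2001": 64.9, "2002": 65.2},
]

def unpivot(rows, id_column, var_name, value_name):
    """Turn one row per entity into one row per entity and column."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            long_rows.append(
                {id_column: row[id_column], var_name: column, value_name: value}
            )
    return long_rows

long_rows = unpivot(wide_rows, "Country", "Year", "LifeExpectancy")
for row in long_rows:
    print(row)
```

The long format, with one row per country and year, is the shape that most visualisation tools want before they can plot a trend over time.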

You can use Power BI to shape the data, and then use the data in Tableau. See? You don’t have to choose. Select the best paintbrush for what you need to do, and you are not restricted to one paintbrush.

To see the videos, go here:

Tableau Prep: An Overview

 

Power Query and Power BI Together for Tableau Users:

 

 

How do you choose the right data visualisation in Power BI to show your data?

How do you choose the right visualisation to show your data? Usually the customer wants one thing, the business user wants something else, the business sponsor wants something flashy... and it’s hard to tease out the requirements, and that’s before you’ve even opened up a Power BI tool such as Power View, Excel, Tableau or whatever your preferred data visualisation software is.

In other words, there are simply too many charts to choose from, and too many requirements to meet. Where do you start?

I found this fantastic diagram which can help you to choose the right visualisation. I’m often surprised to see that people haven’t seen this before. Note: this diagram was done by Andrew Abela of Extreme Presentation and the source is here and his email address is on the slide, so be sure to thank him if you’ve found it useful. If you can’t see it very well, click here to go to the source.

[Image: Andrew Abela’s chart chooser diagram]

Chart Choosers should not replace common sense, however, and Naomi Robbins has written a nice piece here which is aimed at the wary. However, diagrams like Abela’s can really help a novice to get started, and for that, I’d like to thank him for his work.

How does it relate to Microsoft’s Power BI? If you look at the visualisations that are available in Power View, you can see that most of the visualisations in the diagram are available in Power BI. The ones that are excluded are the 3D graphs, circular area charts, variable width charts, and the waterfall chart.

Why no 3D? I personally hope that Microsoft will leave 3D out of Power BI tools, unless of course it is in Power Map.  With 3D on a chart, it is harder to identify the endpoints, and it can take us longer. It might also mean that points are occluded. If you’re interested and want to see examples, here is one by the Consultant Journal team or you can go ahead and read Stephen Few’s work. If you haven’t read anything by Stephen Few, get yourself over to his site right now. You won’t regret it. Why is it different from Power Map? 3D maps provide context, and they are the exception where I will use 3D for a data visualisation showing business data. I’m obviously excluding other types of non-business data here, such as medical imaging and so on.

Why no circular area or variable width charts? I am not a fan of variable width or circular area charts because we aren’t very good at evaluating area when we look at charts and graphs, and Robert Kosara has an old-but-good post on this topic here.

This blog is mainly for me to remember stuff but I hope it helps someone out there too.

Best Wishes,
Jen

Data Visualisation: lifting the curse of Cassandra

Information is the new currency, the lifeblood of organisations. However, it has to be explored, and evangelised throughout the organisation before it can have any real impact. SQL Server 2012 now helps business users to access the data; a real paradigm shift in the ‘umbrella’ of users who touch SQL Server.
However, does that mean that the users will be believed? The ‘messenger’ of the information can have a great influence on how the information is – or is not – propagated throughout the organisation. Sometimes, people are simply not believed, or their ideas entertained. This may be due to the way that they put the message across, or simply due to the fact that they can’t get the message to the right people without upsetting the apple cart. 
This is about the person (or group) in the organisation, who might meet one of these criteria:
  • see a ‘train crash’ going to happen in the organisation, but can’t shout loudly enough to avert it happening.
  • see room for improvement in the business, but find it hard to get their message across
  • have a ‘gut feel’ about what customers are telling the enterprise, but find it hard to prove, demonstrate or research this ‘gut feel’

This leads to the Cassandra Complex.  Quick history lesson: according to Greek mythology, Cassandra was the beautiful daughter of King Priam and Queen Hecuba of Troy. She refused the advances of Apollo, who set a curse on her: that she would always tell the truth, but never be believed.
There are parallels with this mythological figure in the workplace, which may engender your sympathy or empathy. You might see this in yourself or in someone else. Do you see the ‘train crash’ in the organisation before it happens, but can’t get the message across? Do you see patterns in the data, and find it hard to evangelise your findings throughout the organisation?
If so, you could be the ‘Cassandra’ in your organisation, or in a customer or associated company, for example. It is tremendously frustrating to see issues in the organisation, but not get the message across. So, if you see someone banging their head against a wall, trying to show problems before they take hold: they do this because they care, but perhaps that isn’t the best way to get the message across.
A better way to get the message across is to research, demonstrate and uncover the findings in the data using data visualisation technology such as Power View. It is the new part of SQL Server 2012 which allows users to touch their data; it isn’t just about techies any more. It also helps business users to test out their theories by allowing them to explore their findings properly, before publicising them. By showing the ‘truth’ of the data, hopefully this would cure the curse of Cassandra: to be heard and also to be believed. Visualising data can bring insights, and attention, into data that can show where the problems reside in the organisation.
Data Visualisation can help to make Cassandra speak – and be believed. 
Sometimes people need to ‘see’ the problem before they understand it. Data Visualisation makes the insights accessible. It’s harder to ignore Cassandra if the message is shown inescapably to all, particularly when it’s right in our favourite Office tools such as PowerPoint or Excel.
Making data insights accessible means that data visualisations are used to make analysing data simple, assuming the data is properly organized, cleansed and sanitised. The beauty of these solutions is that it’s fast to get results, and easy to show them off. If you’re interested in looking at data visualisation technology, then the Gartner Report is a good place to start. I tend to think that the technology should support the business requirements, within the budget set by the organisation for purchasing software. That’s why it’s difficult to recommend one over the other, since the answer is usually ‘it depends’…
Whether or not the organisation works on the insights, of course, is never guaranteed. As always, life isn’t that simple, but it’s a new angle that might help push problems forward and turn them into solutions. 

Tabular models and Tableau

I was recently asked how to connect Tableau to a Microsoft Tabular model. The concept itself is straightforward in Tableau. In my opinion, tabular models will become more prevalent, so I will start to look at them in more detail.

It turns out that the individual who questioned me was struggling, unfortunately. The resolution was that he wasn’t including the instance name as part of the connection.

In order to help business users to connect Tableau to the tabular model, I have included a brief video on how to achieve connectivity between Tableau and the Tabular model.

The video does include my Scottish accent, so please feel free to turn down the volume!

The link to the video is here:

I’m blogging this on my iPad so please excuse that I haven’t inserted the video itself! I will do this when I am back online properly. I am on holiday just now, but thought it worthwhile just to get the information out to help the individual ASAP.

Update 8th April: video inserted 🙂 enjoy

-Jen

SQLUniversity: PowerPivot, Tableau and Jedi Knights

This blog will show an overview of how I mobilised PowerPivot using Tableau. I’ve previously given this session at SQLBits, NEBytes Microsoft Technology User Group, and SQLServerDays in Belgium but thought it would also be useful to supply the files for you. The steps are very simple since I intended to show the end-to-end solution simply as a proof of concept, as follows:

  • creation of a PowerPivot which mashed up UK Census data with geographical data
  • creation of the report in Tableau
  • deployment to Tableau Public for consumption by mobile devices such as the iPad
The example was deliberately kept simple in order to prove the concept of PowerPivot being mobilised. 
The data sample involved mashing up two sources:
  • Jedi Knight census data, which can be downloaded from here. This is a basic file, but the final PowerPivot can be downloaded from a link later on in this article
  • Geonames offer an excellent free download service, which you can access here
The Jedi Knight data, along with the geographical data, were joined using the outcode of the postcode data. If you need more definitions of the UK postcode system, I’ve previously blogged about this here.  
Essentially, a very simple RELATED formula was used in order to look up the latitude and longitude from the UKGeography table, and put it into the Jedi Knights data, and produce the necessary data in a simple Excel table. The formula looks like this:
=RELATED(UKGeography[Latitude])
=RELATED(UKGeography[Longitude])
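For readers who don't use PowerPivot, the lookup that RELATED performs here can be sketched in plain Python. Everything in the example is an assumption made for illustration: the rows, the column names other than Latitude and Longitude, the approximate coordinates, and the `outcode` helper, which simply takes the part of a UK postcode before the space. It is meant to show the idea of the join, not to reproduce the workbook.

```python
# Hypothetical rows standing in for the two PowerPivot tables.
# Coordinates are approximate and the counts are made up.
uk_geography = [
    {"Outcode": "AB10", "Latitude": 57.13, "Longitude": -2.11},
    {"Outcode": "EH1", "Latitude": 55.95, "Longitude": -3.19},
]
jedi_knights = [
    {"Postcode": "EH1 1AA", "Count": 120},
    {"Postcode": "AB10 1XG", "Count": 45},
]

def outcode(postcode):
    """Return the outward part of a UK postcode, e.g. 'EH1' from 'EH1 1AA'."""
    return postcode.strip().upper().split()[0]

# Index the lookup table by outcode, like the "one" side of a relationship.
geo_by_outcode = {row["Outcode"]: row for row in uk_geography}

# Add latitude and longitude to each Jedi Knights row, as RELATED does.
enriched = []
for row in jedi_knights:
    geo = geo_by_outcode.get(outcode(row["Postcode"]))
    enriched.append({
        **row,
        "Latitude": geo["Latitude"] if geo else None,
        "Longitude": geo["Longitude"] if geo else None,
    })

for row in enriched:
    print(row)
```

Rows whose outcode has no match get None for both coordinates, which makes gaps in the geography table easy to spot before the data goes anywhere near a map.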
Once these very simple formulas were put in place, it was time to load the data into Tableau.
Tableau can take both PowerPivot and Excel data – which driver to use?  I used version 6 of Tableau. Whilst this version of Tableau does see the PowerPivot correctly as an Analysis Services cube, it does not always read the date as a ‘date’ type, but instead as an attribute. There is a forum posting on the Tableau website which tells you how to fix this issue, which involves changing the date so it appears as a measure, which means it can then be used for trends and so on. 
However, I wasn’t comfortable with this solution because I like dates to be in date format. I’ve also run into this issue at a customer site, where the customer wanted to use SSAS as a source and Tableau as the presentation layer. They were data-warehouse savvy and didn’t like the ‘measures approach’ fix. 
On the customer site, I got around it instead by using the Excel data source, and importing all of the PowerPivot columns into an Excel 2010 sheet. By doing it in this way, date formats were preserved. In this example, I didn’t have a date format so it didn’t matter – but this is a useful tip for the future if you are using PowerPivot with Tableau. The final data, in an Excel PowerPivot, can be obtained in zip format here or if you can’t access it, please email me at jenstirrup [at] jenstirrup [dot] com.
Once the data was accessible by Tableau, I used the Tableau Desktop version to upload the data into Tableau’s memory. I did this so that I could eventually upload the Tableau workbook to Tableau Public. The instructions to save to Tableau Public are given here
Once the data was in Tableau Public, I just needed to access the data using the Safari browser on the iPad. In case you are interested, the demos are publicly accessible and you can access the final result by clicking on the hyperlinks below.
I hope that’s been a useful overview of PowerPivot, and the ease with which it was mobilised. This blog forms a use case of how it might be useful to use PowerPivot, since I think that people sometimes need examples of how PowerPivot can benefit them. In this case, the clear benefit of PowerPivot is to provide an easy way of mashing up different data sources.
I look forward to your comments and thank you for sticking with me for the PowerPivot SQLUniversity discussions!