I had an interesting conversation with one of my customers. Through my company, Data Relish, I have been leading the Data Science program for some time now, using the Team Data Science Process as the backbone of my leadership. I feel I’m fighting the good fight for data, and I like to involve others in the process. It’s great to watch people grow, and to see them get real insights and make digital transformation improvements based on those insights.
Data science projects are hard, though, and it’s all about expectations. In this case, my customer was curious to know why the current data science project was taking longer than he expected, and whether they should just exclude the business understanding part of the data science journey. Couldn’t the analytics just clean themselves, or just cut out every piece of data that was a problem?
Being data-driven is all very well, but we need to be open to the insights from business expertise, too.
As the conversation continued, it became clear that a different data organization had been involved at some point. Apparently, this other organization had told my customer that they needed Data Analytics rather than Data Science, and that the two were mutually exclusive. Data Analytics would give them the insights without involving much, if any, business knowledge, effort, or time. What my customer understood from them was that they didn’t need to match data, clean it and so on; data analytics simply meant analysing columns and rows of data in order to see what relationships and patterns could be found. In essence, the customer should divorce business knowledge from the data, and the data should be analysed in isolation. The business and the data were regarded as mutually exclusive, and the business side should be silenced in order to let the data speak.
Due to these conversations, the customer was concerned about the length of time the project was taking, and wanted to go down the ‘data analytics’ route: mix up columns, skip data cleaning and skip matching sources. He was absolutely certain that insights would fall out of the data.
To summarise, there were a few things behind the conversation:
- business people are concerned about the time taken to do a data science project. They are essentially misled by their experience of Excel; they believe it should be as straightforward and quick as generating a chart in Excel.
- business people can be easily misdirected by the findings of a data science process if they are not critical about the results themselves. It seems to be enough that a data science project was done, not that it was done right. The fact it is a data science project at all is somehow ‘good enough’.
- business people can be easily swayed by the terminology. One person said that they were going into decision science, but couldn’t properly articulate what it was in comparison to data science. That’s another blog for another day, but it’s clear that the terminology is being bandied around, and people do not always understand, define or delineate what the terms actually mean.
- business people can equate doing statistics with certainty; they may say that they don’t expect findings to be 100% certain, but, in practice, that can go out of the window once the project is underway.
The thing is, this isn’t the first time I’ve had this conversation. I think that being data-driven is somewhat misleading, although I do admit to using the term myself; it is very hashtaggable, after all. I think a better phrase is insights-driven. If we remove the business interpretation and just throw in data, we can’t be sure whether the findings are reasonable. As I responded during this conversation, if we put garbage in, we get garbage out. This is a stock phrase in business intelligence and data warehousing, and it also applies to data science. There needs to be a balance; we can use what the business already knows and still be open to new ideas. Our business subject matter expertise can help to shortcut the project by working with the data, not against it. It helps us to avoid going down rabbit holes just because the data said so. The insights from the business can help to make the stories in the data clearer, whilst we remain open to new insights from the data.
In other words, data and the business should not be mutually exclusive.
How did it end? We proceeded as normal, and as per the Data Science plan I’d put in place. Fortunately, there were strong voices from the business, who wanted to be included at all stages. I think that we are getting farther, faster, as a unified team, all moving in the same direction. We need to question. Data Science is like April Fools’ Day, every day; don’t believe everything you read. Otherwise, we will never see the wood for the trees.