Future Decoded – I’m speaking on Machine Learning

I’m speaking at Future Decoded, at the same event as Professor Brian Cox, Sir Nigel Shadbolt (Co-founder & Chairman, ODI Open Data Institute), Or Arbel (CEO, Yo), Michael Taylor (IT Director, Lotus F1 Team), Kenji Takeda (Microsoft Research), and my good friends Chris Webb and James Rowland-Jones.

To register, here’s the link: http://www.microsoft.com/en-gb/about/future-decoded-techday

Jen’s Diary: Figuring it out this week, the Rubik’s Cube of data

“Knowing a great deal is not the same as being smart; intelligence is not information alone but also judgment, the manner in which information is collected and used.” Carl Sagan

The endless cycle of idea and action,

Endless invention, endless experiment,

Brings knowledge of motion, but not of stillness;

Knowledge of speech, but not of silence;

Knowledge of words, and ignorance of the Word.

Where is the Life we have lost in living?

Where is the wisdom we have lost in knowledge?

Where is the knowledge we have lost in information?

The cycles of Heaven in twenty centuries

Bring us farther from GOD and nearer to the Dust.

The Rock, TS Eliot

As you will know from the official PASS post from PASS President Thomas LaRock, PASS are looking to build a bigger umbrella for the data professional. This week, as well as a bunch of VC stuff, I’ve been looking at the PASS BA conference.

As always, I do not speak for PASS and these are purely my personal thoughts. If you want to comment, provide feedback or criticise, even, then this should be directed towards me. I’m happy to answer any questions as soon as I am able. My email is jen.stirrup@copper-blue.com so fire away.

Why did I volunteer to help with PASS BA? Why do I care and why do I spend my time on it? For those of you who know the Insights program, I come up as a blue/green person. Basically this means that I’m detailed and I care, and I reflect before I act. The downside is that it looks like I’m not doing anything because I’m chewing over the facts. To others, this is probably quite frustrating because there’s no visible output. However, once I’ve come to what I believe is the correct conclusion, I act because the facts and the data give me confidence to act. This blog is to help folks to see what I’m chewing on, and to help remove the assumption that I don’t care and I’m not doing anything. I’m looking at the data, so I know I’m doing the right things; just not quickly, because it’s worth taking the time.

I think it’s important to have a data-based, fact-based look at the business analytics sphere generally. What does the industry say about where the industry is going? What does the data say? We can then look at how PASS fits in with this direction (in my personal opinion, note).

Data is part of the endless cycle of invention, but to understand its part, we have to look at its many faces. The trends in the industry are changing. For example, IDC recently estimated that our digital universe will double every two years. IDC also estimates that by 2020, as much as 33% of the digital universe will contain information that might be valuable if analysed, compared with 25% today. That’s a lot of data, and where is it coming from? People. Data touches us every day; in fact, research by Fusion IO shows we touch 9 databases each day before breakfast.
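
To get a feel for what “doubling every two years” means, here is a rough back-of-envelope sketch. It only works through the arithmetic of compound growth; the function names and the six-year horizon are illustrative assumptions, not figures from IDC.

```python
# Back-of-envelope sketch: what "doubling every two years" implies per year.
# The function names and the six-year horizon are illustrative assumptions.

def annual_rate_from_doubling(years_to_double: float) -> float:
    """Return the compound annual growth rate implied by a doubling time."""
    return 2 ** (1 / years_to_double) - 1


def growth_factor(annual_rate: float, years: float) -> float:
    """Return how many times larger a quantity is after `years` of compounding."""
    return (1 + annual_rate) ** years


if __name__ == "__main__":
    rate = annual_rate_from_doubling(2)
    print(f"Implied annual growth: {rate:.1%}")
    print(f"Size after six years: {growth_factor(rate, 6):.1f}x")
```

Under these assumptions, doubling every two years works out at roughly 41% growth a year, or about eight times as much data after six years.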

The face of data is also changing. FICO’s research shows that unstructured data represents 80% of all data today. And the amount of unstructured data is expected to continue growing by 80% annually, from social media, email, customer service calls, even imagery. Data promises to solve many problems, but it’s not always clear how the information buried in unstructured and structured data can directly improve predictions and decisions. Also, people need data that they can understand, and perhaps even turn into data visualisations, then into decisions, and then into more questions. This is a never-ending cycle.

Businesses will start to look to use their data by analysing both structured and unstructured data. This doesn’t mean that structured data is going away. It is still valuable. It is a vital piece of the data jigsaw, a side of the Rubik’s cube of data. However, where are the people who are going to supply these skills? Dell predicts that the business analyst role will increase by 22% by the year 2020. There’s a skills shortage, right there, and it is going to get worse. SAS have already said that the skills shortage is the biggest problem in analytics right now. Excel is a big part of that story, since ‘export to Excel’ is the third most popular button in BI tools. I love this post of Rob Collie’s on this topic, and I believe it’s true, based on my own BI consulting experience.

Are people interested in analytics enough to build a career on it? Well, some of these people are already coming to PASS for this information; they’ve seen PASS help their DBA and BI folks, and they’re joining in the fun too.

Closer to PASS, I’m running SQLSaturday London BA Edition with fantastic help from Bob Phillips, and I can say that the event wouldn’t have its current shape if it wasn’t for Bob’s input. I’d like to thank him here for his support, and for his courage in telling me what I need to know rather than what I’d like to hear. I need that! Through this, I’ve seen that the PASS London BA event was oversubscribed a few weeks ago and the waiting list is still growing.

The new Excel BI VC, led by the (frankly amazing!) Jen Underwood, is growing quickly and has attendances which compete with more established VCs, and Jen’s only been running it for a few months. She’s done an amazing job in meeting a clear need for Excel information, and people are voting with their feet by taking the time to participate in the Excel sessions. The next VC session is with Mr Excel, Bill Jelen, and I recommend that you register for it sooner rather than later. The BA VC is steadily growing in numbers, and Dan English and Paras Doshi won the Outstanding Volunteer Award for September for their efforts in the BA VC and their ground-breaking work in getting a very professional YouTube site up. The BI VC, under the leadership of Julie Koesmarno (also frankly amazing!), has seen an increase of thousands of members this year alone. The other VCs are following suit and we are reaching new PASS community and audience members all the time.

The data is there, and I look forward to sharing these incredible achievements with the VC Leaders at PASS Summit first (after all, they are the ones who make the magic happen), and then we will have more details on the growth overall. All I will say for now is that the VC leaders and co-leaders are a fantastic success story in educating people globally and we should all heart the VCs for their passion in making that happen.

If anyone can find any evidence to suggest that we aren’t experiencing a growth in data, or in our need to analyse it, I would be glad to see it, so please feel free to post it as a comment below. I haven’t been able to find any. If, based on the evidence, we assume that there is a growing need to analyse data, and a growing need for a skill set to match, where does PASS fit into all this? What ties it all together?

Data.

In my own opinion, I see ‘data’ as the ‘connective tissue’ that binds us together. With respect to the Carl Sagan quote above, I see a parallel with data. It’s not enough to have a lot of data, a ton of data, protected. Collecting it isn’t enough. It needs to be used and loved, in my opinion. I don’t like unloved data. To get intelligence, we need judgement, and we get the judgement from the facts and data. However, we can arrive at better, data-based judgements once we’ve had the opportunity to analyse and process the data.

Normally when I organise an event like PASS SQLSaturday Edinburgh or London, or help out at SQLRelay or SQLSanta, I have a headline question in my head: who is the audience? For an event like SQLRelay or SQLSanta, this is fairly straightforward. You might say, well, 40% DBA, 40% BI, 20% BA, for example. However, defining a ‘business analyst’ is hard, and I can see this in the data that I have for SQLSaturday London. It isn’t easy to paint an audience like this in broad brushstrokes, because only some of them will call themselves ‘business analyst’ or ‘data analyst’. The reason for this is that they tend to define themselves by knowledge or business domain rather than by technology. This is more intangible and difficult to measure and define. So you tend to see job titles like ‘economist’ or ‘accountant’ or ‘finance analyst’ or something like that. This isn’t the neat 40/40/20 split we saw before, but it is tied together by the data. Data binds us.

Why might someone need someone like this on the team? What do they contribute? There are many reasons, business reasons, why you might want to do that. You need someone to keep asking and answering those data questions to keep up with your competitors, and to understand what your business should do and what it is not doing. Businesses need to keep an eye on efficiency. The business analyst uses data analysis to come up with ideas for boosting profits or reducing expenses or finding insights or suggesting actions, or all of these, and that’s just a start. Your business analyst may not have hard technical skills, but he or she should be able to understand the technology well enough to communicate and negotiate with company leaders and business influencers. They may not call themselves a business analyst. Basically, it’s the person in your organisation who uses data and works with data to make a decision. If you’re a DBA, they probably bug you for data.

It’s my hope that PASS supports the data professional, whether they are engine-focused or Power View users or decision makers who get their SQL data in Excel, and need to make data go faster, secure data, guard, sanitise, integrate, disaster-proof, recover, manage, share data, visualise data, munge it, model it, make decisions and insights on the data. We need to do stuff with data. As David McCandless said, data is the new soil. I think he probably meant it as something to be ‘mined’, but I see data as something to sift, with lumps, pebbles and sharp bits to remove and worms to avoid. This data may not even be useful. But we won’t know until we till it.

I see the community as a ‘Rubik’s cube’ of skill sets; different faces, distinct but connected by the same thing. It’s my hope that PASS has the flexibility to be a true champion for the spectrum of data community folks, and can give people in data a ‘voice’. By encouraging engagement and dialogue from different perspectives, PASS can help people on either side of the data ‘coin’, or perhaps on different sides of the data ‘Rubik’s Cube’, to merge together into an interdependent whole. We hear a lot about the IT and business disconnect, and sometimes communication and a common ‘canvas’ for understanding can make people’s lives so much easier. That, of course, applies to many things.

So where does the engine-focused DBA fit into all this? Well, the analysts, the data ‘janitors’, couldn’t do their jobs without the DBAs, the ‘data guardians’ who protect the data, regardless of whether it is in SQL Server or Hadoop, for example.

The world is moving on to include data that is not rectangular in shape, and includes data from sensors, devices and so on. With it, we’re going to need the skill set to do something with the non-rectangular data. This doesn’t mean that rectangular data isn’t important; organisations wouldn’t function without it. However, the world of data is moving, and it is moving fast. In my opinion, PASS is uniquely placed to help build the skills for the new world of data, Data 2.0. It doesn’t mean that there won’t be a future for rectangular data as well; it hasn’t stopped being important.

For me personally, I want to give a voice and a home to the folks who have to do stuff with the data. Data is hard. It’s really hard to know where to begin with all this rectangular data, never mind the non-rectangular data. I speak to DBAs regularly who say that they are the DBA in their office, but they sometimes get asked to do something with the data because they’re the SQL Server expert in the office. Sometimes bosses don’t understand that SQL Server is made up of lots of different bits with different functions. For me, SQL Server 2012 was actually a shift whereby business users were included in the world of SQL Server for the first time. I see it as a spectrum which fits across the business, and there’s a place for everyone. By encompassing this spectrum, PASS can accommodate the people who are DBAs-with-Accidental-BI skills, open up the doors for people to reskill into the cloud if their company requires it, and let them venture into dealing with non-rectangular as well as rectangular data if they wish.

My life experiences have taught me that there is nothing worse in the world than having no options. It’s one reason that I advocate diversity in technology: it gives people the opportunity to have a ‘home’. In the same way, my personal hope is that PASS can help people to see options in their careers by offering education about different things, as well as the option to become more specialised and expert at what they already do. We have to start somewhere, and the data helps us to make a start. In the words of Mary Poppins, well begun is half done, and looking at the data is a good way to begin.

Again, I do not represent PASS and these are simply my personal thoughts. Please feel free to email me at jen.stirrup@copper-blue.com and I will be pleased to answer. Alternatively, please feel free to leave a comment.