Guess who is appearing in Joseph Sirosh’s PASS Keynote?

This girl! I am super excited and please allow me to have one little SQUUEEEEEEE! before I tell you what’s happening. Now, this is a lifetime achievement for me, and I cannot begin to tell you how absolutely and deeply honoured I am. I am still in shock!

I am working really hard on my demo and….. I am not going to tell you what it is. You’ll have to watch it. Ok, enough about me and all I’ll say is two things: it’s something that’s never been done at PASS Summit before and secondly, watch the keynote because there may be some discussion about….. I can’t tell you what… only that, it’s a must-watch, must-see, must do keynote event.

We are in a new world of Data and Joseph Sirosh and the team are leading the way. Watching the keynote will mean that you get the news as it happens, and it will help you to keep up with the changes. I do have some news about Dr David DeWitt’s Day Two keynote… so keep watching this space. Today I’d like to talk about the Day One keynote with the brilliant Joseph Sirosh, CVP of Microsoft’s Data Group.

Now, if you haven’t seen Joseph Sirosh present before, then you should. I’ve put some of his earlier sessions here and I recommend that you watch them.

Ignite Conference Session

MLDS Atlanta 2016 Keynote

I hear you asking… what am I doing in it? I’m keeping it a surprise! Well, if you read my earlier blog, you’ll know I transitioned from Artificial Intelligence into Business Intelligence and now I do a hybrid of AI and BI. As a Business Intelligence professional, my customers will ask me for advice when they can’t get the data that they want. Over the past few years, the ‘answer’ to their question has gone far, far beyond the usual on-premise SQL Server, Analysis Services, SSRS combo.

We are now in a new world of data. Join in the fun!

Customers sense that there is a new world of data. The ‘answer’ to the question ‘Can you please help me with my data?’ is complex and varied, and it’s very much aimed at cost sensitivities, too. Often, customers struggle with data because they now have a Big Data problem, or a storage problem, or a data visualisation access problem. Azure is very neat because it can cope with all of these issues. Now, my projects are Business Intelligence and Business Analytics projects… but they are also ‘move data to the cloud’ projects in disguise, and that’s in response to the customer need. So if you are a Business Intelligence professional, get enthusiastic about the cloud because it really empowers you with a new generation of exciting things you can do to please your users and data consumers.

As a BI or an analytics professional, cloud makes data more interesting and exciting. It means you can have a lot more data, in more shapes and sizes and access it in different ways. It also means that you can focus on what you are good at, and make your data estate even more interesting by augmenting it with cool features in Azure. For example, you could add in more exciting things such as Apache Tika library as a worker role in Azure to crack through PDFs and do interesting things with the data in there. If you bring it into SSIS, then you can tear it up and down again when you don’t need it.

I’d go as far as to say that, if you are in Business Intelligence at the moment, you will need to learn about cloud sooner or later. Eventually, you’re going to run into Big Data issues. Alternatively, your end consumers are going to want their data on a mobile device, and you will want easy solutions to deliver it to them. Customers are interested in analytics and the new world of data and you will need to hop on the Azure bus to be a part of it.

The truth is; Joseph Sirosh’s keynotes always contain amazing demos. (No pressure, Jen, no pressure….. ) Now, it’s important to note that these demos are not ‘smoke and mirrors’….

The future is here, now. You can have this technology too.

It doesn’t take much to get started, and it’s not too far removed from what you have in your organisation. AzureML and Power BI have literally hundreds of examples. I learned AzureML looking at the following book by Wee-Hyong Tok and others, so why not download a free book sample?

https://read.amazon.co.uk/kp/card?asin=B00MBL261W&preview=inline&linkCode=kpe&ref_=cm_sw_r_kb_dp_c54ayb2VHWST4

How do you proceed? Well, why not try a little homespun POC with some of your own data to learn about it, and then show your boss? I don’t know about you but I learn by breaking things, and I break things all the time when I’m learning. You could download some Power BI workbooks, use the sample data and then try to recreate them, for example. Or, why not look at the community R Gallery and try to play with the scripts? You broke something? No problem! Just download a fresh copy and try again. You’ll get further next time.

I hope to see you at the PASS keynote! To register, click here: http://www.sqlpass.org/summit/2016/Sessions/Keynotes.aspx 

Jen’s Diary: Why are PASS doing Business Analytics at all?

As always, I don’t speak for PASS. This is a braindump from the heart. I realise that we haven’t communicated about BA as much as some members might like. It’s a hard balance – I don’t want to spam people, and I don’t want to get it too light, either. If you want to sign up for PASS BA news, here’s the link. So I have to apologise here, and hold my hands up for that one. I’ll endeavour to ensure we have a better BA communications plan in place, and I’m meeting the team on Friday to discuss how we can make that happen.

In the meantime, I’d like to blog about BA today. How did we get here, and where are we going? Why are PASS interested in Business Analytics at all? To answer this question, let’s look at the history of Business Intelligence, what Business Analytics means, and how PASS can be part of the story. Let’s start with the history lesson. What are the stages of Business Intelligence?

First generation Business Intelligence – this was the world of corporate Business Intelligence. You’ll know this by the phrase ‘the single source of truth’. This was a very technical discipline, focused on the data warehouse. It was dominated by Kimball methodology, or Inmon methodology, depending on the business requirement. However, the business got lost in all this somewhere, and they reverted to the default position of using Excel as a tool to work with Excel exports, and subverting the IT departments by storing data in email. Microsoft did – and still do – cater for the first generation of business intelligence. It has diversified into new cloud products, of course, but SQL Server still rocks. You’ll have seen that Gartner identified SQL Server as the number one RDBMS for 2015. Kudos to the team! For an overview, the Computer Weekly article is interesting.

Second generation Business Intelligence – the industry pivoted to bring the Business back into Business Intelligence. You’ll know this by the phrase ‘self-service business intelligence’. Here, the business user was serviced with clean data sources that they could mash and merge together, and they were empowered to connect to these sources. In the Microsoft sphere, this involved a proliferation of tabular models and PowerPivot, as well as continued use of Analysis Services multidimensional models. As before, Excel remained the default position for working with data. PASS Summit 2015 has a lot of content in both of these areas.

So far, so good. PASS serves a community need by offering high quality, community education on all of these technologies. Sorted, right?

Wrong. The world of data keeps moving. Let’s look at the projected growth of Big Data by Forbes.

Well, the world of business intelligence isn’t over yet; we now have business analytics on the horizon and the world of data is changing fast. We need to keep up! But what do we do with all this data? This is the realm of Business Analytics, so why is it different from BI? The value of business analytics lies in its ability to deliver better outcomes. It’s a different perspective. Note that in our first generation and second generation BI times, technology was at the forefront of the discussion. In business analytics, we talk about organizational change, enabled by technology. In this sphere, we have to quantify and communicate value as the outcome, not the technology as a means to get there. So what comes next?

Third generation of business intelligence – self-service analytics. Data visualisation software has been at the forefront of second generation Business Intelligence, and it has taken a priority. Here, the position is taken that businesses will understand that they need data visualisation technologies as well as analytical tools, to use the data for different purposes.

How is Business Analytics an extension of Business Intelligence? Let’s look at some basic business questions, and see how they fall as BI or BA. Images belong to Gartner so all kudos and copyright to the team over there.

What happened?

If the promise of business intelligence is to be believed, then we have our clean data sources, and we can describe the current state of the business. Gartner call this descriptive analytics, and it answers the question: What happened? This level is our bread-and-butter business intelligence, with an emphasis on the time frame until this current point in time.

Why did it happen?

We can also understand, to a degree, why we are where we are. This is called diagnostic analytics, and it can help pinpoint issues in the organisation. Business Intelligence is a great domain for understanding the organisation until this point in time. However, it’s a rearview impression of the data. What happens next? Now, we start to get into the remit of Business Analytics:

What will happen?

Businesses want to know what will happen next. Gartner call this predictive analytics, and this perception occurs when we want to try and look for predictive patterns in the data. Once we understand what will happen next, what is the next question?

How can we make this happen?

This is the power of prescriptive analytics; it tells us what we should do, and it is the holy grail of analytics. It uses business intelligence data in order to understand the right path to take, and it builds on the other types of analytics.

Business Intelligence and Business Analytics are a continuum. Analytics is focused more on a forward motion of the data, and a focus on value. People talk about ROI, TCO, making good business decisions based on strong data. First generation and second generation are not going away. A cursory look around a lot of organisations will tell you that. The Third Generation, however, is where organisations start to struggle a bit. PASS can help folks navigate their way towards this new generation of data in the 21st century.

How do we measure value? It is not just about storing the data, protecting it and securing it. These DBA functions are extremely valuable and the business would not function without them – full stop.  So how do we take this data and use it as a way of moving the organisation? We can work with the existing data to improve it; understand and produce the right measures of return, profiling, or other benefits such as team work. Further, analytics is multi-disciplinary. It straddles the organisation, and it has side effects that you can’t see, immediately. This is ‘long term vision’ not ‘operational, reactive, here-and-now’. Analytics can effect change within the organisation, as the process of doing analytics itself means that the organization solves a business problem, which it then seeks to re-apply across different silos within the organization.

SQL Server, on the other hand, is a technology. It is an on-premise relational database technology, which is aimed at a very specific task. This is a different, technologically based perspective. The perspectives in data are changing, as this Gartner illustration taken from here shows:

Why do we need a separate event? We need to meet different people’s attitudes towards data. DBAs have a great attitude; protect, cherish, secure data. BAs also have a great attitude: use, mix, apply learnings from data. You could see BA as a ‘special interest group’ which offers people a different choice. There may not be enough of this material for them at PASS Summit, so they get their own event. If someone wants to go ahead and have a PASS SQLSaturday event which is ‘special interest’ and focuses solely on, say, performance or disaster recovery, for example, then I don’t personally have a problem with that.  I’d let them rock on with it. It might bring in new members, and it offers a more niche offering to people who may or may not attend PASS because they don’t feel that there’s enough specialised, in depth, hard-core down-to-the-metal disaster recovery material in there for them. Business Analytics is the same, by analogy. Hundreds and hundreds of people attended my 3 hour session on R last year; so there is an interest. I see the BA event as a ‘little sister’ to the PASS ‘big brother’ – related, but not quite the same.

Why Analytics in particular? It’s about PASS growth. Growth can be painful, and you take a risk. However, I want to be sure that PASS is still growing to meet future needs of the members, as well as attracting new members to the fold. The footfall we see at PASS BA, plus our industry-recognised expert speakers, tell us that we are growing in the right direction. Let’s take a look at our keynote speaker: Jer Thorp has done work with NASA and the MoMA in New York, he was Data Artist in Residence at the New York Times, and he has now set up The Office for Creative Research and is an adjunct professor at ITP. Last year, we had Mico Yuk, who is author of Dataviz for Dummies, as well as heading up her own consultancy team over at BI Brainz. They are industry experts in their own right, and I’m delighted to add them as part of our growing PASS family who love data.

The PASS BA event also addresses the issue of new and emerging data leaders. How do you help drive your organisation towards becoming a data-oriented organisation? This means that you talk a new language; we talk about new criteria for measuring value, working out return on investment, cross-department communication, and communication of ideas, conclusions to people throughout the organisation, even at the C-level executives. PASS BA is also looking at the career trajectories of these people as well as DBA-oriented folks, and PASS BA is out there putting the ‘Professional’ aspect into the event. We have a separate track, Communicate and Lead, which is all about data leadership and professional development. A whole track – the little sister is smartly bringing the Professional back, folks, and it’s part of our hallmark.

PASS is part of this story of data in the 21st Century. The ‘little sister’ still adds value to the bigger PASS membership, and is an area of growth for the family of PASS.

Any questions, I’m at jen.stirrup@sqlpass.org or please do come to the Board Q&A and ask questions there. If you can’t make it, tweet me at jenstirrup and I’ll see if I can catch them during the Q&A.

My handy IoT Toolkit: What businesses forget about IoT

I recently did a brief blog post for Izenda on IoT and business intelligence, and this part of my IoT series expands on some of the themes there.

The Internet of Things is a new phenomenon; that said, a simple search for ‘Internet of Things IoT’ brings back over 60 million search results in Bing. What is the Internet of Things? The Internet of Things Global Standard gives us the following definition: ‘The Internet of Things (IoT) is defined in Recommendation ITU-T Y.2060 (06/2012) as a global infrastructure for the information society, enabling advanced services by interconnecting (physical and virtual) things based on existing and evolving interoperable information and communication technologies.’

Now, this definition is fine but it focuses on the ‘shiny’ aspect of IoT and, most importantly, it does not mention the data aspect of IoT. It emphasises the connectedness of the various gadgets and their interoperability. I prefer Peter Hinssen’s discussion, where he recommends that we talk about the value of the network of things. The connected devices, on their own, will fulfil their purpose. However, if you want real insights from these sources, then you need to splice the data together with other  sources in order to get insights from it.

The thing is, the Internet of Things is really the Internet of You.

We are now heading towards the Zettabyte generation thanks to the Millennial generation. For example, the World Data Bank projects that 50% will have a smartphone by 2016, and 80% by 2020. We sent 7.5 trillion SMS messages in 2014. In fact, one app, WhatsApp, sent 7.2 trillion messages. And that’s just data from one app. In 2000, Kodak processed 80 billion photos from camera film. In 2014, 800 billion photos from smartphones were shared on social networks. And that’s just the photos that were shared.

We are the Internet of Things.

From the business perspective, how do you make use of that IoT data? The consumerization of IT means that business users are often asked to manage and cleanse data, regardless of its size and nature. Research suggests that data is growing at a rate of 40% each year into the next decade, driven by increased online activity and usage of smart devices (ABI, 2013; The New York Times, 2014). The consumerization of data means that business users should be able to access and analyze the data comfortably. When we introduce data that comes under the umbrella of the Internet of Things, business users will need to be able to access IoT data from devices as well as other data sources, to get insights from the data.

How can we harness the IoT phenomenon to understand and serve our customers better?

The addition of data from a variety of sources, including data from devices, means that IoT has a very wide scope and opportunity. IoT can focus on the devices themselves, or the network infrastructure connecting devices, or the analytics derived from the data which comes from the network and the devices. In order to get true insights, the IoT data would be deployed in tandem with other relevant data so that the business user obtains the context. The IoT would also introduce real time data, which would be mixed with historical data.

Customer expectations are rising; and customer-focused businesses will need to put analytics at the heart of their customer services. For example, customers do not distinguish between out of date data, and inaccurate data; for them, they are the same thing. The customer landscape is changing, and it includes the ‘millennials’ who expect technology to offer an unfailing, personal experience whilst being easy to use. This expectation extrapolates to data, and customers expect organizations to have their data correct, timely and personal.

For organizations who put customers front-and-center of their roadmap, management should encourage self-reliance in business users by ensuring that they have the right tools to provide customer-centered service. Unfortunately, business users can suffer from a split between business intelligence reporting and the operational systems, as a result of decoupled processes and technology at the point at which they are trying to gain insights. Often, users have to move from one reporting technology to another operational system, and then back again, in order to get the information that they need. This issue can be disruptive in terms of the workflow, and it is an obstacle to insights. In terms of IoT data, business users may have to go and get data from yet another system, and that can be even more confusing.

What does IoT mean for BI? Business Intelligence has matured from the earlier centralized technology emphasis, to a more decentralized business-focused perspective which democratizes the data for the business users. With the advent of IoT technologies, issues on collecting, refining and understanding the data are exacerbated due to the introduction of a variety of structured and unstructured data sources. In the meantime, there is an increased interest in businesses to find insights in raw data. However, this remains a challenge with the introduction of IoT data from devices, creating a mélange of data that can be difficult for business users to assimilate and understand. Companies risk obtaining lower ROI in IoT projects, by focusing only on the data, rather than the insights.

How did the industry get to this point, with disjointed technology and processes, and disconnected users? How can we move forward from here, to including IoT data whilst resolving the issues of previous business intelligence iterations? To understand this unquenchable thirst for data by business users and what it means for the future, let’s start by taking stock of the history of Business Intelligence. What are users’ expectations about data and technology in general? Until recently, these expectations have been largely shaped by the technology. Let’s start with the history lesson. What are the historical stages of Business Intelligence?

The First Generation of Business Intelligence – change in the truth

First generation Business Intelligence is the world of corporate Business Intelligence, embodied by the phrase ‘the single source of truth’. This is a very technical discipline, which focused on the extract-transform-load processing of data into a data warehouse, and focused less on business user intervention. The net result is that the business seemed to be removed from Business Intelligence. In response, the users pushed for decentralization of the data, so that they could drive their own decision making using the data flexibly, and then confirm it independently in the context in which they are operating. In terms of technology, business users reverted to the default position of using Excel as a tool to work with Excel exports, and subverting the IT departments by storing data in email.

The Second Generation of Business Intelligence – change in the users

Second Generation Business Intelligence was the change wrought by the business users, who demanded clean data sources on a strong technical apparatus that they could mash and merge together, and who were empowered to connect to these sources. In this stage, the industry pivoted to bring the Business back into Business Intelligence, and it is typified by the phrase ‘self-service business intelligence’. The intended end result is that the business has structured data sources that the users understand, and the technical teams have a robust technical structure for the architecture in place. As before, Excel remained the default position for working with data, but the business users had more power to mash data together. Self-service business intelligence was not faultless, however. Business users were still dissatisfied with the highly-centralized IT approach as they still relied on other people to give them the data that they need. This issue introduced a delay, which increased the ‘time to answer’ metric whilst simultaneously not recognizing that this feeds into the ‘time to question’ metric. It does not recognize that analytics is a journey, and users expect to ‘surf’ through data in the same way that they ‘surf’ through the Internet.

What problems does IoT introduce for businesses, and how can we resolve them?

Given that there are inefficiencies in the process of business intelligence in organizations at the moment, how is this exacerbated by the introduction of data from devices, otherwise known as the Internet of Things? IoT data introduces new issues for business users for a number of reasons. IoT devices will transmit large amounts of data at a velocity which cannot be simply reconciled with other time-based data. The velocity of the data will add in an additional complexity as business users need to understand ‘what happened when’, and how that marries with existing data sources which may even be legacy in nature. Business users will need that work to be presented to them simply. Further, IoT devices will transmit data in different formats, and this will need to be reconciled so that the actual meaningful data is married to the existing, familiar data. If the business users are moving around disparate technology in order to match the data together, then the disconnected technology presents an obstacle to understanding the data, and thereby obtaining the promised holy grail of insights.

IoT means that we can obtain a granular level of customer data which provides unique insights into customer behavior. However, it does not immediately follow that the data will bring insights on its own; interpretation and analysis will be vital in order to obtain insights. Businesses can interpret IoT as equivalent to data from devices, and it is easy to be distracted by shiny gadgetry. The ‘shiny’ approach to IoT can mean that business users are ignored in the journey, thereby shearing their insights from the solution as a whole.

Helping Business Users along the IoT Journey

As internal and external expectations on data increase, the pressure on business users will increase accordingly. Business users will need help to travel along the user journey, to adapt to these changes in the data landscape that include IoT data. One solution is to add a new application that will help the business users to work with the IoT data. However, adding a new application will exacerbate the existing issues that business users experience. This might be an easy option for IT, but it will add in a new complexity for the business user. The introduction of IoT data does not necessitate the introduction of new technology to analyze the data.

IoT data resides mainly in the cloud, which means that the organization’s infrastructure is changing rapidly. The data will need to be reconciled and understood, regardless of where it resides. Organizations can end up with a hybrid architecture of cloud and on-premise solutions, and in the midst of these complex and fast-moving architectures, business users are easily forgotten. The business users will need to have a seamless environment for resolving cloud and on-premise systems to enable them to produce the anticipated analysis and results. Business users will find it difficult to navigate the terrain between cloud and on-premise data, which will aggravate existing issues in putting together existing data sources.

Business users have a need for data to carry out a fundamental analytical activity: comparison. How does this data compare to that data? How did last year compare to this year? How did that customer compare with this customer? Answering these simple questions may mean that traditional analytical tools may not be able to cope with the new types of data that are generated by IoT technologies, because the data will be disconnected in terms of technology and process. Excel is excellent for rectangular data sources, but it is not designed for data sources where the data travels at such velocity, and in non-rectangular shapes. So, what’s next?

The Third Generation of Business Intelligence – change in the data

The Third Generation of Business Intelligence is where users work in the era of real change in data, and this change is wrought by changes in the data itself. The data has changed fundamentally; it now travels faster, has more shapes, and is bigger in size than ever before. Users are not finding it easy to compare data simply by farming data into Excel; they need to be empowered to tackle seemingly simple business questions, like comparison, in a way that fits with a fluid way of working whilst being empowered with data from the IoT sphere as well as more traditional sources. By tapping into new volumes and particularly new varieties of data, organizations can ask questions about their customers and their business in a way that they have never been able to do, previously. Further, when we add IoT into the mix, there is a promise of insights from customers and their environments, which can be incredibly valuable to companies. It is not all one way, however: in this era of tech-savvy consumers, customer relationships require planning, nurturing, and constant attention.

There should be an acceptance that business users will want access to IoT sources in the same way as any other source, but these can be exasperating and non-intuitive. Simplicity is vital in the race for the business analyst and user, and their goal is to reduce time and effort in getting the insights that they want, whilst increasing efficiency and value.

So, what gets forgotten in the IoT torrent of attention, and what can we do about it?

Simply put, the business users get lost. They are already getting lost frequently with BI projects, and this will only get worse with IoT projects. The ones who mash data together, clean it, make decisions on it, and put it next to other data sources in order to make sense of the data – these are the ones who should be using the data.

Given all of these issues, how do we bring the users back into an IoT architecture? I was faced with this issue recently, when designing an IoT architecture which had a focus on machine learning. IoT work involved a great deal of complexity, which is neatly hidden behind the buzzwords.

The changes in data now mean that there is a clear extension of where the industry has come from, and where it is headed. So what comes next? The third generation of business intelligence: ready to go analytics using data regardless of its shape and size.

Organisations will need to focus on the third generation of Business Intelligence if they are to be successful in facilitating users to have the access to data that they need. Users will want to try and analyse the data themselves. Fast questions need fast answers, and businesses need to move from their initial business question through to the resulting insight quickly and accurately, in a way that they are comfortable with. They also need results at the velocity of the business; answers when they need them. Remembering the users is a deceptively simple requirement that presents a number of challenges.

The dislocation between IT and the business is at its apex when we look at the opposing approaches to data. IT is still seen as a bottleneck rather than an enabler. Business users perceive IT departments as a lag in a process that needs to get from question to insight quickly, and they will look for ‘good enough’ data rather than ‘right data’ in order to get it. The way forward is to make the business users’ activities simpler whilst providing a solution in which the IT department is closely involved and which it finds easier to support, so that both parties feel that they own the solution.

The solution should put the focus back on the business users: the humans who actually deliver service, create insights, and ultimately add business value. To do this, they need to be able to search for meaning in the data, via aggregation, broadcasting and consuming information in order to add the value that is expected of them.

To summarise, these issues were at the forefront of my mind, when I was architecting an IoT solution recently. In my next post, I will explain my technical choices, based on these considerations. On my survey, it was clear that IoT needs to be taken to a further stage so that it is usable, actionable and sensible; not just data about sensors, but data that is relevant and provides insights.

If you want to talk more about the IoT issues here, or you’re interested in having me come along and speak at your event or workplace, please email me at Jen.stirrup@datarelish.com

Simple explanation of a t-test and its meaning using Excel 2013

For those of you who say that stats are ‘dry’ – you are clearly being shown the wrong numbers! Statistics are everywhere and it’s well worth understanding what you’re talking about, when you talk about data, numbers or languages such as R or Python for data and number crunching. Statistics knowledge is extremely useful, and it is accessible when you put your brain to it!

So, for example, what does a pint of Guinness teach us about statistics? On a visit to Ireland, Barack Obama declared that the Guinness tasted better in Ireland, and that the Irish keep the ‘good stuff’ to themselves.
Can we use science to identify whether there is a difference between the enjoyment of a pint of Guinness consumed in Ireland, and pints consumed outside of Ireland?
A Journal of Food Science investigation detailed research where four investigators travelled around different countries in order to taste-test the Guinness poured inside and outside of Ireland. To do this, researchers compared the average score of pints poured inside of Ireland versus pints poured outside of Ireland.

How did they do the comparison? They used a t-test, which was devised by William Sealy Gosset, who worked at the Guinness factory as a scientist, with the objective of using science to produce the perfect pint.
The t-test helps us to work out whether two sets of data are actually different.
It takes two sets of data, and calculates:

the count – the number of data points
the mean, also known as the average i.e. the sum total of the data, divided by the number of data points
The standard deviation – tells you roughly how far, on average, each number in your list of data varies from the average value of the list itself.
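
If you would rather follow along in Python than in Excel, a minimal sketch of those three numbers might look like this. The taste scores are made up purely for illustration (they are not the figures from the study), and `summarise` is just a hypothetical helper name. The sketch uses the sample standard deviation, which is also what Excel's STDEV calculates.

```python
import statistics

# Hypothetical taste scores out of 100, invented for illustration only.
ireland = [74, 81, 68, 77, 85, 79, 72, 80]
elsewhere = [58, 66, 61, 70, 55, 63, 59, 67]


def summarise(scores):
    """Return the count, mean and sample standard deviation of a list of scores."""
    return {
        "count": len(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),  # sample standard deviation (divides by n - 1)
    }


ireland_summary = summarise(ireland)
elsewhere_summary = summarise(elsewhere)

print("Ireland:  ", ireland_summary)
print("Elsewhere:", elsewhere_summary)
```

The two means will look different here, but the means on their own do not tell us whether that gap is bigger than the everyday scatter in the scores.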

The t-test is a more sophisticated test that tells us whether the means of those two groups of data are different.

In the video, I go ahead and try it, using Excel formulas:

COUNT – count up the number of data points. The formula is simply COUNT.
AVERAGE – this is calculated using Excel’s AVERAGE formula.
STDEV – again, another Excel calculation, to tell us the standard deviation.
TTEST – the Excel calculation, which wants to know:

Array 1 – your first group of data
Array 2 – your second group of data
Tail – do you know if the mean of the second group will definitely be higher OR lower than the mean of the first group, and it’s only likely to go in that direction? If so, use 1. If you are not sure if it’s higher or lower, then use 2.
Type –
if your data points are related to one another in each group, use 1
if the data points in each group are unrelated to each other, and the variances are equal, use 2
if the data points in each group are unrelated to each other, and you are unsure whether the variances are equal, use 3
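
To make the Tail and Type options a little more concrete, here is a rough Python sketch of the same calculation using only the standard library. Everything in it is an assumption for illustration: the scores are invented, and `t_pdf`, `upper_tail` and `t_test` are hypothetical helper names rather than anything from Excel. The probability is found by numerically integrating the t distribution, so the result should be close to what TTEST reports, but I have not checked it digit for digit, and it is only meant for small samples like these.

```python
import math
import statistics

# Hypothetical taste scores out of 100, invented for illustration only.
ireland = [74, 81, 68, 77, 85, 79, 72, 80]
elsewhere = [58, 66, 61, 70, 55, 63, 59, 67]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T > |t|): 0.5 minus the area from 0 to |t|, found with Simpson's rule."""
    t = abs(t)
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 0.5 - total * h / 3)


def t_test(group1, group2, tails=2, kind=3):
    """Return (t, degrees of freedom, p).

    kind: 1 = paired, 2 = unrelated with equal variances,
          3 = unrelated with unequal variances (Welch's version).
    tails: 1 or 2, as in the Tail option.
    """
    n1, n2 = len(group1), len(group2)
    if kind == 1:
        if n1 != n2:
            raise ValueError("a paired test needs groups of the same size")
        diffs = [a - b for a, b in zip(group1, group2)]
        t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n1))
        df = n1 - 1
    else:
        m1, m2 = statistics.mean(group1), statistics.mean(group2)
        v1, v2 = statistics.variance(group1), statistics.variance(group2)
        if kind == 2:
            pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
            se = math.sqrt(pooled * (1 / n1 + 1 / n2))
            df = n1 + n2 - 2
        else:
            se = math.sqrt(v1 / n1 + v2 / n2)
            df = (v1 / n1 + v2 / n2) ** 2 / (
                (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
            )
        t = (m1 - m2) / se
    return t, df, tails * upper_tail(t, df)


t, df, p = t_test(ireland, elsewhere, tails=2, kind=3)
print(f"t = {t:.3f}, df = {df:.1f}, p = {p:.5f}")
```

Choosing `tails=1` simply halves the two-tailed probability, which is why you should only use it when you were sure of the direction before you looked at the data.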

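And that’s your result. But how do you know what it means? It produces a number, called p, which is simply a probability: how likely it is that you would see a difference at least this big if the two groups were really the same.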

The t-test: a simple way of establishing whether there are significant differences between two groups of data. It uses the null hypothesis: this is the devil’s advocate position, which says that there is no difference between the two groups. It uses the sample size, the mean, and the standard deviation to produce a p value.
The lower the p value, the stronger the evidence that there is a difference between the two groups, i.e. that something happened to make a difference between them.

In social science, the cut-off for p is usually set to 5%, i.e. if there were really no difference, a result like this would turn up by chance only about 1 time in 20.
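
One way to get a feel for what ‘due to chance’ means is to let the computer play devil’s advocate. The sketch below is not the t-test itself; it is a simple shuffle (permutation) approach that I am swapping in for illustration. It mixes all the scores together, deals them back out into two random groups many times, and counts how often the random gap in means is at least as big as the real one. The scores and the `chance_of_difference` helper are hypothetical.

```python
import random
import statistics

# Hypothetical taste scores out of 100, invented for illustration only.
ireland = [74, 81, 68, 77, 85, 79, 72, 80]
elsewhere = [58, 66, 61, 70, 55, 63, 59, 67]


def chance_of_difference(group1, group2, shuffles=5000, seed=1):
    """Estimate how often chance alone gives a gap in means at least as big as the real one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(statistics.mean(group1) - statistics.mean(group2))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(pooled)  # pretend the group labels mean nothing
        gap = abs(statistics.mean(pooled[:n1]) - statistics.mean(pooled[n1:]))
        if gap >= observed:
            hits += 1
    return hits / shuffles


alpha = 0.05  # the usual 5% cut-off
p_estimate = chance_of_difference(ireland, elsewhere)
if p_estimate < alpha:
    print(f"p is about {p_estimate:.4f}: unlikely to be chance alone")
else:
    print(f"p is about {p_estimate:.4f}: no difference shown")
```

If the shuffled groups rarely produce a gap as big as the real one, the devil’s advocate position starts to look shaky, and that is the same logic the t-test follows.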
In the video, the first comparison does not have a difference ‘proven’, but the second comparison does.
So next time you have a good pint of Guinness, raise your glass to statistics!

 


 

Love,
Jen

Jen’s PASS Diary: SQLSaturday Edinburgh: My heartfelt thanks go to…

SQLSaturday Edinburgh went ahead last Saturday, June 13th, and everyone had a great day. It’s clear that people in the community believe in what I am doing. They voted with their feet to attend, to speak, and to sponsor. We had high quality speakers delivering world-class content – 8 MVPs, 2 Microsoft staff, and the remainder are international speakers – and we know that Content is King.

Basically, SQLSaturday Edinburgh Business Intelligence edition was the turning point for the Business Analytics and Business Intelligence ( SQL Server based ) community in the UK.

  • Our event only had five people who had spoken at SQLBits (Carmel Gunn, Bob Duffy, Gary Short, Chris Webb and Satya Jayanty).
  • Three of our speakers (Mark Wilcock, Chris Webb and Bob Phillips) all spoke at the PASS Business Analytics Conference last April in San Jose, and they all spoke at PASS SQLSaturday London Business Analytics in November 2014.
  • The other speakers have delivered sessions internationally in their field of expertise: Visio, SharePoint, CRM, and this was the first time they’d spoken at a SQL event.

We tried to be more BI and BA focused, and did it work? The feedback so far is a resounding YES. We didn’t try to squeeze the formula for other SQL events onto this one, jam some R in there, and announce it as an analytics event. The content was focused on what we do with data, why, and what the business value is. There will be more on this in future posts. In the meantime, however, I have a lot of thank yous!

I also want to say a heartfelt thank you to the volunteers, without whom the event would not have happened.

  • Malcolm Smith
  • Izabela Borzecka
  • Robert French
  • Melissa Coates ( Twitter ) who helped by collating templates from her events to use.
  • Prathy Kamasani ( Twitter ) who is just simply amazing. Her smile lifts me and she has really helped to keep me going with her sunny attitude and unfailing support.
  • Rodney Kidd ( Twitter ) has been a rock and a great listener, as well as a helpful, kind gentleman.

Prathy, Rodney – I cannot thank you enough, and your friendship and support will stay with me forever. Thank you.

I want to thank the following sponsors for putting themselves forward to support me in what I’m doing for the Business Intelligence and emerging Business Analytics community in the UK. Without them, there would be no event. Fact.

SQLSaturday Edinburgh 388 Sponsors

I also want to thank our amazing SQLSaturday speakers. If you’d like to download their slides, you will find them on the site.

The speakers were, in order of appearance:

Jon Woodward ( Twitter / Website )

Iain Elder (Twitter)

David Parker ( Website )

Chris Webb ( Website / Twitter )

Ian MacDonald ( Website )

Adam Vero ( Website )

Bob Duffy ( Website / Twitter )

Carmel Gunn ( Website / Twitter )

Peyman Blumstengel ( Website )

Murali Nagaraj ( Website )

Peter Baddeley ( Website / Twitter )

Tom Sykes ( Twitter )

Niall MacLeod ( Website / Twitter )

Mark Wilcock ( Website / Twitter )

Bob Phillips ( Twitter )

Dave Lawrence ( Website / Twitter )

Tim Jones ( Website / Twitter )

Jean-Pierre Riehl ( Twitter / Website )

Gary Short ( Twitter / Website )

Satya Jayanty ( Twitter / Website )

Ric Howe ( Twitter / Website )

If I have missed anyone, it will be a genuine oversight due to a very tired little Jen missing things out!

I owe people emails so please forgive me until I catch my breath! Please bear with me. I’m doing my best.

Love always,

Jen Stirrup

David McCandless announced as PASS BAC keynote speaker

The Professional Association for SQL Server announced this morning that acclaimed data visualization expert, TED Talk speaker, and Information Is Beautiful author David McCandless will keynote at the 2nd annual PASS Business Analytics Conference in San Jose, CA, May 7-9.

 
David will take the stage on Day 2 for a journey through the world of visualizing facts, data, ideas, and statistics. Microsoft’s Kamal Hathi and Amir Netz will kick off the conference on Day 1 with insights into how Microsoft is making data more accessible through easy-to-use tools including Power BI. Exciting times, people!
 
The PASS BA Conference – featuring 65+ sessions across five topic tracks – brings together business analysts, data scientists, and BI and IT professionals to connect with each other, share experiences, and learn more about the power of data to transform business. You can find all the details at http://passbaconference.com. On a minor note, I’m speaking too during a general session (not the keynote, obviously!) and I hope to see you there!
 
I am REALLY looking forward to hearing David McCandless speak, and it would be wonderful to meet him in person. I hope you can make it to PASS BAC. If you are looking for a discount code, here is mine for you to use to get $150 off: BASF2O