PASS BA Visual Data Storytelling precon session with Mico Yuk

CEO of BI Brainz | Author | Global Keynote Speaker | BI Influencer | Trainer | Blogger | BI Executive Advisor *@micoyuk*

I am super excited that Mico Yuk is joining us again at PASS BA Conference!

Mico Yuk is well known in the Business Intelligence ecosystem as a community leader, BI influencer, controversial blogger, and the founder of the highly rated BI Coaching Series, the BI Dashboard Formula (aka BIDF). Headquartered in Atlanta, GA, her team of senior coaches and consultants works with executives to transform their BI teams to meet the challenges in the new era of BI through a series of coaching, training, and consulting services. Mico’s most recent accomplishments include being named one of the Top 50 Analytics Bloggers by SAP and being rated a #1 global keynote speaker at a number of global BI conferences.

A computer engineer by degree, she has been designing and implementing enterprise dashboards for major corporate clients since 2006 and is considered to be one of the top data visualization experts in the world.

First as a consultant and now through her company, she has helped to implement executive dashboards and reporting using the SAP BusinessObjects platform for customers such as Allstate, Pfizer, Aviva Canada, McKesson, Ryder Logistics, Digicel Jamaica, QatarGas, St. Jude Medical, Walgreens, Chiquita, LG, the US Air Force, Medtronic, SAP Global Marketing, Amtrak, Fresh Direct, Bank of America, and Nestle, to name a few.

To find out more about Mico, please visit http://micoyuk.com.

Visual Storytelling – How to Tell a “Compelling” Data Story That Matters to Your Users

This business-oriented, hands-on session will provide the foundation necessary to make your data visualizations more intelligent, actionable, and useful! Whether you are a beginner or a data visualization veteran, this session will guide you on telling more compelling stories with your data, from storyboarding fundamentals to more advanced techniques such as how to add smart context and visual cues. Attendees will learn:

• how to create a simple four-part visual storyboard on paper in minutes, not weeks
• why visual storytelling is more effective than traditional reporting
• the one element 98% of data visualizations are missing and how it is negatively affecting user adoption

Format: Half-Day Classroom (Afternoon)

Register here

Jen’s PASS Diary: So what happened to SQLSaturday Edinburgh?

As always, I don’t officially represent PASS on this post (or any other).

I have been working on a lot of things, including attending a PASS Board Meeting, and doing a lot on the Business Analytics Portfolio. However, the focus of attention here is my failed SQLSaturday Edinburgh event. So what happened? The outcome is as follows:

  • SQLSaturday Edinburgh is cancelled
  • Edinburgh Power BI is cancelled
  • Three precons out of four have been cancelled.
  • I lost a few thousand pounds
  • My own precon is still going on; see previous point. I am trying to stem some of the losses

This has meant:

  • Cancelling the University of Edinburgh venue
  • Cancelling precon venue rooms
  • Many disappointed attendees – 51 people in all
  • Many disappointed potential speakers, who submitted 98 submissions from every continent in the world.

Why did Edinburgh fail?

Note that I’ve successfully gained sponsorships for the three past SQLSaturday Edinburgh events and for SQLRelay, I’ve sent companies to SQLBits who eventually sponsored, and I’ve helped with sponsorships for PASS BA. Despite this wealth of experience, the stars were not aligned for Edinburgh.

I didn’t get sponsorship, apart from the fantastic CozyRoc. The CozyRoc team have been wonderfully supportive in my hour of need, and they have said that this won’t stop them supporting future events in Edinburgh, or me individually. I am hugely grateful for their support and I’d like to publicly thank them here.

Why didn’t I get sponsorship? Let’s look at the landscape of events:

There are also a bunch of other European SQLSaturdays which are in the lead up to my event. In one rejection email, one would-be sponsor pointed out to me that there are four SQLSaturdays in my area before my event, and explained that they couldn’t support Edinburgh because of the concentration of events around that time.

  • Kiev – event being held on 20th May
  • Plovdiv – event being held on 28th May
  • Krasnodar –  4th June
  • Rheinland – 11th June

If you look at the list, you can see that there are 7 SQL events between 2nd May and 11th June. This doesn’t include my failed Edinburgh event, which would take the total to 8 events in 47 days. That’s a lot, by anyone’s standards.

Further, this does not include the other events which take place after 11th June – Dublin, on 18th June, and Paris, on 25th June. This means that the total goes up to ten SQL events in the space of 9 weeks, three of which are major events – SQLBits, SQLNexus and SQLDay Poland.

I have been criticised for a number of things, and I will lay them out here:

– not responding to sponsor feedback. This simply isn’t true. I negotiated and offered to cut prices whilst giving more benefits. I bent and shifted and adapted to get cash flow in – hell, any cash flow. However, even offering sponsorship at rock bottom prices wasn’t enough. The answer was no, no and no again, and the reasons given were the ones that I couldn’t do anything about – too many SQL events concentrated geographically and temporally. This wasn’t in my control.

– I am on the PASS Board, and apparently it’s a mark of my failure as an individual that I couldn’t organise a SQLSaturday. This isn’t true. I’ve organized a ton of events, and I do a lot for the community. Note the following points:

It does show integrity that I didn’t ‘pull strings’ to make being a Board member an advantage in some way, and I have clearly lost out in many ways, including financially.

I have seen, felt, and paid for, the ‘hard edge’ of being part of the SQL Server / Data Platform community. I have war wounds. This makes me a better Board member, I think, because I can speak for the ‘little guy’: I am struggling and suffering due to this issue, and I think it can help me to empathise. I think I won votes partly because I ran successful events. However, I think that the fact I have been there when an event fails means that I can work to make sure that it doesn’t happen again.

– I’ve been told that I shouldn’t have organised my event temporally and geographically near the behemoth SQLBits event. Liverpool is about 200 miles away from Edinburgh, give or take. The events were six weeks apart. I did not know the dates of SQLBits until the last moment, and I acted without being given the data. I had been given reassurances that the SQLBits dates wouldn’t impact PASS BA; however, it then turned out that they are in the same week. Be very clear: if I had known the SQLBits dates, I would have postponed Edinburgh. Full stop. I wasn’t told, so I didn’t know. The reasons for the communication failure are still a mystery to me, but I do wish that someone had said something. They would have helped me a lot in many ways by speaking to me or emailing me to forewarn me. I would have postponed Edinburgh so that it took place later in the year, and everyone would have been happy.

To summarise, the main reason for the failure of Edinburgh was the lack of sponsorship. One message was given to me repeatedly: there are simply too many SQL / Data events, the growth of events is outpacing sponsors’ ability to keep up, and sponsors are having to make difficult choices.

I would hate it if another volunteer went through this again in the future, and I’d like to think strategically to prevent it from happening to anyone else. So what’s the call to action?

  • If you consider yourself a European Data Platform technical leader, please get in touch. I am working with Microsoft to set up a Yammer group that I will invite you to. We can swap ideas, event dates, and basically facilitate communication. My email is jen dot stirrup at datarelish dot com. I am findable! Find me.
  • I’d like to set up a European Twitter chat on the first Monday of every month. This will take place at 5pm UK time, 6pm CET. The hashtag will be #EUSQL. Let’s talk about anything and everything European and SQL. I will be ‘chair’ but I will need volunteer ‘chairs’ too. If this is you, get in touch.
  • I will set up a European LinkedIn Group for European SQL leaders and participants. I will release details in due course. I’d like help with this; if this is you, please email me.

I want to thank the following people for offering to help, in no order:

Brent Ozar

Andrew Brust

Mike Hillwig

Neil Hambly

Stephanie Locke

Jonathan and Annette Allen

Satya Janyanty

Niko Neugebauer

PASS team – Sonya, Vicki, Angie, Teresa

PASS Board – everyone 🙂  Adam, Tom, Denise, Wendy, Tim, Craig, Argenis, Ryan, Allen

my family – Andrew S (my brother)

Microsoft – particularly Jen M and Jonathan W

Let’s work together so that nobody else is in this same situation. Please help me – to help YOU – avoid this situation.

[ Update: original comment was removed. I believed that someone was having a go at me. I will just leave that here.]

PASS Summit Notes for my AzureML, R and Power BI Presentation

I’m going to have fun with my AzureML session today at PASS Summit! More will follow on this post later; I am racing off to the keynote so I don’t have long 🙂

I heard some folks weren’t sure whether to attend my session or Chris Webb’s session. I’m honestly flattered but I’m not in the same league as Chris! I’ve posted my notes here so that folks can go off and attend Chris’ session, if they are stuck between the two.

Here is the order of things:

  • Slide Deck
  • How do you choose a machine learning algorithm?
  • How do you carry out an AzureML project?
  • AzureML Experiment 
  • R Code

So, the slide deck is here:

  • AzureML Experiment 

You can see this experiment in the AzureML Gallery. You may have to sign up for a Windows Live account to get a free AzureML studio account, and I recommend that you do.

  • How do you choose a machine learning algorithm?

Kudos to Microsoft – this is their cheatsheet and I recommend that you look at the original page.

Here is some more information on the topic from Microsoft, and I recommend that you follow it.

How do you carry out an AzureML project?

Try the CRISP-DM Framework for a start

See The Modeling Agency for the original source: https://the-modeling-agency.com/crisp-dm.pdf

“CRISP-DM Process Diagram” by Kenneth Jensen. Own work. Licensed under CC BY-SA 3.0 via Commons.

R Code

Here’s some sample R code. I know it is simple, and there are better ways of doing this. However, remember that this is for instructional purposes in front of +/- 500 people, so I want to be sure everyone has a grounding before we talk about more complicated things.

You may have to install the libraries first, if you haven’t done so.

# Load the packages used in this script
library(data.table)
library(ggplot2)
library(xtable)
library(rpart)

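# adult.data is assumed to be loaded already as a data frame with 15 columns,
# for example by calling read.csv() on the adult census income file (which has no header row)
summary(adult.data)
class(adult.data)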

# Let's rename the columns
names(adult.data) <- c("age", "workclass", "fnlwgt", "education", "education.num",
                       "marital.status", "occupation", "relationship", "race", "sex",
                       "capital.gain", "capital.loss", "hours.per.week", "country",
                       "earning_level")

# Let’s see if the columns renamed well
# What is the maximum age of the adult?
# How much data is missing?
summary(adult.data)

# How many rows do we have?
# 32561 rows, 15 columns
dim(adult.data)

# There are lots of different ways to deal with missing data
# That would be a session in itself!
# For demo purposes, we are simply going to replace question marks, and remove rows which have anything missing.

# gsub() with an NA replacement turns every value that contains a question mark into NA
adult.data$workclass <- as.factor(gsub("[?]", NA, adult.data$workclass))
adult.data$education <- as.factor(gsub("[?]", NA, adult.data$education))
adult.data$marital.status <- as.factor(gsub("[?]", NA, adult.data$marital.status))
adult.data$occupation <- as.factor(gsub("[?]", NA, adult.data$occupation))
adult.data$relationship <- as.factor(gsub("[?]", NA, adult.data$relationship))
adult.data$race <- as.factor(gsub("[?]", NA, adult.data$race))
adult.data$sex <- as.factor(gsub("[?]", NA, adult.data$sex))
adult.data$country <- as.factor(gsub("[?]", NA, adult.data$country))

# Catch any remaining question marks, then drop every row with a missing value
is.na(adult.data) = adult.data=="?"
is.na(adult.data) = adult.data==" ?"
adult.tidydata = na.omit(adult.data)

# Let’s check out our new data set, called adult.tidydata
summary(adult.tidydata)

# How many rows do we have now?
# Fewer than the original 32561, because rows with missing values were dropped; still 15 columns
dim(adult.tidydata)

# Let's visualise the data
boxplot(adult.tidydata$education.num~adult.tidydata$earning_level,outline=FALSE,xlab="Income Level",ylab="Education Level",main="Income Vs Education")

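# Proportion of each earning level within each occupation
prop.table(table(adult.tidydata$earning_level,adult.tidydata$occupation),2)

# Bar chart of every factor column, split by earning level.
# The parentheses matter: in R, 1:n-2 means (1:n)-2, not 1:(n-2).
for (i in 1:(ncol(adult.tidydata)-2)) {
  if (is.factor(adult.tidydata[,i])){
    pl = ggplot(adult.tidydata,aes_string(colnames(adult.tidydata)[i],fill="earning_level"))+geom_bar(position="dodge") + theme(axis.text.x=element_text(angle=75))
    print(pl)
  }
}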

evalq({
  plot <- ggplot(data = adult.tidydata, aes(x = hours.per.week, y = education.num,
                                            colour = hours.per.week))
  plot <- plot + geom_point(alpha = 1/10)
  plot <- plot + ggtitle("Hours per Week vs Level of Education")
  plot <- plot + stat_smooth(method = "lm", se = FALSE, colour = "red", size = 1)
  # Axis labels follow the aesthetics: hours on x, education on y
  plot <- plot + xlab("Hours per Week worked") + ylab("Education Level")
  plot <- plot + theme(legend.position = "none")
  plot
})
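
If you want to convince yourself that the missing-data step does what you expect, you can try it on a tiny data frame first. This is only a sketch: the toy values and the names toy, toy_tidy and missing_per_column are made up for illustration and are not part of the adult data set. It also assumes you are happy to use trimws() so that "?" and " ?" are caught in one pass.

```r
# Toy data frame (hypothetical values, not taken from the adult data set)
toy <- data.frame(
  age       = c(39, 50, 38, 53),
  workclass = c("Private", " ?", "Private", "Local-gov"),
  country   = c("Cuba", "England", "?", "India"),
  stringsAsFactors = FALSE
)

# Build a logical matrix that is TRUE wherever a cell is a question mark,
# with or without surrounding spaces, and mark those cells as missing
is_question_mark <- sapply(toy, function(col) trimws(as.character(col)) == "?")
is.na(toy) <- is_question_mark

# Count the missing values per column, then drop the incomplete rows
missing_per_column <- colSums(is.na(toy))
toy_tidy <- na.omit(toy)

print(missing_per_column)
print(dim(toy_tidy))
```

Two of the four toy rows contain a question mark, so two rows are left after na.omit(). Running the same colSums(is.na(...)) check on adult.data before na.omit() tells you how much data you are about to throw away.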

That’s all for now! More later.

Jen xx

Jen’s Diary: What does Microsoft’s recent acquisition of Revolution Analytics mean for PASS?

Caveat: This blog does not represent the views of PASS or the PASS Board. These opinions are solely mine.

The world of data and analytics keeps heating up. Tableau, for example, keeps growing and winning. In fact, Tableau continues to grow total and licence revenue 75% year over year, with its total revenue growing to $142.9 million in the fourth quarter of 2014. There’s a huge shift in the market towards analytics, and it shows in the numbers. Let’s take a look at some of the interesting things Microsoft have done recently, and see how they relate to PASS:

  • Acquired Revolution Analytics, an R-language-focused advanced analytics firm, which will bring customers tools for prediction and big-data analytics.
  • Acquired Datazen, a provider of data visualization and key performance indicator data on Windows, iOS and Android devices. This is great from the cross-platform perspective, and we’ll look at this in a later blog. For now, let’s discuss Revolution and Microsoft.

Why it was good for Microsoft to acquire Revolution Analytics

The acquisition shows that Microsoft is bolstering its portfolio of advanced analytics tools. R is becoming increasingly common as a skill set, and businesses are more comfortable about using open source technology such as R. It is also accessible software, and a great tool for doing analytics. I’m hoping that this will help organisations to recognise and conduct advanced analytics, and it will improve the analytics capability in HDInsight.

Microsoft has got pockets of advanced analytics capabilities built into Microsoft SQL Server, and in particular, SQL Server Analysis Services, and also in the SQL Server Parallel Data Warehouse (PDW). Microsoft also has the Azure Machine Learning Service (Azure ML) which uses R in MLStudio. However, it does not have an advanced analytics studio, and the approach can come across as piecemeal for those who are new to it. The acquisition of Revolution Analytics will give Microsoft on-premises tools for data scientists, data miners, and analysts, and cloud and big data analytics for the same crowd.

Here’s what I’d like Microsoft to do with R:

  • Please give some love to SSRS by infusing it with R. There is a CodePlex download that will help you to produce R visualisations in SSRS. I’d like to see more and easier integration, which doesn’t require a lot of hacking about.
  • Power Query has limited statistical capability at the moment. It could be expanded to include R. I am not keen for Microsoft to develop yet another programming language and R could be a part of the Power Query story.
  • Self-service analytics. We’ve all seen the self-service business intelligence communications. What about helping people to self-serve analytics as well, once they’ve cracked self-service BI? I’d like to see R made easier to use for everyone. I sense that will be a long way off, but it is an opportunity.
  • Please change the R facility in MLStudio. It’s better to use RStudio to create your R script, then upload it.

What issues do I see in the Revolution Analytics acquisition?

Microsoft is a huge organisation. Where will it sit within the organisation? Any acquisition involves a change management process. Change management is always hard. R touches different parts of the technology stack. This could be further impacted by the open source model that R has been developed under. Fortunately Revolution seem to have thought of some of these issues already: how does it scale, for example? This acquisition will need to be carefully envisioned, communicated and implemented, and I really do wish them every success with it.

What does this mean for PASS?

I hold the PASS Business Analytics Portfolio, and our PASS Business Analytics Conference is being held next week. Please use code BFFJS to get the conference for a discount rate, if you are interested in going.

I think the PASS strategy of becoming more data platform focused is the right one. PASS exist to provide technical community education to data professionals, and I think PASS are well placed to move on the analytics journey that we see in the industry. I already held a series on R for the Data Science Virtual Chapter, and I’m confident you’ll see more material on this and related topics. There are sessions on R at the PASS BA Conference as well. The addition of Revolution Analytics and Datazen is great for Microsoft, and it means that the need for learning in these areas is more urgent, not less. That does not mean that I think that everyone should learn analytics. I don’t. However, I do think PASS can help those who are part of the journey, if they want (or need) to be.

I’m personally glad PASS are doing the PASS Business Analytics Conference because I believe it is a step in the right direction, in the analytics journey we see for the people who want to learn analytics, the businesses who want to use it, and the burgeoning technology. I agree with Brent Ozar ( b / t ) in that I don’t think that the role of the DBA is going away. I do think that, for small / medium businesses, some folks might find that they become the ‘data’ person rather than the DBA being a skill on its own. I envisage that PASS will continue to serve the DBA-specialist-guru as well as the BI-to-analytics people, as well as those who become the ‘one-stop-shop’ for everything data in their small organisation (DBA / BA / Analytics), as well as the DBA-and-Cloud person. It’s about giving people opportunity to learn what they want and need to learn, in order to keep up with the rate of change we see in the industry.

Please feel free to comment below.

Your friend,

Jen Stirrup

x

What’s so unique about PASS Business Analytics? The Hands On Labs built in as part of the conference, that’s what!

PASS Business Analytics are holding scheduled Hands on Labs as part of the conference.

This means you can book a lab, and get real life, hands-on experience.

That’s not all – you get a Hands On Lab which is held by a real expert – not just someone who reads off a script. We have labs with the following people:

  • Dean Abbott
  • Chandoo
  • Dan Fylstra
  • Ken Puls
  • Scott Shaw

What are you waiting for? Register here now, using the following code, to get the conference for $1295

See you there!

Jen’s Diary – Time to Answer, or Time to Question? Plus some EMEA thoughts

Hello again,

As always, I do not speak officially for PASS. This is my diary, and a bit of a brain dump.

I’m as busy as ever with PASS Business Analytics Conference, and things are going well. I’m helping to socialize the information about BAC, and I’m dealing informally with sponsors and community members and speakers, in the run up to the event. I’m starting to think about how we continue the BA conversation post-BAC, and there will be more of this discussion in the future. As well as PASS BAC, I am the lead organizer of SQLSaturday Edinburgh, Business Intelligence edition. If you are interested, take a look at our schedule and you will start to see the difference between the BI edition and the normal, full-fat SQLSaturday. There is a focus on data, and in the case of my SQLSaturday Edinburgh BI edition, we are looking at data across traditional Product Groups. Therefore, we have C#, CRM, Access, Visio and SQL Server MVPs speaking, as well as well-known community SharePoint and Business Analytics speakers.

The underlying focus is on data and analytics, and I know that other SQLSaturday organizers are watching the Edinburgh event with interest to see if this approach resonates with the community. This focus on data and analytics is much more than simply taking SQL Server and Azure topics and jamming some R in there as well; it is perfectly possible to talk about R and not mention statistics or analytics once – R is a very wide technology. Business Analytics, for me, is an attitude of taking the business into perspective with a focus on business value, business insights, and actionable takeaways. The Edinburgh schedule will become clearer on this topic in due course when we release our dedicated Analytics track.

Here is an example: are you interested in time-to-answer, or time-to-question? In Business Intelligence, you are interested in time-to-answer. You write your report, you get your answer, and people want the answer quickly. Business Analytics is about time-to-question, or, more specifically, time from the original question until the time you receive the next business question. You may have an answer, but the business users have another question; so in this case, you are all about shortening the time from the question, until you receive the next question. The questions will be focused on ‘what happened’ but they will also be focused on ‘why’ and ‘what do we do next’? The time-to-question metric will also take into account the fact that you are making predictions on your data, which feeds into the next question that the business will ask. Notice that I haven’t mentioned technology here; technology-focused sessions aren’t always Business Analytics presentations because they will be focused on technology ‘time to answer’ topics rather than business focused ‘time to question’ topics. So, R != Business Analytics, for example – it is about the business question you are asking, not the technology you are using.

If you are interested in attending PASS Business Analytics Conference, I have the biggest discounts 🙂 so please don’t hesitate to get in touch.

Love and friendship,

Jen Stirrup

PASS BA Speaker focus: What did Steve Jobs have to say about Daniel Fylstra?

PASS BA Conference is delighted to have Daniel Fylstra speaking at our conference. In fact, he was the first speaker we signed up.

Here’s what Steve Jobs had to say:

“There have been two real explosions that have propelled the industry forward. The first one really happened in 1977, and it was the spreadsheet. I remember when Dan Fylstra, who ran the company that marketed the first spreadsheet, walked into my office at Apple one day and pulled out this disk from his vest pocket and said “I have this incredible new program — I call it a Visual Calculator,” and it became VisiCalc. And that’s what really drove — propelled — the Apple II to the success it achieved.”

I think it’s great that we have such a visionary attending our PASS BA Conference. I’m looking forward to meeting Daniel, and picking up his insights from 30 plus years in the industry. Jobs mentions an event that took place when I was only six years old, and look at how the industry has grown since then. We have Excel’s 30th Birthday this year.

We can really say that the team of Dan Bricklin, Bob Frankston and Daniel Fylstra are great innovators who have changed the industry fundamentally. Although they didn’t invent the spreadsheet, it was Daniel Fylstra who suggested it would be a viable product if it could run on an Apple II computer. VisiCalc was born.

Ever used the Microsoft Excel Solver? Well, here is Daniel Fylstra’s paper on the Microsoft Excel Solver, and its design and use. If you’ve ever used it, you should tip your hat to the great team who put it together. Daniel is one of those people. More recently, Fylstra has been working on integrated simulation, data mining and optimization in Excel. Here’s a recent paper from the ACM.

The insight here is that there are great innovators and thinkers that we can learn from. They seek better ways of doing things, and they bring insights from other disciplines. We are inspired by people who see the bigger picture, and I’m personally looking forward to meeting Daniel and thanking him for all he has done for this industry.

Make sure you sign up for the Conference, and take the opportunity to meet a legend in the industry.