Data Science VC follow up: Intro to R and Statistics for the Rookie Part 1

I’ve decided to hold a five-week introductory statistics and R course. Here, I am sharing the slide deck and the code. The video will go up on our YouTube Data Science Virtual Chapter channel, which is accessible from here.

In the first week, we talked about the relationship between statistics and data visualisation, and how it is extremely useful to have a good grounding in both topics. The slides are here, followed by the code:

The R code can be copied and pasted into your RStudio file:

# Loads sample datasets

# Let’s look at the data
# This command tells you the metadata. What does R see, when it sees ‘iris’?

# What are the attributes?
# This gives us more information.

# Let’s see more of the data

# A data frame has columns which can have different types.
# The column names and types constitute the schema.
# how do we know what is in our data frame?

# Column Names

# how can we see data in one of the columns?
# or we could also use iris[,3] to get the same column data.

# of course, we want to visualise the data.
# Let’s do a simple scatter plot.

# How can we see the first five rows?

# how can we see the Petal Length of the first 5 rows?
iris[1:5, "Petal.Length"]

# This shows us some of the descriptive statistics of each variable


# Let’s have some dataviz fun!
plot(iris$Petal.Length, iris$Petal.Width, main="Anderson's Iris Data")
# You can now see the plot appear in the right hand side frame of RStudio.
# we can make it slightly more interesting
plot(iris$Petal.Length, iris$Petal.Width, pch=23, bg=c("orange", "blue", "green")[unclass(iris$Species)], main="Anderson's Iris Data")

# we can make it even more interesting
pairs(iris[1:4], main = "Anderson's Iris Data", pch = 23, bg = c("orange", "green", "blue")[unclass(iris$Species)])
# pie charts!

# ooh, 3D!

library(scatterplot3d)  # run install.packages("scatterplot3d") first if it is not installed
scatterplot3d(iris$Petal.Width, iris$Sepal.Length, iris$Sepal.Width)

# ooh, even more 3D!
library(rgl)  # run install.packages("rgl") first if it is not installed
plot3d(iris$Petal.Width, iris$Sepal.Length, iris$Sepal.Width)

#Save your work!

savehistory("~/Topic 1 Getting familiar with R A.Rhistory")
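
Several of the comments in the listing above appear without a command underneath them. As a rough sketch only, here is one way those steps could be written using nothing but base R and its bundled datasets package. The specific calls (for example head(iris, 10) and str(iris)), the pie chart of species counts, and the variable names species_counts and point_colours are my assumptions for illustration, not necessarily what was typed in the session.

```r
# Assumption: only base R and the bundled 'datasets' package are used.
# These calls are guesses at the steps the comments describe.
library(datasets)          # loads the sample datasets, including iris

class(iris)                # what does R see? a "data.frame"
attributes(iris)           # names, class and row names
head(iris, 10)             # see more of the data (first 10 rows)
str(iris)                  # column names and types, i.e. the schema
names(iris)                # column names only

iris$Petal.Length          # one column, selected by name
iris[, 3]                  # the same column, selected by position

iris[1:5, ]                # the first five rows
summary(iris)              # descriptive statistics for each variable

# Pie chart of how many flowers belong to each species
species_counts <- table(iris$Species)
pie(species_counts, main = "Iris species")

# unclass() turns the Species factor into integer codes 1, 2, 3,
# which then index into the colour vector: one colour per species.
point_colours <- c("orange", "blue", "green")[unclass(iris$Species)]
```

The last line spells out the trick used in the plot() and pairs() calls: the species factor is converted to integer codes, and those codes pick a background colour for each point.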

PASS BA Speaker focus: What did Steve Jobs have to say about Daniel Fylstra?

PASS BA Conference is delighted to have Daniel Fylstra speaking at our conference. In fact, he was the first speaker we signed up.

Here’s what Steve Jobs had to say:

“There have been two real explosions that have propelled the industry forward. The first one really happened in 1977, and it was the spreadsheet. I remember when Dan Fylstra, who ran the company that marketed the first spreadsheet, walked into my office at Apple one day and pulled out this disk from his vest pocket and said “I have this incredible new program — I call it a Visual Calculator,” and it became VisiCalc. And that’s what really drove — propelled — the Apple II to the success it achieved.”

I think it’s great that we have such a visionary attending our PASS BA Conference. I’m looking forward to meeting Daniel, and picking up his insights from 30 plus years in the industry. Jobs mentions an event that took place when I was only six years old, and look at how the industry has grown since then. We have Excel’s 30th Birthday this year.

We can really say that the team of Dan Bricklin, Bob Frankston and Daniel Fylstra are great innovators who have changed the industry fundamentally. Although they didn’t invent the spreadsheet, it was Daniel Fylstra who suggested it would be a viable product if it could run on an Apple II computer. VisiCalc was born.

Ever used the Microsoft Excel Solver? Well, here is Daniel Fylstra’s paper on the Microsoft Excel Solver, and its design and use. If you’ve ever used it, you should tip your hat to the great team who put it together. Daniel is one of those people. More recently, Fylstra has been working on integrated simulation, data mining and optimization in Excel. Here’s a recent paper from the ACM.

The insight here is that there are great innovators and thinkers we can learn from. They seek better ways of doing things and bring insights from other disciplines. We are inspired by people who see the bigger picture, and I’m personally looking forward to meeting Daniel and thanking him for all he has done for this industry.

Make sure you sign up for the Conference, and take the opportunity to meet a legend in the industry.

PASS BA Conference: Why I’m delighted Mico Yuk is keynoting!

Quick note: If you are looking for the biggest discount possible for the PASS BA Conference, please email me at

I am on fire about our Keynote Speaker, Mico Yuk! Mico is someone I admire very much. She’s a Business Intelligence coach, thought-leader, blogger, data scientist, and author of Data Visualization for Dummies, and we are very fortunate that Mico has agreed to keynote at the PASS Business Analytics Conference in Santa Clara, CA, April 20-22. Mico will be leading the charge of an all-star lineup of top speakers from across the world of business and data analytics. Not only is Mico an expert in her fields, but her experience, passion and belief in the power of Business Analytics and Data Visualisation mean that she is the right person to lead the charge at the PASS Business Analytics Conference. I’m so delighted that she has come on board. I’ve been fortunate enough to speak with Mico on the phone, as well as listen to a lot of her sessions on YouTube, and I cannot wait until I finally meet her in person.

With nearly a decade’s experience of using data visualization, and of teaching thousands globally how to harness its power to create real business results, Mico is the founder of BI Brainz, a leader in enterprise visual solutions, and the BI Dashboard Formula (BIDF) coaching series, focusing on helping enterprises embrace a more visual culture. Named one of the Top 50 Analytics Bloggers, Mico speaks and blogs about all things BI-related.

A world-class lineup of business and data analytics experts from Revolution Analytics, Microsoft, GoDaddy, IBM, Gigaom Research (for which I’m fortunate enough to be an analyst!), and other leading companies – along with a “Dream Team” of Excel masters – will be joining Mico to share the latest best practices for leading organizations from data to insights to action.

The event will also showcase five full-day pre-conference workshops on April 20:

  • 3 Tools an Hour – 24 FREE Tools Every Business Analyst Needs – Lynn Langit
  • An Overview of Predictive Analytics for Practitioners – Dean Abbott
  • Building Awesome, Interactive & Advanced Charts with Excel – Chandoo (Purna Duggirala)
  • Join the Power BI & Excel Revolution – Rob Collie and Avi Singh
  • Stay Calm and Data On: How to Dig into Data Visualization and Dashboards – Miguel Martinez, Sanjay Soni & Marc Reguera

Analytics professionals can get a free sneak peek at the PASS BA Conference’s sessions and speakers with on-demand recordings from last week’s BA Marathon webinar series.

PASS BA Speakers – here come the IBM Experts plus introducing Richard Lee’s articles on Data Privacy Day 2015

Here come the IBM experts! Here’s one of our PASS BA Conference speakers, Richard Lee, sharing his expertise on the topic of data privacy. Richard is a well-respected influencer and strategist in data leadership, and we are delighted to have him join us at the event. Richard Lee is on IBM’s Honours list as a Big Data and Analytics Hero, as well as being one of their Champions.

Richard joins another IBM superstar, James Kobielus, who will be talking about Big Data. James G Kobielus is an industry veteran. He serves as IBM’s big data evangelist, as senior program director for product marketing in big data analytics, and as editor-in-chief of IBM Data Magazine. Kobielus spearheads IBM’s thought leadership activities in Big Data, Hadoop, cognitive computing, enterprise data warehousing, and advanced analytics. Basically, he is an expert in a variety of facets of big data.

Personally I cannot wait for their sessions. I’ve admired their work from afar and when we were selecting speakers, they went straight onto my wishlist. I have been lucky enough to meet Richard in person recently and we had a great conversation. I’m looking forward to meeting James as well; we’ve spoken on the phone. I learned things from conversing with both of them, and I’m sure that attendees will gain inspirational and practical advice from both of them.

Richard’s session is here: and James’ session is here:


I have written two articles for this year’s Data Privacy Day (#DPD15) endeavors. One will be published in the February issue of Information Age and is online now (see link below), and the other is on the IBM Big Data Hub, going live on January 28th as a feature on DPD. I encourage all of you to visit these sites:

Information Age: “Personal Privacy, Internet Commerce and National Security: Can they co-exist?”

IBM Big Data Hub: “Some Thoughts for Privacy Day 2015”

2015 is going to be a critical year in determining the future of Personal Privacy in all respects e.g. Protecting Students’ Privacy, Reining in Data Brokers, Thwarting Cyber Attacks, Curtailing Government Surveillance and Snooping, a Refresh of the EU Data Protection Act, President Obama’s “Privacy Bill of Rights”, etc. I believe that it is essential that everyone take an aggressive role…


Jen’s Diary: Board Meeting, BA Marathon, PASS BA features on TechNetRadio!

Here’s another diary entry for my time on the PASS board. I don’t represent PASS officially here. I know it has been quiet but I am busy on PASS, work, and other community things such as SQLSaturday Edinburgh and TechDays UK. So far this year, I have had one night off! It’s 2.30am here in the UK and I’m churning out a blog, so I can’t be accused of shirking!

If you are attending PASS BA Conference and looking for a discount, please email me at and I have a discount code to give away.

I attended the PASS Board meeting in January, talking mainly about handing over the Virtual Chapters portfolio but the main focus of my time was the PASS BA Conference. We talked about our goals for the Portfolio, our progress so far, and our next steps. I also shared some marketing insights. I’m on the Program Committee, so I talked about our process for going out and getting the big names in the analytics industry – people who live and breathe analytics. It has been a lot of fun, but it has also been extremely hard work to get the right speakers. Our speaker list is here. We still have some names to release, so watch that space.

I’ve also been helping some of the sponsors with thoughts on their content and material. Hortonworks are going to put on a great session, and you’ll get more details of that in the future.

I also took part in the PASS BA Marathon. My session is here. Again, the PASS HQ team did a great job of supporting this event and the sessions are extremely fresh. The speakers did an amazing job!

I was also part of the process of securing our amazing Day 2 keynote speaker, Mico Yuk. I have so many things to look forward to at PASS BA, and Mico’s session is a real highlight for me. Mico has many achievements, including being named one of the top 50 analytics voices on Twitter. We are blessed to have her with us, and I can’t wait to hear her talk and to get to know her in person.

More good news! What did Microsoft’s Jen Moser, Jen Underwood and Miguel Martinez @sqljen @maikelson @idigdata say about the #passbac with TechNetRadio?

I’m so excited to be part of this opportunity to grow the business-focused data community: the data analysts and business analysts who work with data but don’t work in the IT department. It is an exciting growth in our approach, and our fantastic speaker list reflects our vision of the emerging BA community and what they really want.

Miguel is correct; we worked hard to invite big name analytics speakers, and they are coming!

Listen here!

BAWeekender Roundup of PASS BA speaker activity and Quote of the week!

For the weekend, I thought you might enjoy some BA-oriented weekend reading from our fantastic roundup of speakers at the PASS BA Conference. Here goes! This week, we are focusing on Power BI articles in particular.

Blue Hill’s James Haight blogged on why the real promise of comes from applications. But I’m particularly interested in James’ assessment of Enterprise BI and how Power BI fits into the enterprise. This is particularly interesting because James covers all the bases – functionality, licensing, pricing and so on. Recommended read!

IBM’s James Kobielus, an all-round Big Data influencer, blogged on “The dev@ was in the details, and in my delivery”. James is a great speaker, and his PASS BA webinar for the BA Marathon was excellent and very well received. You can see it here.

Our keynote speaker, Mico Yuk of BIBrainz, tweeted out this link from Boris Evelson and his insightful discussion on the Waterfall Methodology – Build An Organization:

It’s fair to say that the Internet went nuts over our choice of keynote speaker. We are truly delighted to have Mico along and the warm wishes are still rolling in! Here is the Chicago Times – BI Brainz Founder Mico Yuk Leading All-Star Speaker Lineup at PASS Business Analytics Conference.

Richard Lee, part of our Communicate and Lead track, is part of the #IBMDataHero hall of fame: Stories of challenges, wins & outcomes. You can see Richard on the IBM Data Hero list here.

We end with our Quote of the Week! This comes from Bill Jelen, over Twitter: “If you say you are starting with clean data, you are 100% lying” – overheard at Bill’s seminar. Amen to that!

Have a great weekend, folks!