Data Science VC follow up: Intro to R and Statistics for the Rookie Part 1

I’ve decided to hold a five-week introductory statistics and R course. Here, I am sharing the slide deck and the code. The video will go up on our YouTube Data Science Virtual Chapter channel, which is accessible from here.

In the first week, we talked about the relationship between statistics and data visualisation, and how it is extremely useful to have a good grounding in both topics. The slides are here, followed by the code:

The R code can be copied and pasted into your RStudio file:

# Lists the sample datasets that ship with R
data()

# Let’s look at the data
# This command tells you the metadata. What does R see, when it sees ‘iris’?
str(iris)

# What are the attributes?
# This gives us more information.
attributes(iris)

# Let’s see more of the data
iris

class(iris)
# A data frame has columns which can have different types.
# The column names and types constitute the schema.
# how do we know what is in our data frame?

# Column Names
colnames(iris)

# how can we see data in one of the columns?
iris$Petal.Length
# or we could also use iris[,3] to get the same column data.

iris[,3]
# Before we visualise the data, let’s look at a few rows.

# How can we see the first five rows?
iris[1:5,]

# how can we see the Petal Length of the first 5 rows?
iris[1:5, "Petal.Length"]
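As an aside, base R has a few shortcuts for the row and column lookups above. This is a minimal sketch that assumes only the built-in iris data frame; the variable names (first_five, petals, long_petals) are illustrative and not part of the session code.

```r
# head() returns the first n rows, the same rows as iris[1:5, ]
first_five <- head(iris, 5)

# Select several columns by name at once
petals <- iris[1:5, c("Petal.Length", "Petal.Width")]

# Keep only the rows that satisfy a condition
long_petals <- iris[iris$Petal.Length > 6, ]
nrow(long_petals)
```

Selecting columns by name rather than by position keeps the code readable if the column order ever changes.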

# This shows us some of the descriptive statistics of each variable
summary(iris)

table(iris$Species)
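summary() describes each column across all 150 flowers, and table() counts the flowers per species. One way to combine the two ideas is to compute a statistic per species. The sketch below uses the base R functions aggregate() and tapply(); the variable names are illustrative.

```r
# Mean of every numeric measurement, grouped by species.
# The formula `. ~ Species` means "all other columns, split by Species".
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(species_means)

# tapply() does the same for a single column and returns a named vector.
petal_length_means <- tapply(iris$Petal.Length, iris$Species, mean)
print(petal_length_means)
```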

# Let’s have some dataviz fun!
plot(iris$Petal.Length, iris$Petal.Width, main="Anderson's Iris Data")
# You can now see the plot appear in the right hand side frame of RStudio.
# we can make it slightly more interesting
plot(iris$Petal.Length, iris$Petal.Width, pch=23, bg=c("orange", "blue", "green")[unclass(iris$Species)], main="Anderson's Iris Data")
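The coloured plot does not say which colour belongs to which species. A possible variation, sketched here with base graphics only, uses a named colour vector and adds a legend. The name species_colours and the axis labels are my own choices, not part of the session code. Note that bg only takes effect for the fillable plotting symbols, pch 21 to 25.

```r
# Illustrative lookup table: one fill colour per species
species_colours <- c(setosa = "orange", versicolor = "blue", virginica = "green")

# Index the named vector by species name to get one colour per row
point_colours <- species_colours[as.character(iris$Species)]

plot(iris$Petal.Length, iris$Petal.Width,
     pch = 23, bg = point_colours,
     xlab = "Petal length (cm)", ylab = "Petal width (cm)",
     main = "Anderson's Iris Data")

# pt.bg sets the fill colour of the legend symbols
legend("topleft", legend = names(species_colours),
       pch = 23, pt.bg = species_colours, title = "Species")
```

Looking colours up by name, rather than by the factor's internal integer codes, keeps the mapping correct even if the factor levels are reordered.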

# we can make it even more interesting
pairs(iris[1:4], main = "Anderson's Iris Data", pch = 23, bg = c("orange", "green", "blue")[unclass(iris$Species)])
# pie charts!
pie(table(iris$Species))

# ooh, 3D!
# scatterplot3d is not part of base R; run install.packages("scatterplot3d") once beforehand.
library(scatterplot3d)
scatterplot3d(iris$Petal.Width, iris$Sepal.Length, iris$Sepal.Width)

# ooh, even more 3D!
# rgl is not part of base R either; run install.packages("rgl") once beforehand.
library(rgl)
plot3d(iris$Petal.Width, iris$Sepal.Length, iris$Sepal.Width)

# Save your work!

savehistory("~/Topic 1 Getting familiar with R A.Rhistory")
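The two library() calls in the script fail if the packages have not been installed yet. The sketch below is one way to check beforehand, and to record which version of R is in use. It relies only on functions from base R and utils; the variable names are illustrative, and the install line is left commented out because it needs an internet connection.

```r
# Record the R version used for the session
print(R.version.string)

# Packages the script above loads with library()
required_packages <- c("scatterplot3d", "rgl")

# installed.packages() lists everything installed in the current libraries
is_installed <- required_packages %in% rownames(installed.packages())
missing_packages <- required_packages[!is_installed]

if (length(missing_packages) > 0) {
  message("Not installed yet: ", paste(missing_packages, collapse = ", "))
  # Uncomment to download the missing packages from CRAN:
  # install.packages(missing_packages)
}
```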

8 thoughts on “Data Science VC follow up: Intro to R and Statistics for the Rookie Part 1”

  1. Hi Jen, Thanks for sharing the code. It might be worth documenting the version of R you were using for this demo. People who did not attend this session would not know that you have to install a package before referencing it with the library command: install.packages("yourpackage"). By the way, the “bg=” argument never worked for me in the script; it might have to do with my version of R (3.1.2). Do you have any idea?

    • Yibo,
      Thank you for your comment. The blog is an accompaniment to the video series. It is not a replacement. I put a lot of time and effort into creating the video, and I hope that people will appreciate the effort.
      Regards,
      Jen

  4. Hi Jen,

    I enjoyed watching your Introduction to R and Statistics for the Rookie – five week series part 1 of 5 on your YouTube site.

    Are parts 2 and 3 available on YouTube as well?

    I could not locate either of those 2 sessions.

    Thanks,

    Vince Neville

  5. Hi Jen, thanks for posting your notes here on the site. I’m looking forward to learning more and sharing what I find. Where can parts 3 and 4 be found?
    Thank you.
