Issues and Resolutions in starting R and R Server on SQL Server 2017

I am helping some people learn Data Science and we are having a ton of fun! There are lots of things to remember. So I am noting things here, in case I forget!

We noted the following error message, when we saw that R was not running on our SQL Server 2017 install:

‘sp_execute_external_script’ is disabled on this instance of SQL Server. Use sp_configure ‘external scripts enabled’ to enable it.

Here is the longer version:

Msg 39023, Level 16, State 1, Procedure sp_execute_external_script, Line 1 [Batch Start Line 3]

‘sp_execute_external_script’ is disabled on this instance of SQL Server. Use sp_configure ‘external scripts enabled’ to enable it.

Msg 11536, Level 16, State 1, Line 4

EXECUTE statement failed because its WITH RESULT SETS clause specified 1 result set(s), but the statement only sent 0 result set(s) at run time.

Grr! What’s happened here? We had installed R as part of the SQL installation, and we had run the command to enable it, too. In case you are wondering, here is the command:

EXEC sp_configure ‘external scripts enabled’, 1
RECONFIGURE WITH OVERRIDE

So what happens next? Initial things to check:

Is R Server installed properly along with SQL Server? Here are some guidelines to help you.

Is the Launchpad service running? One of my colleagues and friends Tomaž Kaštrun  wrote a nice article on SQL Server Central. If not, this could be due to a lack of permissions in being able to start the service.

Did you restart the MSSQL Service on the machine? This will also restart the Launchpad service as well. If you didn’t restart the service, you will need to do that so it can pick up the results.

Once R is running properly, you can check it by using the following command, borrowed from the official installation guide over at Microsoft:

EXEC sp_execute_external_script @language =N’R’,
@script=N’
OutputDataSet <- InputDataSet;
‘,
@input_data_1 =N’SELECT 1 AS RIsWorkingFine’
WITH RESULT SETS (([RIsWorkingFine] int not null));
GO

If that returns a 1, then you are all set! To prove it works properly, you can retrieve the world famous Iris dataset using the following command, borrowed from the official documentation on sp_execute_external_script:

DROP PROC IF EXISTS get_iris_dataset;

go

CREATE PROC get_iris_dataset

AS BEGIN

EXEC sp_execute_external_script @language = N‘R’ , @script = N‘iris_data <- iris;’ , @input_data_1 = N , @output_data_1_name = N‘iris_data’ WITH RESULT SETS ((“Sepal.Length” float not null, “Sepal.Width” float not null, “Petal.Length” float not null, “Petal.Width” float not null, “Species” varchar(100)));

END;

GO

Once you’ve created the command, execute the following SQL command and you will see the iris dataset:

exec get_iris_dataset

You’re all set! Enjoy R!

Learning pathway for SQL Server 2016 and R Part 2: Divided by a Common Language

http://whatculture.com/film/the-office-uk-vs-the-office-us.php

Britain has “really everything in common with America nowadays, except, of course, language.” Said Oscar Wilde, in the Centerville Ghost (1887) whilst George Bernard Shaw is quoted as saying that the “The United States and Great Britain are two countries separated by a common language.”

There are similarities and differences between SQL and R, which might be confusing. However, I think it can be illuminating to understand these similarities and differences since it tells you something about each language. I got this idea from one of the attendees at PASS Summit 2015 and my kudos and thanks go to her. I’m sorry I didn’t get  her name, but if you see this you will know who you are, so please feel free to leave a comment so that I can give you a proper shout out.

If you are looking for an intro to R from the Excel perspective, see this brilliant blog here. Here’s a list onto get us started. If you can think of any more, please give me a shout and I will update it. It’s just an overview and it’s to help the novice get started on a path of self-guided research into both of these fascinating topics.

R SQL / BI background
A Factor has special properties; it can represent a categorical variable, which are used in linear regression, ANOVA etc. It can also be used for grouping. A Dimension is a way of describing categorical variables. We see this in the Microsoft Business Intelligence stack.
in R, dim means that we can give a chunk of data dimensions, or, in other words, give it a size. You could use dim to turn a list into a matrix, for example Following Kimball methodology, we tend to prefix tables as dim if they are dimension tables. Here, we mean ‘dimensions’ in the Kimball sense, where a ‘dimension’ is a way of describing data. If you take a report title, such as Sales by geography, then ‘geography’ would be your dimension.
R memory management can be confusing. Read Matthew Keller’s excellent post here. If you use R to look at large data sets, you’ll need to know
– how much memory an object is taking;
– 32-bit R vs 64-bit R;
– packages designed to store objects on disk, not RAM;
– gc() for memory garbage collection
– reduce memory fragmentation.
SQL Server 2016 CTP3 brings native In-database support for the open source R language. You can call both R, RevoScaleR functions and scripts directly from within a SQL query. This circumvents the R memory issue because SQL Server benefits the user, by introducing multi-threaded and multi-core in-DB computations
Data frame is a way of storing data in tables. It is a tightly coupled collections of variables arranged in rows and columns. It is a fundamental data structure in R. In SQL SSRS, we would call this a data set. In T-SQL, it’s just a table. The data is formatted into rows and columns, with mixed data types.
All columns in a matrix must have the same mode(numeric, character, and so on) and the same length. A matrix in SSRS is a way of displaying, grouping and summarizing data. It acts like a pivot table in Excel.
 <tablename>$<columnname> is one way you can call a table with specific reference to a column name.  <tablename>.<columname> is how we do it in SQL, or you could just call the column name on its own.
To print something, type in the variable name at the command prompt. Note, you can only print items one at a time, so use cat to combine multiple items to print out. Alternatively, use the print function. One magic feature of R is that it knows magically how to format any R value for printing e.g.

print(matrix(c(1,2,3,5),2,2))

PRINT returns a user-defined message to the client. See the BOL entry here. https://msdn.microsoft.com/en-us/library/ms176047.aspx

CONCAT returns a string that is the result of concatenating two or more string values. https://msdn.microsoft.com/en-GB/library/hh231515.aspx

Variables allow you to store data temporarily during the execution of code. If you define it at the command prompt, the variable is contained in your workspace. It is held in memory, but it can be saved to disk. In R, variables are dynamically typed so you can chop and change the type as you see fit. Variables are declared in the body of a batch or procedure with the DECLARE statement and are assigned values by using either a SET or SELECT statement. Variables are not dynamically typed, unlike R. For in-depth look at variables, see Itzik Ben-Gan’s article here.
ls allows you to list the variables and functions in your workspace. you can use ls.str to list out some additional information about each variable. SQL Server has tables, not arrays. It works differently, and you can find a great explanation over at Erland Sommarskog’s blog. For SQL Server 2016 specific information, please visit the Microsoft site.
A Vector is a key data structure in R, which has tons of flexibility and extras. Vectors can’t have a mix of data types, and they are created using the c(…) operator. If it is a vector of vectors, R makes them into a single vector. Batch-mode execution is sometimes known as vector-based or vectorized execution. It is a query processing method in which queries process multiple rows together. A popular item in SQL Server 2016 is Columnstore Indexes, which uses batch-mode execution. To dig into more detail, I’d recommend Niko Neugebauer’s excellent blog series here, or the Microsoft summary.

There will be plenty of other examples, but I hope that helps for now.

Jen’s Diary: Overcoming the Power of Feathers through Action

I’m sorry I haven’t kept this diary up to date: I’ve been at SQLBits, SQLPass Nordic, Data Culture events, and other community gatherings. I’ve also had dental surgery, involving the removal of two teeth and the removal of some of my lower jaw. I haven’t been very happy, needless to say.

As always, I don’t represent PASS or any other organisation throughout this blog.

FYI I’ll be holding a Twitter Surgery Hour on Friday 20th March at 12pm GMT so please tweet me at @jenstirrup and ask me whatever you like! If you are not sure what time that is in your time zone, please check here.

never+not+broken[1]I’m not Hindi, but I’m inspired by the HIndu goddess Akhilanda, which means essentially “never not broken.” In other words, The Always Broken Goddess. Sanskrit is a tricky but amazing language. Here, we see a double negative here means that Akhilanda is broken right down to her name. The thing is, being broken is actually part of a renewal process, and it means that your broken parts can shine out more brightly than ever before – simply because you are broken, and moving, and reflect light out wide. It means you can pick up your pieces, and run without limits.

Outwardly, I am not an obvious leader. I realise I am a quiet person. I don’t party. I very rarely drink alcohol. I am not much of a dinner date – I am a forty something single mother, with no real hobbies or interests other than technology. I have never been someone to write home about. I am my own person.

People can know you by your reputation, but your actions can speak louder than words. Let me give you an example: a Jewish tale talks about a man, who went about the community telling gossip about the rabbi. Later, he realized the wrong he had done, and began to feel remorse. He went to the rabbi and wanted to apologise, saying he would do anything he could to make amends. The rabbi told the man, “Take a feather pillow, cut it open, and scatter the feathers to the winds.” When he returned to tell the rabbi that he had done it, the rabbi said, “Now, go and gather the feathers. Because you can no more make amends for the damage your words have done than you can recollect the feathers.”

Why do I write this blog? I want to give people something other than feathers, what is said about me, whether it is good or bad. Instead, I want to demonstrate real actions that prove me, demonstrate unequivocally that I work for the community, and work hard. Instead, I hope that these diary post series will overcome the power of feathers – whether they are good or bad – and shine out what I actually do for the community. Actions speak louder than words, and this is how we overcome the power of feathers – let your light shine more brightly than ever, even if you are not perfect but broken. For me, I work really hard for the community and hope that the actions demonstrate that.

I’m hoping that people will see how hard I’ve worked for the community, and they will hold onto that data – we are in a data-driven decade, right? – and they will see the data for what it is. So, what have I been doing?

Ongoing – PASS BA Conference. I attend conference calls for about three hours a week, some weeks, up to seven hours, and these calls are held from 9pm my time onwards. Obviously, that takes out a few of my evenings a week. I obviously have work to to outside of this time, such as sponsorship, blogging on numerous occasions, email.

Ongoing – I have been holding PASS Data Science Virtual Chapter webinars every Friday night, 9pm my time. This obviously takes out my Friday night, in additional to the BAC calls I mention above, and I have been doing this for four consecutive weeks now. This week is the last week, and I’m considering doing a five week introductory course to Python and/or AzureML after that. I’ve held lots of Virtual Chapter meetings before, particularly since I held that Portfolio for PASS last year. My most popular sessions were on data visualisation, personally, but I have noted that my audience has really increased over the last five weeks and I’d like to continue to offer that community support.

March 2015 – I tried to hold another DiTBits (Diversity in Technology) event at SQLBits, but, alas, this wasn’t to be. The event wasn’t marketed and it seemed a pity to try and shoehorn something in at the last moment. Next time, I hope.

March 2015 – I held a two hour R and Python for SQL and Business Intelligence Professionals session at SQLBits. Files and notes to follow (see earlier note about my surgery!) It was extremely well attended.

March 2015 – I held a general session on Data Visualisation, entitled Eye Vegetables and Eye Candy: How to Visualise your Data at SQLBits. Again, notes to follow (I’m still recovering from surgery…)

March 2015 – I did half of a joint session at SQLBits with Allan Mitchell, where I talked about Kibana. The topic was Building Scaleable Analytical Solutions on Azure and it was fun presenting with Allan.

March 2015 – I spoke at SQLRally Nordic on Pulling back the green curtain: Data Forensics, Power Bi and Dataviz.

March 2015 – I held a Women in Technology event at SQLRally Nordic, talking about the importance of diverse teams.

March 2015 – I interviewed Mico Yuk as part of our ‘Meet the Experts’ series and you can listen to the podcast here. I’m really looking forward to meeting her!

Feb 2015 – Ongoing – I worked to help get the Data Science Virtual Chapter off the ground, led by Mark Tabladillo, who is the VC lead. This has involved groundwork, phone calls, and working with Microsoft to get speakers. You should really register for the Data Science VC. It’s great fun!

Feb 2015 – Data Culture – I was the keynote speaker for Microsoft, which was great fun! The slideshare is below. My section is about the middle third or so; credits are on the slides for the other speakers.

Feb 2015 – Ongoing – I kicked off my series Intro to R series of webinars for the Data Science Virtual Chapter
Feb 2015 – I ramped up my SQLSaturday Edinburgh in earnest, which has a Business Intelligence focus.
Feb 2015 – I held an Excel BI VC session on cubes and Excel. I also hosted another session as a mentor.
Feb 2015 – Techdays – I held a webinar on AzureML to over six thousand people. Nervous? Yes, bordering on terror. Great thanks to Andrew Fryer (I’m honoured to call him my friend) for all of his support and he’s an inspiration to me, and I was glad to be his presenting small person for the day.
Feb 2015 – Hants UG – on the same day that I held the Techdays session, I left and travelled to Hampshire to deliver a session on AzureML.
I hope it’s clear that, in the past few months, I have been doing a lot of work for the community as well as my PASS Board responsibilities.
So, if you catch a feather and want to ask me about it, my email is jen.stirrup@datarelish.com or tweet me during my Twitter Hour on the 20th March. I look forward to your input. Just go ahead and ask me!

Data Science VC follow up: Intro to R and Statistics for the Rookie Part 1

I’ve decided to hold a five week introductory statistics and R course. Here, I am sharing the slide deck and the code. The video will go up on our YouTube Data Science Virtual Chapter channel, which is accessible from here.

In the first week, we talked about the relationship between statistics and data visualisation, and how it is extremely useful to have a good grounding in both topics. The slides are here, followed by the code:

The R code can be copied and pasted into your RStudio file:

# Loads sample datasets
data()

# Let’s look at the data
# This command tells you the metadata. What does R see, when it sees ‘iris’?
str(iris)

# What are the attributes?
# This gives us more information.
attributes(iris)

# Let’s see more of the data
iris

class(iris)
# A data frame has columns which can have different types.
# The column names and types constitute the schema.
# how do we know what is in our data frame?

# Column Names
colnames(iris)

# how can we see data in one of the columns?
iris$Petal.Length
# or we could also use iris[,3] to get the same column data.

iris[,3]
# of course, we want to visualise the data.
# Let’s do a simple scatter plot.

# How can we see the first five rows?
iris[1:5,]

# how can we see the Petal Length of the first 5 rows?
iris[1:5, “Petal.Length”]

# This shows us some of the descriptive statistics of each variable
summary(iris)

table(iris$Species)

# Let’s have some dataviz fun!
plot(iris$Petal.Length, iris$Petal.Width, main=”Anderson’s Iris Data”)
# You can now see the plot appear in the right hand side frame of RStudio.
# we can make it slightly more interesting
plot(iris$Petal.Length, iris$Petal.Width, pch=23, bg=c(“orange”, “blue”, “green”) [unclass(iris$Species)], main=”Anderson’s Iris Data”)

# we can make it even more interesting
pairs(iris[1:4], main = “Anderson’s Iris Data”, pch = 23, bg = c(“orange”, “green”, “blue”)[unclass(iris$Species)])
# pie charts!
pie(table(iris$Species))

# ooh, 3D!

library(scatterplot3d)
scatterplot3d(iris$Petal.Width, iris$Sepal.Length, iris$Sepal.Width)

# ooh, even more 3D!
library(rgl)
plot3d(iris$Petal.Width, iris$Sepal.Length, iris$Sepal.Width)

#Save your work!

savehistory(“~/Topic 1 Getting familiar with R A.Rhistory”)