I’m speaking at #ITDevConnections 2018!

Join me in Dallas and learn about topics such as:

  • Blockchain Demystified for Business Intelligence Professionals
  • Data Analytics with Azure Cosmos Schema-less Data and Power BI
  • R in Power BI for Absolute Beginners

There is also a Women in Technology lunch and I’m excited about that, too!

Join me in Dallas this October!

I’m excited to see that some of my SQLFamily friends are going, such as Mindy Curnett, Kevin Kline, Bob Ward and Tim Mitchell. I’m looking forward to going to sessions as an attendee, too!

You can find the agenda here.

SPECIAL OFFER: Use promo code STIRRUP by September 7 and save on your pass!*

  • All Access Pass – $1899 with code STIRRUP (a $2699 value)
    Includes Pre-Conference workshop plus 200+ sessions and networking
  • Essentials Pass – $1199 with code STIRRUP (a $1799 value)
    Includes 200+ sessions and networking

All Access Pass – $1899 with promo code STIRRUP (a $2699 value) 

Includes everything listed in the Essentials Pass, PLUS you gain access to one Pre-Conference Workshop, where you’ll spend a full day in the classroom with our experts.

Choose from 8 workshops:

  1. Migrating to Windows 10 – Notes From the Field
  2. Mastering ASP.NET Core, Angular 6 and EF Core
  3. From Beginning to Expert SQL Programmer in One Day
  4. Progressive Web Apps From Beginner to Expert
  5. ConfigMgr and Azure – A Flexible, Powerful and Compelling Combination
  6. Practical Performance Monitoring
  7. Build Intelligent Applications with A.I. Technologies
  8. Going Serverless Using Azure

Essentials Pass – $1199 with promo code STIRRUP (a $1799 value)

Includes:

  • Mix and Match 150+ Tracks/Sessions
  • Access to Session Presentation Materials
  • Vendor Receptions
  • Networking Events
  • Breakfast and Coffee Breaks
  • Networking Luncheon
  • Conference Registration Giveaway
  • and more!

*Prices increase after September 7, 2018. Must use code to receive discount. Non-transferable. Cannot be applied to previously paid registrations.

Issues and Resolutions in starting R and R Server on SQL Server 2017

I am helping some people learn Data Science and we are having a ton of fun! There are lots of things to remember. So I am noting things here, in case I forget!

We noted the following error message, when we saw that R was not running on our SQL Server 2017 install:

'sp_execute_external_script' is disabled on this instance of SQL Server. Use sp_configure 'external scripts enabled' to enable it.

Here is the longer version:

Msg 39023, Level 16, State 1, Procedure sp_execute_external_script, Line 1 [Batch Start Line 3]

'sp_execute_external_script' is disabled on this instance of SQL Server. Use sp_configure 'external scripts enabled' to enable it.

Msg 11536, Level 16, State 1, Line 4

EXECUTE statement failed because its WITH RESULT SETS clause specified 1 result set(s), but the statement only sent 0 result set(s) at run time.

Grr! What’s happened here? We had installed R as part of the SQL installation, and we had run the command to enable it, too. In case you are wondering, here is the command:

EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE WITH OVERRIDE;

So what happens next? Initial things to check:

Is R Server installed properly along with SQL Server? Here are some guidelines to help you.

Is the Launchpad service running? If not, this could be due to a lack of permissions to start the service. One of my colleagues and friends, Tomaž Kaštrun, wrote a nice article on this over at SQL Server Central.

Did you restart the MSSQL service on the machine? This will restart the Launchpad service as well. If you didn’t restart the service, you will need to do that so that it can pick up the configuration change.

Once R is running properly, you can check it by using the following command, borrowed from the official installation guide over at Microsoft:

EXEC sp_execute_external_script @language = N'R',
@script = N'
OutputDataSet <- InputDataSet;
',
@input_data_1 = N'SELECT 1 AS RIsWorkingFine'
WITH RESULT SETS (([RIsWorkingFine] int not null));
GO

If that returns a 1, then you are all set! To prove it works properly, you can retrieve the world famous Iris dataset using the following command, borrowed from the official documentation on sp_execute_external_script:

DROP PROC IF EXISTS get_iris_dataset;
GO

CREATE PROC get_iris_dataset
AS
BEGIN
    -- iris ships with R, so the input query is left empty
    EXEC sp_execute_external_script
        @language = N'R',
        @script = N'iris_data <- iris;',
        @input_data_1 = N'',
        @output_data_1_name = N'iris_data'
    WITH RESULT SETS (("Sepal.Length" float not null, "Sepal.Width" float not null,
        "Petal.Length" float not null, "Petal.Width" float not null,
        "Species" varchar(100)));
END;
GO

Once you’ve created the stored procedure, execute the following SQL command and you will see the iris dataset:

exec get_iris_dataset
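
If you want to cross-check what comes back, the same dataset can be inspected in a plain R session. This is only a minimal sketch, assuming you have a standalone R console to hand; iris_data matches the name used in the stored procedure, and col_classes is just an illustrative name.

```r
# iris ships with base R (the datasets package), so nothing needs installing.
iris_data <- iris

# Column classes: four numeric measurements plus one factor.
# These line up with the float and varchar columns declared in WITH RESULT SETS.
col_classes <- sapply(iris_data, class)
print(col_classes)

# Row and column counts, to compare with what SQL Server returns.
print(dim(iris_data))
```

The Species column is a factor in R, which is why it is declared as varchar on the SQL Server side.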

You’re all set! Enjoy R!

Five reasons to be excited about Microsoft Data Insights Summit!

I’m delighted to be speaking at the Microsoft Data Insights Summit! I’m pumped about my session, which focuses on Power BI for the CEO. I’m also super happy to be attending, for five top reasons (and others, but five is a nice number!). I’m excited about all of the Excel, Power BI, DAX and Data Science goodies. Here are some sample session titles:

Live Data Streaming in Power BI

Data Science for Analysts

What’s new in Excel

Embed R in Power BI

Spreadsheet Management and Compliance (It is a topic that keeps me up at night!)

Book an in-person appointment with a Microsoft expert using the online Schedule Builder. Bring your hard – or easy – questions! This is a real chance to speak to Microsoft directly and get expert, in-depth help from the team who make the software that you love.

Steven Levitt of Freakonomics is speaking and I’m delighted to be hearing him again. I heard him present recently and he was very funny whilst also being insightful. I think you’ll enjoy his session.

I’m excited that James Phillips is delivering a keynote! I have had the pleasure of meeting him a few times and I am really excited about where James and the Power BI team have taken Power BI. I’m sure that there will be good things as they steam ahead, so James’ keynote is unmissable!

Alberto Cairo is presenting a keynote! Someone who always makes me sit up a bit straighter when they tweet is Alberto Cairo, and I’m delighted he’s attending. I hope I can get to meet him in person. Whether Alberto is tweeting about data visualisation, design or the world in general, it’s always insightful. I have his latest book and I hope I can ask him to sign it.

Tons of other great speakers! Now someone I haven’t seen for ages – too long in fact – is Rob Collie. Rob is President of PowerPivotPro and you simply have to hear him speak on the topic. He’s direct in explaining how things work, and you will learn from him. I’m glad to see Marco Russo is speaking and I love his sessions. In fact, at TechEd North America, I only got to see one session because I was so busy with presenting, booth duty and so on, and I made sure it was Marco Russo and Alberto Ferrari’s session. Chris Webb is also presenting and his sessions are always amazing. I have to credit Chris in part for where I am today, because his blog kept me sane and his generosity during sessions meant that I never felt stupid asking him questions. I’m learning too – always.

Ok, that’s five things but there are plenty more. Why not see for yourself?

Join me at the conference, June 12–13, 2017 in Seattle, WA — and be sure to sign up for your 1:1 session with a Microsoft expert.

PASS Business Analytics Day, Jan 11, Chicago

PASS is holding its first Business Analytics Day in Chicago on January 11, 2017. You can choose one of two full-day, in-depth sessions for $595: In-Database Analytics with R and SQL Server 2016 and Mastering Power BI Solutions.

These are unique learning opportunities to get more advanced in R or data visualization with Power BI. And as with other PASS events, the goal is to allow you to walk away with real-world analytics knowledge that you can use immediately!

PASS Business Analytics Day

You have two great choices: In-Database Analytics with R and SQL Server 2016 and Mastering Power BI Solutions.

In-Database Analytics with R and SQL Server 2016

With Microsoft SQL Server 2016, data scientists can run in-database analytics using R. This is a “best of both worlds” scenario: delegate database management to SQL Server whilst you create analytics and visualisations in R and Power BI. In this session, we will cover the overall architecture of SQL R Services and go over some best practices. We will look at best practices in analytics and visualisations with a focus on R, and then we delve more in-depth into some practical common use-cases.

Speakers:
David Smith, R Community Lead at Revolution Analytics, a Microsoft Company
Seth Mottaghinejad, Data Scientist, Microsoft

Mastering Power BI Solutions

In this Power BI hands-on Workshop, you will master the “power” of Power BI. Learn to use self-service and enterprise-scale Power BI capabilities; gain valuable skills to integrate, wrangle, shape and visualize data for analysis. Beginning and intermediate level users will learn to address data and reporting challenges with advanced design techniques.

Speaker:
Paul Turley, Mentor with SolidQ, BI Architect, and Microsoft Data Platform MVP

Date: January 11, 2017

Location: Microsoft Technology Center, #200 – 200 East Randolph Drive, Chicago, IL.

We hope you’ll join us!

Learning pathway for SQL Server 2016 and R Part 2: Divided by a Common Language

http://whatculture.com/film/the-office-uk-vs-the-office-us.php

Britain has “really everything in common with America nowadays, except, of course, language,” said Oscar Wilde in The Canterville Ghost (1887), whilst George Bernard Shaw is quoted as saying that “the United States and Great Britain are two countries separated by a common language.”

There are similarities and differences between SQL and R, which might be confusing. However, I think it can be illuminating to understand these similarities and differences since it tells you something about each language. I got this idea from one of the attendees at PASS Summit 2015 and my kudos and thanks go to her. I’m sorry I didn’t get  her name, but if you see this you will know who you are, so please feel free to leave a comment so that I can give you a proper shout out.

If you are looking for an intro to R from the Excel perspective, see this brilliant blog here. Here’s a list to get us started. If you can think of any more, please give me a shout and I will update it. It’s just an overview, and it’s there to help the novice get started on a path of self-guided research into both of these fascinating topics.

Each entry below gives the R view first, followed by the SQL / BI background.

R: A Factor has special properties; it can represent a categorical variable, which is used in linear regression, ANOVA etc. It can also be used for grouping.
SQL / BI: A Dimension is a way of describing categorical variables. We see this in the Microsoft Business Intelligence stack.

R: In R, dim means that we can give a chunk of data dimensions, or, in other words, give it a size. You could use dim to turn a vector into a matrix, for example.
SQL / BI: Following Kimball methodology, we tend to prefix tables as dim if they are dimension tables. Here, we mean ‘dimensions’ in the Kimball sense, where a ‘dimension’ is a way of describing data. If you take a report title, such as Sales by geography, then ‘geography’ would be your dimension.

R: R memory management can be confusing. Read Matthew Keller’s excellent post here. If you use R to look at large data sets, you’ll need to know about:
– how much memory an object is taking;
– 32-bit R vs 64-bit R;
– packages designed to store objects on disk, not RAM;
– gc() for memory garbage collection;
– reducing memory fragmentation.
SQL / BI: SQL Server 2016 CTP3 brings native in-database support for the open source R language. You can call R and RevoScaleR functions and scripts directly from within a SQL query. This helps to circumvent the R memory issue, because SQL Server introduces multi-threaded and multi-core in-database computations.

R: A data frame is a way of storing data in tables. It is a tightly coupled collection of variables arranged in rows and columns. It is a fundamental data structure in R.
SQL / BI: In SSRS, we would call this a data set. In T-SQL, it’s just a table. The data is formatted into rows and columns, with mixed data types.

R: All columns in a matrix must have the same mode (numeric, character, and so on) and the same length.
SQL / BI: A matrix in SSRS is a way of displaying, grouping and summarizing data. It acts like a pivot table in Excel.

R: <tablename>$<columnname> is one way you can call a table with specific reference to a column name.
SQL / BI: <tablename>.<columnname> is how we do it in SQL, or you could just call the column name on its own.

R: To print something, type in the variable name at the command prompt. Note that you can only print items one at a time, so use cat to combine multiple items to print out. Alternatively, use the print function. One magic feature of R is that it knows how to format any R value for printing, e.g.
print(matrix(c(1,2,3,5),2,2))
SQL / BI: PRINT returns a user-defined message to the client. See the BOL entry here: https://msdn.microsoft.com/en-us/library/ms176047.aspx
CONCAT returns a string that is the result of concatenating two or more string values: https://msdn.microsoft.com/en-GB/library/hh231515.aspx

R: Variables allow you to store data temporarily during the execution of code. If you define a variable at the command prompt, it is contained in your workspace. It is held in memory, but it can be saved to disk. In R, variables are dynamically typed, so you can chop and change the type as you see fit.
SQL / BI: Variables are declared in the body of a batch or procedure with the DECLARE statement and are assigned values by using either a SET or SELECT statement. Variables are not dynamically typed, unlike R. For an in-depth look at variables, see Itzik Ben-Gan’s article here.

R: ls allows you to list the variables and functions in your workspace. You can use ls.str to list out some additional information about each variable.
SQL / BI: SQL Server has tables, not arrays. It works differently, and you can find a great explanation over at Erland Sommarskog’s blog. For SQL Server 2016 specific information, please visit the Microsoft site.

R: A Vector is a key data structure in R, which has tons of flexibility and extras. Vectors can’t have a mix of data types, and they are created using the c(…) function. If it is a vector of vectors, R makes them into a single vector.
SQL / BI: Batch-mode execution is sometimes known as vector-based or vectorized execution. It is a query processing method in which queries process multiple rows together. A popular item in SQL Server 2016 is Columnstore Indexes, which use batch-mode execution. To dig into more detail, I’d recommend Niko Neugebauer’s excellent blog series here, or the Microsoft summary.
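
A few of the R entries above can be tried out in a plain R session. This is only a minimal sketch using base R; the names (x, sales, totals, v) and the sample values are made up for illustration, and the GROUP BY comparison is a loose analogy rather than an exact equivalence.

```r
# dim: give a vector a size, turning it into a 2 x 3 matrix.
x <- 1:6
dim(x) <- c(2, 3)
print(x)

# Data frame: rows and columns with mixed types, much like a table.
sales <- data.frame(
  geography = factor(c("UK", "US", "UK")),  # a factor, i.e. a categorical variable
  amount    = c(100.5, 250.0, 75.25)
)

# <tablename>$<columnname> picks out one column.
print(sales$amount)

# Factors can be used for grouping, roughly like GROUP BY on a dimension.
totals <- tapply(sales$amount, sales$geography, sum)
print(totals)

# cat combines several items into one line of output.
cat("Total sales:", sum(sales$amount), "\n")

# c(...) flattens a vector of vectors into a single vector.
v <- c(c(1, 2), c(3, 4))
print(length(v))

# Variables are dynamically typed: the same name can hold a new type.
v <- "now a character string"
print(class(v))

# Memory and workspace housekeeping.
print(object.size(sales), units = "Kb")  # how much memory an object takes
print(ls())                              # variables in the workspace
invisible(gc())                          # trigger garbage collection
```

In T-SQL, by contrast, a variable such as v would be declared once with a fixed type and could not later be reassigned a string if it started out numeric.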

There will be plenty of other examples, but I hope that helps for now.

Learning pathway for SQL Server 2016 and R Part 1: Installation and configuration

Jargogled is an archaic word for getting confused or mixed up. I’m aware that there are lots of SQL Server folks out there, who are desperate to learn R, but might be jaRgogled by R. Now that R is in SQL Server, it seems like the perfect opportunity to start a new blog series to help people to splice the two great technologies together. So here you are!

First up, what do you need to know about SQL Server installation with R? The installation sequence is well documented here. However, if you want to make sure that the R piece is installed, then you will need to make sure that you do one thing: tick the Advanced Analytics Extensions box.

SQL Server 2016 R Feature Selection

You need to select ‘Advanced Analytics Extensions’, which you will find under ‘Instance Features’. Once you’ve done that, you are good to proceed with the rest of your installation.

Once SQL Server is installed, let’s get some data into a SQL Server database. Firstly, you’ll need to create a test database, if you don’t have one already. You can find some information on database creation in SQL Server 2016 over at this Microsoft blog. You can import some data very quickly and there are different ways of importing data. If you need more information on this, please read this Microsoft blog.

If you fancy taking some sample data, try out the UCI Machine Learning data repository. You can download some data from there, following the instructions on that site, and then pop it into SQL Server.

If you have Office x64 installed on your machine, you might run into an issue:

The 'Microsoft.ACE.OLEDB.15.0' provider is not registered on the local machine

I ran into this issue when I tried to import some data into SQL Server using the quick and dirty ‘import data’ menu item in SSMS. After some fishing around, I got rid of it by doing the following:

There are other ways of importing data, of course, but I wanted to play with R and SQL Server, and not spend a whole chunk of time importing data.

In our next tutorial, we will look at some of the vocabulary for R and SQL Server which can look confusing for people from both disciplines. Once you learn the terminology, then you’ll see that you already know a lot of the concepts in R from your SQL Server and Business Intelligence expertise. That expertise will help you to springboard to R expertise, which is great for your career.

Convincing your HiPPO at EARL Conference in London!

I’m delighted to be speaking at the EARL Conference to be held in London on the 14th – 16th September. What’s my topic?

Convince your HiPPO with Real world Data Storytelling in R and Machine Learning

In a world where the HiPPO (Highest Paid Person’s Opinion) is final, how can we use technology to drive the organisation towards data-driven decision making as part of its organisational DNA? R provides a range of functionality in machine learning, but we need to expose its richness in a way that makes it accessible to decision makers. Using Data Storytelling with R, we can imprint data in the culture of the organisation by making it easily accessible to everyone, including decision makers. Together, the insights and process of machine learning are combined with data visualisation to help organisations derive value and insights from big and little data.

In this session, we will use R and cloud-based technologies in order to explore and analyse data using machine learning and statistical packages functionality, and we will look at our results. Then, we will look at how we disseminate the results to the HiPPO audience, using best practices in data visualisation and R, informed by gurus such as Stephen Few and Edward Tufte.

If you want to know how to demystify R and the insights you’ve found during your analyses, join this session in order to learn about machine learning as a technology and a discipline, and how to make the most of your insights using best practice data visualisation. Using real-life scenarios, this session will help you to communicate the insights of your data to your HiPPO, thereby helping to move your organisation towards a data-driven culture.

R now ranks as the sixth most popular programming language, and its move from last year’s 9th place reflects the growing importance of data analytics to an increasing number of industries and sectors. EARL offers a unique opportunity to discover how R is being used commercially to provide a wealth of business solutions.

EARL London will feature:

  • Presentations from over 40 R gurus and Business leaders
  • Speakers representing a broad range of industry sectors, including insurance, manufacturing, customer analytics, life sciences and finance
  • Sessions include:  Data Visualisation, Business Challenges, Big Data Technologies, Modelling, Workflow and Commercial Applications
  • Keynote speakers: Alex Bellos, Dirk Eddelbuettel, Joe Cheng and Hannah Fry
  • Speakers representing Companies such as Shell, KPMG, AstraZeneca, Lloyd’s of London, UBS and Hewlett Packard
  • Pre Conference workshops on:  Interactive Reporting with R Markdown and Shiny (now SOLD OUT), An Introduction to Rcpp,  Integrating R and Python, and Current Best Practices in Formal Package Development
  • Sponsors and Exhibitors including Revolution Analytics, RStudio, Hewlett Packard, Teradata, Oracle, Harnham UK, Plotly, Tessella, Information Builders and O’Reilly
  • Sensational central London venue
  • 3 Conference Networking events including the Main Conference Reception in the amazing walkways of London’s iconic Tower Bridge

If you have yet to purchase your ticket, please don’t delay, to avoid disappointment. Tickets can be purchased online via credit card or can be invoiced if required. Group discounts are available to companies sending more than 5 attendees – please email earl-team@mango-solutions.com for more information.