The Prodigal Developers Return: SQL Server 2016 SP1 brings consistent programming surface to Developers and ISVs

Big news from Microsoft Connect() 2016 online developer conference. SQL Server 2016 Service Pack 1 is dropping. Download SQL Server 2016 SP 1 here.

SQL Server 2016 SP1  means lots of wider features for lower editions. Most importantly, developers and partners can now build to a single application programming surface to create or upgrade new intelligent applications and use the edition which scales to the application’s needs.

The long version and my ‘take’ on this news:

I’m incredibly impressed with Microsoft right now. I think it’s incredibly smart, actually, because they are bringing developers and ISVs back into SQL Server Land again. So, developers, ISVs, go and grab yourself a coffee and let’s have a chat.

stocksnap_3tj6nctirt

Credit: stocksnap.io

SQL Server 2016 SP1 makes leading innovation available to any developer. Microsoft is making it easier for developers to benefit from the industry-leading innovations in SQL Server for more of their applications. With SQL Server 2016 SP1 is making key innovations more accessible to customers across editions. Developers and partners can now build to a single application programming surface to create or upgrade new intelligent applications and use the edition which scales to the application’s needs. SQL Server Enterprise continues to offer the highest levels of scale, performance and availability for enterprise workloads. For more information, please see the full press announcement on the SQL Server Blog. Visual Studio Code extension for SQL and updated connectors and tools are also exciting news, because it means that it’s easier to develop with other languages, in a more streamlined fashion.

What problem are Microsoft trying to fix?

stocksnap_vlhyvv3xu5Previously, the issue with developing applications for SQL Server is that there is a disparity across editions, which can affect how your application runs.  Until now, developers have used the SQL Server development version as it will allows them to develop with features that are available on all of the production versions.

Now, the problem is solved – developers can take advantage of the programmability feature by using the same code base, and things are simpler because the customer chooses which edition they use.

The problem was evident, when you use, say, an enterprise-only feature in development but have only a Standard-edition instance in Production. You can see the full list of features and editions published by Microsoft here ‘Features Supported by the Editions of SQL Server 2016’

If you had an app that can manage Enterprise edition then it can, in principle, also manage every other edition.  However, now the application would scale to the customer’s edition, thereby streamlining the whole process.

New Tools for the Toolbox, No Pricing Changes

stocksnap_kd30xpqr0a

So, developers wouldn’t have to build complexity, but they’d have to create their app the right way. For example, there’s not always a need to scale out. Let’s take Stack Overflow, one of the top 50 busiest sites in the world.  Stack Overflow runs on Microsoft SQL Server.

Not many people know it, but there is a StackOverflow Enterprise Edition. It means that companies like StackOverflow can take advantage of the new programmability features, if they so wished. I wonder what ISVs will do?

Freedom from Constraints

Let’s examine the issue in more detail. Let’s take a look at the SQL Server editions that are available to us:

  • Azure database + Amazon RDS
  • Containerized version of any edition
  • Developer Edition
  • Express Edition
  • Enterprise Edition
  • LocalDb
  • Standard Edition
  • Web Edition

You can see why it starts to get confusing, and developers might start to look at MySQL or Postgres as alternatives.

How can you get SQL Server 2016 SP1?

I believe that this will be a primary driver for SQL Server 2016 Service Pack 1, Download SQL Server 2016 SP 1 here.

Why are Microsoft doing this?

stocksnap_kikhw5nc6yIt’s a huge benefit for ISVs. It’s my opinion that Microsoft had lost the way with their partners. Customers started to look sideways at other vendors to fulfil their needs, such as Tableau. In response, partners expanded their toolkit in order to include crème de la crème vendors such as Tableau in order to build solutions. I think that this move is a gesture to the ISVs, since it will remove friction when they choose to develop solutions.

Being pals with Open Source but better – you get what you pay for. With the advent of open source, developers have got  more choice than ever before. It’s good to bring them back to SQL Server. Postgres doesn’t have in-memory capability, for example – it has “running with scissors” mode whereby you switch off all the disk storage features. Sound scary? Yes… the clue is in the name. SQL Server brings this feature to the party, and more. ISVs can feel more confident developing on a robust solution.

Increased productivity – it removes an obstacle to development, support and deployment.

The Prodigal Developers Return

This solution means that Microsoft SQL Server is back on the table for many developers, who may have started eyeing MySQL and Postgres for this reason.

To summarise, I think that this is a smart move and I’m excited to see that the ‘voice of the developer’ has come back into SQL Server Land. It’s also a huge benefit for ISV partners, and let’s see how they democratize their data in new and exciting applications. Let’s look for more exciting things coming from Microsoft.

Learning pathway for SQL Server 2016 and R Part 2: Divided by a Common Language

http://whatculture.com/film/the-office-uk-vs-the-office-us.php

Britain has “really everything in common with America nowadays, except, of course, language.” Said Oscar Wilde, in the Centerville Ghost (1887) whilst George Bernard Shaw is quoted as saying that the “The United States and Great Britain are two countries separated by a common language.”

There are similarities and differences between SQL and R, which might be confusing. However, I think it can be illuminating to understand these similarities and differences since it tells you something about each language. I got this idea from one of the attendees at PASS Summit 2015 and my kudos and thanks go to her. I’m sorry I didn’t get  her name, but if you see this you will know who you are, so please feel free to leave a comment so that I can give you a proper shout out.

If you are looking for an intro to R from the Excel perspective, see this brilliant blog here. Here’s a list onto get us started. If you can think of any more, please give me a shout and I will update it. It’s just an overview and it’s to help the novice get started on a path of self-guided research into both of these fascinating topics.

R SQL / BI background
A Factor has special properties; it can represent a categorical variable, which are used in linear regression, ANOVA etc. It can also be used for grouping. A Dimension is a way of describing categorical variables. We see this in the Microsoft Business Intelligence stack.
in R, dim means that we can give a chunk of data dimensions, or, in other words, give it a size. You could use dim to turn a list into a matrix, for example Following Kimball methodology, we tend to prefix tables as dim if they are dimension tables. Here, we mean ‘dimensions’ in the Kimball sense, where a ‘dimension’ is a way of describing data. If you take a report title, such as Sales by geography, then ‘geography’ would be your dimension.
R memory management can be confusing. Read Matthew Keller’s excellent post here. If you use R to look at large data sets, you’ll need to know
– how much memory an object is taking;
– 32-bit R vs 64-bit R;
– packages designed to store objects on disk, not RAM;
– gc() for memory garbage collection
– reduce memory fragmentation.
SQL Server 2016 CTP3 brings native In-database support for the open source R language. You can call both R, RevoScaleR functions and scripts directly from within a SQL query. This circumvents the R memory issue because SQL Server benefits the user, by introducing multi-threaded and multi-core in-DB computations
Data frame is a way of storing data in tables. It is a tightly coupled collections of variables arranged in rows and columns. It is a fundamental data structure in R. In SQL SSRS, we would call this a data set. In T-SQL, it’s just a table. The data is formatted into rows and columns, with mixed data types.
All columns in a matrix must have the same mode(numeric, character, and so on) and the same length. A matrix in SSRS is a way of displaying, grouping and summarizing data. It acts like a pivot table in Excel.
 <tablename>$<columnname> is one way you can call a table with specific reference to a column name.  <tablename>.<columname> is how we do it in SQL, or you could just call the column name on its own.
To print something, type in the variable name at the command prompt. Note, you can only print items one at a time, so use cat to combine multiple items to print out. Alternatively, use the print function. One magic feature of R is that it knows magically how to format any R value for printing e.g.

print(matrix(c(1,2,3,5),2,2))

PRINT returns a user-defined message to the client. See the BOL entry here. https://msdn.microsoft.com/en-us/library/ms176047.aspx

CONCAT returns a string that is the result of concatenating two or more string values. https://msdn.microsoft.com/en-GB/library/hh231515.aspx

Variables allow you to store data temporarily during the execution of code. If you define it at the command prompt, the variable is contained in your workspace. It is held in memory, but it can be saved to disk. In R, variables are dynamically typed so you can chop and change the type as you see fit. Variables are declared in the body of a batch or procedure with the DECLARE statement and are assigned values by using either a SET or SELECT statement. Variables are not dynamically typed, unlike R. For in-depth look at variables, see Itzik Ben-Gan’s article here.
ls allows you to list the variables and functions in your workspace. you can use ls.str to list out some additional information about each variable. SQL Server has tables, not arrays. It works differently, and you can find a great explanation over at Erland Sommarskog’s blog. For SQL Server 2016 specific information, please visit the Microsoft site.
A Vector is a key data structure in R, which has tons of flexibility and extras. Vectors can’t have a mix of data types, and they are created using the c(…) operator. If it is a vector of vectors, R makes them into a single vector. Batch-mode execution is sometimes known as vector-based or vectorized execution. It is a query processing method in which queries process multiple rows together. A popular item in SQL Server 2016 is Columnstore Indexes, which uses batch-mode execution. To dig into more detail, I’d recommend Niko Neugebauer’s excellent blog series here, or the Microsoft summary.

There will be plenty of other examples, but I hope that helps for now.

Learning pathway for SQL Server 2016 and R Part 1: Installation and configuration

Jargogled is an archaic word for getting confused or mixed up. I’m aware that there are lots of SQL Server folks out there, who are desperate to learn R, but might be jaRgogled by R. Now that R is in SQL Server, it seems like the perfect opportunity to start a new blog series to help people to splice the two great technologies together. So here you are!

First up, what do you need to know about SQL Server installation with R? The installation sequence is well documented here. However, if you want to make sure that the R piece is installed, then you will need to make sure that you do one thing: tick the Advanced Analytics Extension box.

SQL Server 2016 R Feature Selection

You need to select ‘Advanced Analytics Extensions’, which you will find under ‘Instance Features’. Once you’ve done that, you are good to proceed with the rest of your installation.

Once SQL Server is installed, let’s get some data into a SQL Server database. Firstly, you’ll need to create a test database, if you don’t have one already. You can find some information on database creation in SQL Server 2016 over at this Microsoft blog. You can import some data very quickly and there are different ways of importing data. If you need more information on this, please read this Microsoft blog.

If you fancy taking some sample data, try out the UCI Machine Learning data repository. You can download some data from there, following the instructions on that site, and then pop it into SQL Server.

If you have Office x64 installed on your machine, you might run into an issue:

Microsoft.ACE.OLEDB.15.0′ provider is not registered on the local machine

I ran into this issue when I tried to import some data into SQL Server using the quick and dirty ‘import data’ menu item in SSMS. After some fishing around, I got rid of it by doing the following:

There are other ways of importing data, of course, but I wanted to play with R and SQL Server, and not spend a whole chunk of time importing data.

In our next tutorial, we will look at some of the vocabulary for R and SQL Server which can look confusing for people from both disciplines. Once you learn the terminology, then you’ll see that you already know a lot of the concepts in R from your SQL Server and Business Intelligence expertise. That expertise will help you to springboard to R expertise, which is great for your career.