Azure SQL Database is introducing two new features to cost-effectively migrate workloads to the cloud. SQL Database Hyperscale for single databases, available in preview, is a highly scalable service tier that adapts on demand to workload needs. It auto-scales up to 100 TB per database to significantly expand potential for app growth.
What does this mean? It’s one of the most fundamental changes to SQL Server storage since SQL Server 7.0. So this is big news, and very big data stores. I am very lucky because I got to interview Kevin Farlee of the SQL Server team about the latest news, and you can find the video below.
I am sorry about the sound quality, so I have blogged the details to make sure the message is clear. When I find the Ignite sessions published, I will add in a link as well.
What problem are the SQL Server team solving with Hyperscale? The fundamental problem is how to deal with very large databases (VLDBs) in the cloud. With VLDBs, the sheer size of the data makes normal operations hard: backups, restores, maintenance operations, scaling. These activities can sometimes take days, and the business will not wait through that downtime. If you are talking tens of terabytes, a restore alone takes days, and ultimately Microsoft needed a new way to protect the data in VLDBs. The SQL team did something really smart and very creatively rethought how they do storage, in order to take care of the issues with VLDBs in the cloud.
So, the Azure SQL Server team did something that is completely in line with one of the main benefits and key features of cloud architecture: they split the storage engine out from the relational engine. The storage implementation was completely rethought and remastered from the ground up. They asked how you would go about architecting, designing and building a solution like this in the cloud if you were to start from scratch.
The Azure SQL database team did a smart thing: Hyperscale uses microservices to handle VLDBs.
The compute engine is one microservice taking care of its role; another microservice takes care of the logging; and then a series of microservices handle the data. These are called page servers, and they interface at the page level. The page servers host and maintain the data files. Each page server handles about a terabyte of data pages, and you can add on as many as you need.
Ultimately, compute and storage are decoupled, so you can scale compute without moving the data. This means it’s possible to keep adding more and more data, and it also means that you don’t have to deal with the movement of data; moving data around when there are terabytes and terabytes of it isn’t a trivial task. Each page server holds about a terabyte of data and keeps roughly a terabyte’s worth of SSD cache.
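To make the decoupling concrete, here is a minimal, purely illustrative Python sketch. The ~1 TB-per-page-server figure comes from the description above; all the class names, methods and numbers beyond that are my own invention for illustration, not Microsoft’s implementation:

```python
# Toy model of decoupled compute and storage (illustration only, not real code):
# data pages live on "page servers" of ~1 TB each; compute scales separately.

PAGE_SERVER_CAPACITY_GB = 1024  # each page server handles about a terabyte


class PageServer:
    """Hosts and maintains a slice of the database's data pages."""
    def __init__(self):
        self.used_gb = 0


class HyperscaleDatabase:
    def __init__(self):
        self.page_servers = [PageServer()]
        self.compute_cores = 2  # compute capacity is independent of storage

    def add_data(self, gb):
        """Grow the database; new page servers are added on demand."""
        remaining = gb
        while remaining > 0:
            current = self.page_servers[-1]
            free = PAGE_SERVER_CAPACITY_GB - current.used_gb
            if free == 0:
                self.page_servers.append(PageServer())
                continue
            chunk = min(free, remaining)
            current.used_gb += chunk
            remaining -= chunk

    def scale_compute(self, cores):
        """Scale compute without touching (or moving) any data."""
        self.compute_cores = cores


db = HyperscaleDatabase()
db.add_data(2500)            # 2.5 TB of data pages
print(len(db.page_servers))  # -> 3: three page servers of ~1 TB each
db.scale_compute(16)         # compute changes; the page servers are untouched
print(len(db.page_servers))  # -> still 3
```

The point of the sketch is the last two lines: scaling compute does not change the page server layer at all, which is exactly why Hyperscale can resize compute without a data move.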
The durable storage layer is Azure Blob Storage, because blob storage is multiply redundant and has features like snapshots. This means they can take simultaneous backups by just doing a snapshot across all of the blobs, with no impact on the workload.
A restore is just instantiating a new set of writeable disks from a set of snapshots; the page servers and the compute engine work in concert to take care of it. Since you’re not moving the data, it is fast.
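Here is a small toy sketch of why snapshot-based backup and restore avoid moving data. It assumes (as the interview describes) that the blobs are versioned and immutable, so a snapshot only has to record version pointers and a restore only has to build a new writeable view over them. The code and names are entirely my own illustration, not Azure’s implementation:

```python
# Toy model (illustration only) of snapshot backup and restore over
# immutable, versioned blob storage: no data is copied in either direction.

class BlobStore:
    """Append-only store: every write creates a new immutable version."""
    def __init__(self):
        self.versions = {}  # (blob_name, version) -> contents
        self.latest = {}    # blob_name -> current version number

    def write(self, name, contents):
        version = self.latest.get(name, 0) + 1
        self.versions[(name, version)] = contents
        self.latest[name] = version

    def snapshot(self):
        """Backup: capture the current version pointers; no data copied."""
        return dict(self.latest)

    def restore(self, snap):
        """Restore: instantiate a new writeable store from a snapshot."""
        restored = BlobStore()
        restored.versions = self.versions  # shared, immutable page data
        restored.latest = dict(snap)       # only the pointers are new
        return restored

    def read(self, name):
        return self.versions[(name, self.latest[name])]


store = BlobStore()
store.write("page-server-1.data", "original pages")
backup = store.snapshot()                  # effectively instant
store.write("page-server-1.data", "modified pages")
restored = store.restore(backup)
print(restored.read("page-server-1.data"))  # -> original pages
```

Both `snapshot` and `restore` manipulate only a dictionary of pointers, which is the essence of why these operations don’t scale with the size of the database.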
I’m personally very impressed with the work that the team have done, and I’d like to thank Kevin Farlee for his time. Kevin explains things exceptionally well.
It’s worth watching the video to understand it. As well as the video here, Kevin goes into detail in his Microsoft Ignite sessions, and I will publish more links when I have them.
One advantage of doing the Microsoft Community Reporter role is that I get to learn from the experts, and I enjoyed learning from Kevin throughout the video.
It seems to me that the Azure SQL database team have really heard the voice of their technical audience, and they’ve worked passionately and hard to tackle these real-life issues. I don’t know if it is always clear that Microsoft is listening, but I wanted to blog about it since I can see how much the teams take on board the technical ‘voice’ of the people who care about their solutions, and who care enough to share their opinions and thoughts so that Microsoft can improve them.
From the Azure architecture perspective, it works perfectly with the cloud computing concept of decoupling the compute and the storage. I love watching the data story unfold for Azure and I’m excited by this news.