Azure SQL: Move to New Database Tiers
Just a little over a week ago, the Azure team publicly announced new tiers for Azure SQL Database. These new tiers not only change the way pricing should be considered when choosing a SQL Database scale, but also mark the dusk of the current Web and Business tiers and, along the way, the end of Azure Federations. However, don’t panic yet! If your business still resides on these tiers or if you are using a sharding technique based on Azure Federations, be advised that the Web and Business tiers will only be retired in 12 months. Moreover, according to the announcement, Microsoft will ensure that customers have at least 6 months from the general availability of Basic, Standard and Premium to migrate away from Web and Business, even if this involves extending the 12-month window (Kelly, 2014).
Foreword
This article is targeted at application and database developers with at least a basic knowledge of SQL Server and Azure SQL Database. It contains several T-SQL queries that are run against the AdventureWorks sample database (the Azure OLTP sample), available for download at http://bit.ly/AzureAWDbSample.
Creating a new Azure SQL Database
Currently, when you create a new SQL Database (assuming you have activated the New Service Tiers for SQL Databases preview for at least one of your subscriptions; otherwise, visit http://bit.ly/AzurePreview), the New SQL Database pop-up window will look similar to the one below (Figure 1 New SQL Database Pop-Up Window).
http://alexmang.ro/wp-content/uploads/2014/05/1-create_new.png
Figure 1 New SQL Database Pop-Up Window
As shown in the pop-up window, right from the start you have to choose one of the supported database editions: either Web and Business, or Basic, Standard and Premium. The reason is the completely different architecture of the machines hosting the databases on each type of database server. This also means there is essentially no way to scale a Web or Business database to Basic or Standard, or vice-versa. You can, however, scale a Web or Business database to Premium, and a Basic or Standard database to Premium and back; but there is no way to scale a Web database down to Basic, not even by going up to Premium first and then down to Basic, in case you were considering that option.
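If you prefer scripting to the portal, a database can also be created in a specific tier with T-SQL executed against the master database of the logical server. The following is only a hedged sketch: the EDITION and SERVICE_OBJECTIVE values mirror the tier and performance level names discussed in this article, but the exact options accepted during the preview may differ.
-- Run against the master database of the Azure SQL Database server.
-- Creates a Standard/S1 database with a 250 GB size cap (names and values are illustrative).
CREATE DATABASE NewTierSample
(
    EDITION = 'Standard',
    SERVICE_OBJECTIVE = 'S1',
    MAXSIZE = 250 GB
)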
Moreover, the Azure Management Portal is not the only place where scaling is restricted; you are limited in the same way when scaling the database directly on its database server (which, in the end, might have been obvious): running the following command returns a server-side error:
ALTER DATABASE AdventureWorks2012 MODIFY (Edition = 'Basic')
Msg 40823, Level 16, State 1, Line 1
Invalid value provided for parameter EDITION. Please provide a value that is valid on server version 1.0.
As you can see, the error states that the provided value is invalid for server version 1.0 (which corresponds to a server hosting Web, Business or Premium version 1 databases). Therefore, scaling to a tier that corresponds to server version 2.0 (Basic, Standard or Premium version 2) isn’t possible yet. However, this is only temporary: Microsoft is currently working on making this feature publicly available so that users can start upgrading their databases to the new tiers as soon as possible.
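Scaling within the new tiers themselves, for example from S1 to S2 on a version 2.0 server, can already be scripted. The snippet below is a hedged sketch: the SERVICE_OBJECTIVE option follows the performance level names introduced with the new tiers, and the syntax accepted during the preview may still change.
-- Run against the master database of a server hosting new-tier (version 2.0) databases.
ALTER DATABASE AdventureWorks2012 MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S2')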
Performance considerations to the new SQL Database tiers
If complaining about Azure SQL Database performance was legitimate up to a point until now, the new tiers have been introduced exactly to meet customers’ expectations in this regard. Basically, Microsoft has introduced these three new tiers together with six performance levels, designed for both light-weight and heavy-weight transactional application demands: Basic; Standard, which offers two performance levels, S1 and S2; and Premium, which offers three, namely P1, P2 and P3.
According to Microsoft, the Basic tier is designed for small databases that typically support a single active operation at any time. Size-wise, a Basic database starts at 100 MB, offers a 500 MB option that the Web tier does not, and currently tops out at 2 GB (see Table 1), whereas the Web tier goes up to 5 GB.
The Standard tier is designed as the middle ground and the go-to option for most cloud applications, where multiple concurrent connections and queries are run against the database.
Last but certainly not least, the Premium tier is designed for high transaction volumes, supporting a large number of concurrent connections.
In order to quantify the performance of each database tier, Microsoft has created the Azure SQL Database Benchmark (ASDB), a benchmark that runs a mix of common OLTP operations against Azure SQL Databases. Microsoft has also introduced a new term: the Database Throughput Unit (DTU), a way to describe the relative capacity of a performance level based on a blended measure of CPU, memory, and read and write rates (Azure SQL Database Service Tiers and Performance Levels, 2014). Since the DTU is given as a single integer per performance level, doubling the DTUs roughly means doubling the performance; for example, an S2 database at 25 DTUs should deliver about five times the throughput of an S1 database at 5 DTUs.
The table below is used as a reference in order to describe the performance of each new tier:
| Service Tier/Performance Level | DTU | Max DB Size | Max Worker Threads | Max Sessions | Benchmark Transaction Rate | Predictability |
| --- | --- | --- | --- | --- | --- | --- |
| Basic | 1 | 2 GB | 20 | 100 | 3,467 transactions per hour | Good |
| Standard/S1 | 5 | 250 GB | 50 | 200 | 283 transactions per minute | Better |
| Standard/S2 | 25 | 250 GB | 100 | 500 | 1,470 transactions per minute | Better |
| Premium/P1 | 100 | 500 GB | 200 | 2,000 | 98 transactions per second | Best |
| Premium/P2 | 200 | 500 GB | 400 | 4,000 | 192 transactions per second | Best |
| Premium/P3 | 800 | 500 GB | 1,600 | 16,000 | 730 transactions per second | Best |
Table 1 Azure SQL Database Performance Model (Azure SQL Database Service Tiers and Performance Levels, 2014)
Unfortunately, neither DTUs nor transaction rates are disclosed for the old Web and Business tiers, so choosing a specific tier might be difficult. However, after the Monitoring an Azure SQL Database section of this article, I will present different queries run against the AdventureWorks2012 sample database in order to compare performance when using the new tiers.
Monitoring an Azure SQL Database
The new servers offer a refreshed monitoring section too, meaning that new selectable metrics are available directly inside the portal:
Connections blocked by firewall
CPU Percentage
Log Writes Percentage
Physical Data Reads Percentage
Storage
Throttled Connections
Having these metrics available, determining whether your database requires scaling up or down becomes child’s play. Unlike before, when you had essentially no information on how high the CPU load was or how much disk IO your queries generated, you can now see this data directly in the portal (Figure 2 New Azure Portal Metrics for Azure SQL Database). Moreover, you also have the option of creating alert rules directly from the portal whenever any of these metrics hits a rule policy you define.
http://alexmang.ro/wp-content/uploads/2014/05/2-monitoring.png
Figure 2 New Azure Portal Metrics for Azure SQL Database
In order to apply a rule on one of these new metrics, follow these steps:
1. Inside the Azure Management Portal, select SQL Databases
2. Select a database scaled to any one of the new available tiers
3. Open the Monitoring tab
4. Add any one of these metrics: Storage, CPU Percentage or Physical Data Reads Percentage
5. With one of these metrics selected, click the ‘Add Rule’ button
6. Follow the steps in the wizard
http://alexmang.ro/wp-content/uploads/2014/05/3-rule.png
Figure 3 Using the Rules Feature
Another very useful way to monitor your database’s performance is the sys.resource_stats view, which returns data aggregated over 5-minute intervals. I will say more about it in the next section of this article.
Performance Considerations
UPDATE 31.05: After talking to Microsoft about the unexpected test results I got when comparing the Web and Business tiers with the new set of SQL Database tiers, I realized there is a major thing to keep in mind: comparing the old set of tiers with the new one is not only cumbersome but also largely irrational, because the old tiers were designed only to meet database size requirements. The new tiers, precisely because that old approach was considered flawed, also implement soft performance limits. This automatically means there is performance predictability, unlike the Web and Business tiers, which are completely unable to commit to a specific performance rate. There is a great blog post from Tobias Ternstrom about performance in the new set of tiers.
The scripting approach for measuring database resource usage is the sys.resource_stats view. Every Azure SQL Database server hosts this view in its master database and updates the aggregated data every 5 minutes. Among the information returned you get CPU cores used, physical read IOPS, physical write IOPS and more, which makes this view extremely useful for determining whether the current scale is too high or too low for your current usage. In my next article, I will describe ways in which you can automate database scaling based on both performance and storage considerations.
Due to the preview period of the new tiers, it might take more than 5 minutes for this data to be updated. For now, it is therefore a good idea to use this view just to make sure that the selected performance level or scale was correctly chosen. Use the following script to query the sys.resource_stats view:
SELECT *
FROM sys.resource_stats
WHERE database_name = 'AdventureWorks2012'
ORDER BY start_time DESC
More information on the sys.resource_stats view can be found here: http://msdn.microsoft.com/en-us/library/dn269979(v=sql.120).aspx. Moreover, please keep in mind that during the preview period, the Azure SQL Database team might turn data collection off for short periods of time. Also, according to the team, you should not take a dependency on the current implementation, because it might either change or be removed completely.
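To give an idea of how the view can help with scale decisions, here is a hedged sketch of an aggregation over the last 24 hours. The column names (avg_cpu_cores_used, avg_physical_read_iops, avg_physical_write_iops, storage_in_megabytes) are assumptions based on the preview documentation referenced above and may change before general availability.
-- Run against the master database; averages the 5-minute samples from the last 24 hours.
SELECT database_name,
       AVG(avg_cpu_cores_used)      AS avg_cpu_cores,
       AVG(avg_physical_read_iops)  AS avg_read_iops,
       AVG(avg_physical_write_iops) AS avg_write_iops,
       MAX(storage_in_megabytes)    AS max_storage_mb
FROM sys.resource_stats
WHERE database_name = 'AdventureWorks2012'
  AND start_time > DATEADD(HOUR, -24, GETUTCDATE())
GROUP BY database_name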
Because the use of DTUs might still be cryptic and because there is no official comparison between the existing generally available tiers (Web and Business) and the new tiers, I have decided to upload the AdventureWorks sample database to two different servers and scale the copies to both the old and the new tiers. The sample database is available online at http://bit.ly/AzureAWDbSample.
For the following tests, I have decided to include client statistics and consider the time results as valid test input.
This is how I tested the different available tiers:
First, I ran a COUNT query that joins another table on its clustered index, inside a WHILE loop repeated 3 times, all in a single batch, so that the server only replied after computing the whole set and returned a single row. Thus, I could use the ‘Wait on server reply’ result as an input to my test results. This test ran 5 times and the average results were considered.
Then, I ran an UPDATE command on the Person table of the AdventureWorks2012 database, updating the entire table. Once again, the ‘Wait on server reply’ result was considered.
Finally, I ran a query with a join on an unindexed column, with a TOP option so that the query returned only 100,000 rows.
All the queries were run over the same cabled network.
This is the test:
-- TEST 1
DECLARE @runTime INT
SET @runTime = 1
WHILE @runTime < 4
BEGIN
SELECT COUNT (s.SalesOrderDetailId)
FROM Sales.SalesOrderDetail s
INNER JOIN Production.Product p ON s.ProductID = p.ProductID
SET @runTime = @runTime + 1
END
-- TEST 2
UPDATE p
SET p.FirstName = p.FirstName + left(p.FirstName, 1)
FROM Person.Person p
-- TEST 3
SELECT TOP 100000 *
FROM Production.TransactionHistory th
INNER JOIN Production.TransactionHistoryArchive tha ON th.Quantity = tha.Quantity
And here are the results:
| | Web | S1 | S2 | P1 |
| --- | --- | --- | --- | --- |
| CPU | 14.88 | 17.3529 | 3.23 | 3.03 |
Looking at the resources used, however, using the sys.resource_stats view, I got the following results, aggregated over about an hour of querying the AdventureWorks database:
| | Web | S1 | S2 | P1 |
| --- | --- | --- | --- | --- |
| Test 1 | 145.5 | 2028 | 203.7142857 | 171.375 |
| Test 2 | 4356.2 | 101876.3333 | 69542.6 | 17205.83 |
| Test 3 | 3820.167 | 3484.833333 | 3592.571429 | 7350.8 |
Recover From ‘Oops’ Failures
During Build 2014, Scott Guthrie presented a new feature called Self-Service Restore, available in Azure SQL Database. Self-Service Restore is currently available only for the new SQL Database tiers and gives you the option of restoring your database to a historical copy.
The way this works is by accessing one of the database’s automated backups. While for the Web and Business tiers you had the option of configuring an automated export that ran periodically on a configurable schedule, and of afterwards creating a new database from one of the exports, that option is not available for the new tiers. Instead, the Azure SQL Database team came up with a much better solution for automatic backups, namely Self-Service Restore.
Since these database copies are actual databases hosted inside Azure, should you ever run an unfortunate query such as
UPDATE Sales.SalesOrderDetail
SET UnitPrice = 0
you are only left with the hope that your latest manual backup is recent enough and that not too much data will be lost. However, should you have forgotten to back up your database recently, the self-service restore feature allows you to restore your database to an automatic copy kept inside the datacenter. In effect you get a sort of ‘undo’ command for databases, and I personally believe you’ll love it.
In order to restore your database, all you have to do is open the Azure Management Portal and go to the SQL Databases section. After selecting one of the new-tier databases, click the ‘Restore’ button on the bottom command bar and the Restore window will pop up. You can select any point between the most recent copy of the database and the oldest existing copy, where ‘the oldest’ depends on the database’s tier: 24 hours back for Basic, 7 days for Standard and 35 days for Premium.
Dusk of Azure SQL Database Federations
The main reason for a database sharding technique is that Azure SQL Database runs on commodity hardware, in contrast to a database hosted on your own SQL Server, where you would usually buy hardware components suited to your data access patterns. Therefore, the way to a performant database, besides scaling the database up, is to scale it out; before the announcement of the retirement of the Web and Business tiers, a good approach for sharding your database was to use Azure SQL Database Federations. However, Federations aren’t supported in the newly announced tiers, which basically means that this feature will be removed altogether (at least in the form it’s known today).
The announcement that this feature won’t be supported any more, due to the retirement of the old tiers (Web and Business), might have come as a shock to several users, especially since most of the documentation online suggested that your application’s database should be scaled out using sharding techniques, just like the rest of your application.
However, the retirement of the old tiers along with Federations doesn’t mean that Microsoft no longer suggests splitting your database using sharding techniques; on the contrary, Microsoft still recommends sharding, but with a custom sharding technique instead of Federations. If you think about it, the recommendation is rather legitimate, since Federations lacked one or more critical features:
First of all, due to the complexity Federations added to your architecture, performance scaling hit a limit at approximately 10 shards, even though the technical limit is 148 databases (calculated as the maximum number of databases, 150, minus the master and root databases)
Second, whenever you split your database into a new shard, there was no way to merge that shard back into the main database, which was a huge gap for many scenarios
Third, Federations forced you as a developer to give up patterns of accessing individual databases
Last but not least, concurrent USE FEDERATION operations impact the overall performance of your main (root) database
If you shard your database your own way, these disadvantages no longer apply. However, as an application or database developer, you are now forced to invest in developing your own sharding pattern and to give special care to things like query routing, data movement between your shards and multi-database administration.
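To illustrate what ‘query routing’ means in a custom sharding pattern, here is a minimal sketch of a shard map table and a routing lookup, assuming CustomerId is the sharding key. All table, column and server names are illustrative and not part of any Azure feature.
-- A catalog database holds the shard map; each row maps a key range to a shard.
CREATE TABLE dbo.ShardMap
(
    ShardId      INT           NOT NULL PRIMARY KEY,
    RangeLowKey  INT           NOT NULL,  -- inclusive lower bound of the CustomerId range
    RangeHighKey INT           NOT NULL,  -- exclusive upper bound of the CustomerId range
    ServerName   NVARCHAR(128) NOT NULL,  -- e.g. 'myshardserver.database.windows.net'
    DatabaseName NVARCHAR(128) NOT NULL
)

-- Query routing: find the shard that owns a given CustomerId before opening a connection to it.
DECLARE @CustomerId INT = 42
SELECT ServerName, DatabaseName
FROM dbo.ShardMap
WHERE @CustomerId >= RangeLowKey AND @CustomerId < RangeHighKey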
Recover From Disasters
If you have been using Azure SQL Database in production so far, you have probably wondered at some point what the best disaster recovery strategy would be. So far, the options were either to manually create database copies (.bacpac files) and deploy them to secondary databases in other regions, or to create new VMs with SQL Server installed and run a SQL Data Sync agent on them so that you could replicate your data.
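As a side note, a one-off copy of a database can also be scripted with the CREATE DATABASE ... AS COPY OF syntax supported by Azure SQL Database; this is a different mechanism from the .bacpac exports mentioned above, and the server and database names below are hypothetical.
-- Run against the master database of the destination server.
CREATE DATABASE AdventureWorks2012_Copy AS COPY OF sourceserver.AdventureWorks2012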
Along with the announcement of the new tiers, a new feature has also been announced: Active Geo-Replication. Active Geo-Replication asynchronously replicates committed transactions from a database to up to four copies of the database on different servers, whether they are in the same region or in different regions. According to the MSDN documentation, the active secondary data is guaranteed to always be transactionally consistent with changes committed to the primary database. (Active Geo-Replication for Azure SQL Database, 2014)
Using Active Geo-Replication thus enables other key scenarios too, such as database migration (you can copy your database from one server to another with minimum downtime) and application upgrades (you can use the active secondary database as a failback option). (Active Geo-Replication for Azure SQL Database, 2014)
Moreover, given the fact that each replicated copy of the primary database is readable, you could also use your secondary databases as hosts for your long running read queries, such as report queries.
One thing to keep in mind, though, is that Active Geo-Replication is available only for the Premium tier, and all the secondary copies it creates will therefore be Premium databases too.
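For reference, creating an active secondary can also be expressed in T-SQL. This is a hedged sketch following the geo-replication syntax documented for Azure SQL Database; the secondary server name is hypothetical, and during the preview the portal may be the primary way to configure this.
-- Run against the master database of the server hosting the Premium primary database.
ALTER DATABASE [AdventureWorks2012]
ADD SECONDARY ON SERVER [secondaryserver]   -- hypothetical partner server
WITH (ALLOW_CONNECTIONS = ALL)              -- make the secondary readable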
Given that Active Geo-Replication normally copies your data to the other servers asynchronously (due to the high latency of WAN networks), you can call a system stored procedure named sp_wait_for_database_copy_sync to prevent the loss of critical data should a disaster occur right after a transaction has been committed; the procedure blocks the calling thread until all transactions committed on the primary have been replicated to the active secondary database.
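A hedged usage sketch follows: the parameter names are taken from the MSDN documentation for sp_wait_for_database_copy_sync, and the secondary server name is hypothetical.
-- Call on the primary database, right after committing the transaction you want protected;
-- the session blocks until all committed transactions reach the active secondary.
EXEC sys.sp_wait_for_database_copy_sync
     @target_server   = N'secondaryserver',
     @target_database = N'AdventureWorks2012'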
Other facts
One of the most important facts about the new SQL Database tiers is that even though they are not yet generally available, Microsoft fully supports them, meaning that you get support for these tiers through your Azure support subscription.
References
Active Geo-Replication for Azure SQL Database. (2014, April 24). Retrieved from MSDN Documentation:
http://msdn.microsoft.com/en-us/library/azure/dn741339.aspx
Azure SQL Database Service Tiers and Performance Levels. (2014, April 24). Retrieved from MSDN:
http://msdn.microsoft.com/en-us/library/azure/dn741336.aspx
Kelly, E. (2014, April 24). Azure SQL Database introduces new service tiers. Retrieved from Windows Azure MSDN Blogs: