Paging a Query with SQL Server
Introduction
Sometimes it is necessary to optimize the data returned by a query, removing unnecessary fields from a SELECT statement, and including conditions in your WHERE clause so that the user can retrieve only the data that really needed.
It is very important in the critical conditions for small companies, with limited hardware or software. The places where Internet access is limited and slow we also have the need to reduce network traffic data.
These are factors that require a query to retrieve only the data that will be displayed in the User Interface screen, excluding the extra that maybe used in another page, displaying of a "Grid View" or "Report".
One of the features that can be used to meet these requirements is the Paging in the SQL Server.
The main objective of this article is to show how a simple T-SQL script can be useful, when well implemented.
This article is part Data Paging serie.
Building the Environment for Testing ** **
So we can reproduce the use of pagination in SQL queries and their advantages for any environment where data are trafficked, we will create a table for demonstration with a one million rows. So we can better assess the performance of different ways to query data.
In the end of this article, we will evaluate the query performance under the following conditions:
1 - No pagination query in the SQL Server ( normally used by programmers, leaving the data pagination task to a component as a "GridView" or other similar);
2 - Pagination Query, using ROW_NUMBER (traditional model of data pagination in SQL Server);
3 - Pagination Query, using the OFFSET and FETCH clauses (new model, used from the SQL Server 2012+);
Let's create the table with sample data:
--CREATING A TABLE FOR DEMO
CREATE TABLE dbo.TB_EXAMPLE(
ID_EXAMPLE int NOT NULL IDENTITY(1,1),
NM_EXAMPLE varchar(25) NOT NULL,
DT_CREATE datetime NULL DEFAULT(GETDATE())
);
GO
-- INSERTING 1,000,000 DIFFERENT ROWS IN THE TABLE
INSERT INTO TB_EXAMPLE (NM_EXAMPLE) VALUES ('Item de paginação ' + CONVERT(VARCHAR,ISNULL(@@IDENTITY, 0)))
GO 1000000
--QUERYING 1,000,000 ROWS
SELECT * FROM TB_EXAMPLE
GO
See this output SQL script in the image below.
Paging rows with Limit ** **
In order to understand the pagination concepts in T-SQL, with ROW_NUMBER and with OFFSET / FETCH, let's define a result pattern in the T-SQL script, for an evaluation of the above queries.
We created two variables to facilitate the data manipulation:
@PageNumber - Sets the number of the page to display
@RowspPage - Sets the rows number of each page
After setting these variables, let's start using the function ROW_NUMBER, sorting the data by ID_EXAMPLE field.
See the script below, where we define to be displayed the "page 2" with "5 rows per page":
--VIEWING THE PAGE "2" WITH 5 ROWS
DECLARE @PageNumber AS INT, @RowspPage AS INT
SET @PageNumber = 2
SET @RowspPage = 5
SELECT * FROM (
SELECT ROW_NUMBER() OVER(ORDER BY ID_EXAMPLE) AS NUMBER,
ID_EXAMPLE, NM_EXAMPLE, DT_CREATE FROM TB_EXAMPLE
) AS TBL
WHERE NUMBER BETWEEN ((@PageNumber - 1) * @RowspPage + 1) AND (@PageNumber * @RowspPage)
ORDER BY ID_EXAMPLE
See this output SQL script in the image below.
We can see that a simple mathematical operation that sets the first and the last rows for each page.
As this example uses the ROW_NUMBER function is important to know that the first row is the position number "one".
In this case we define the @PageNumber variable with value equal to 2 and @RowspPage variable with value to 5. Then we have in the "page 2":
- the first row with the alias "NUMBER" equal to 6
- the last row with the alias "NUMBER" equal to 10
Paging in SQL Server 2012 ** **
In the SQL Server 2012 a new feature was added in the ORDER BY clause, to query optimization of a set data, making work easier with data paging for anyone who writes in T-SQL as well for the entire Execution Plan in SQL Server.
Below the T-SQL script with the same logic used in the previous example.
--CREATING A PAGING WITH OFFSET and FETCH clauses IN "SQL SERVER 2012"
DECLARE @PageNumber AS INT, @RowspPage AS INT
SET @PageNumber = 2
SET @RowspPage = 10
SELECT ID_EXAMPLE, NM_EXAMPLE, DT_CREATE
FROM TB_EXAMPLE
ORDER BY ID_EXAMPLE
OFFSET ((@PageNumber - 1) * @RowspPage) ROWS
FETCH NEXT @RowspPage ROWS ONLY;
See this output SQL script in the image below.
We can see that this mathematical operation is simple as the previous one, but the data control process is different due to the new features of the SQL Server 2012 for the ORDER BY clause.
This example uses the OFFSET function, then is important to know that the first record starts in the "zero" position.
In this case we define the @PageNumber variable with value 2 and the @RowspPage variable with value 10. So we have this page:
- the first row in the 11 position
- the last row in the 20 position
In this SELECT statement, the query in this table using OFFSET function makes simpler pagination, but this new query is more efficient than the query using **ROW_NUMBER **?
Now, we will analyze the performance.
Comparing the Queries - ROW_NUMBER vs OFFSET/FETCH ** **
We can see that the pagination of a SELECT statement is simple and has a considerable performance gain. In critical situations, the data paging feature can help to make the user's work very simple and fast.
Now we will evaluate the performance of three different queries under the following conditions:
**1 - ** Simple query without pagination in SQL Server (responsibility to maintain the pagination is of an Application ASP.Net, WinForms or other).
2 - Query with pagination, using ROW_NUMBER (traditional model of pagination in T-SQL);
3 - Query with pagination, using OFFSET e FETCH clauses (new model, using SQL Server 2012 or higher);
-- 1 - SIMPLE QUERY
SELECT * FROM TB_EXAMPLE ORDER BY ID_EXAMPLE
GO
See "script 1" in the image below.
We can see that the simple query without pagination, all the data are used by query and not only the data that the user needs.
This is a loss for resources: CPU, I/O disk and memory consumed on the SQL Server instance.
Moreover, the time for showing data to the user is much higher because it's necessary to prepare the content in the user interface(UI) for paging through a data collection. In large data volumes and concurrency from use of these same records, it can become a problem.
Let's see the cost of the Execution Plan in this query on SQL Server.
See Execution Plan of the "script 1" in the image below.
This Execution Plan shows the cost of doing the full tablescan due to the lack of an index on the table in a simplified structure.
**See the Client Statistics of the ****"script 1" **in the image below.
To analyze the query performance, we get 3 execution samples and mainly compared the Statistic Time ("Client processing time" and "Total execution time") with better performance.
In the first query (Script 1), That is a simple SELECT statement, the best performance has:
- Client processing time (ms) = 1728
- Total execution time (ms) = 1848
Network traffic is small for each client request, but very high for the server processing and to return data from one query to a client:
- Bytes sent from client = 310
**- **Bytes **received from server = 39983840
**
In addition to this excessive time, you will need to create a paging process in the "UI" or your application component ("Grid View", "Report" or other) create this pagination automatically.
Therefore, the processing time to data presentation will be even greater. Now we will run T-SQL script using ROW_NUMBER for data pagination.
NOTE: Remembering that this model of data pagination works with SQL Server 2005 or higher.
-- 2 - QUERY USING "ROW_NUMBER"
DECLARE @PageNumber AS INT, @RowspPage AS INT
SET @PageNumber = 2
SET @RowspPage = 10
SELECT * FROM (
SELECT ROW_NUMBER() OVER(ORDER BY ID_EXAMPLE) AS Numero,
ID_EXAMPLE, NM_EXAMPLE , DT_CREATE FROM TB_EXAMPLE
) AS TBL
WHERE Numero BETWEEN ((@PageNumber - 1) * @RowspPage + 1) AND (@PageNumber * @RowspPage)
ORDER BY ID_EXAMPLE
GO
See "script 2" in the image below.****
We can see that in the query using ROW_NUMBER, data pagination provides only rows that the user needs.
This makes the query more efficient, reducing: CPU process, I/O disk and memory, consumed by SQL Server instance.
The time to display data to user also decreases because the data content is ready to view the data through of a user interface (UI).
Let's see the cost of the Execution Plan in this query on SQL Server.
**See Execution Plan of the **"script 2" in the image below (click image to enlarge).
In this Execution Plan, we see a more complex structure where the SQL Server organizes to obtain only the data requested in the query.
Although this processing is much larger than shown in the Execution Plan previous (script 1), the running time will be shorter due to the amount of data that is returned to the application.
Now we will evaluate the performance of "script 2" under the following conditions.
See the "script 2" Client Statistics in the image below
In this second query (script 2 - using ROW_NUMBER), the best execution time has:
- Client processing time (ms) = 4
**- Total execution time (ms) = 156
**
The client network traffic have a high cost to sent data, but is reduced considerably to the server send data to client application:
- Bytes sent from client = 988
**- **Bytes received from server = 35352
In addition to this short time, with little more than 10 times the query time in "script 1", we can easily see that the reduction of rows returned by query also reflect the decreased amount of bytes sent from the server to the client.
Therefore, the processing time of the data and content presentation to user decreases considerably and reduces network traffic between the server and the client application.
We now execute the T-SQL script using OFFSET/FETCH for data paging.
Remember: the OFFSET/FETCH pagination queries only work with SQL Server 2012 or higher.
-- 3 - QUERY USING "OFFSET" AND "FETCH NEXT" (SQL SERVER 2012)
DECLARE @PageNumber AS INT, @RowspPage AS INT
SET @PageNumber = 2
SET @RowspPage = 10
SELECT ID_EXAMPLE, NM_EXAMPLE, DT_CREATE
FROM TB_EXAMPLE
ORDER BY ID_EXAMPLE
OFFSET ((@PageNumber - 1) * @RowspPage) ROWS
FETCH NEXT @RowspPage ROWS ONLY
GO
**See "Script 3" in the image below.
**
We can see that the query using** OFFSET/FETCH**, the data pagination provides only rows that the user needs, similar to the "Script 2" (to use ROW_NUMBER).
The data query is also reduced and the process is even more simple.
Let's see the cost of the Execution Plan in this query on SQL Server.
**See Execution Plan of the **"script 3" in the image below.
In this Execution Plan, we see a simple structure where the SQL Server can obtain only the data requested in the query.
Although this processing be similar to that shown in the Execution Plan of the "Script 1", the execution time will be shorter due to the amount of data that are returned to the user in the application.
Now we will evaluate the performance of "script 3" under the following conditions.
See the "script 3" Client Statistics in the image below.
In the third query (script 3 - using OFFSET/FETCH), the best execution time has:
- Client processing time (ms) = 4
- Total execution time (ms) = 130
The client network traffic has a high cost to sent data, but is reduced considerably to the server send data to client application:
- Bytes sent from client = 804
**- **Bytes **received from server = 21563
**
The processing time was more reduced, with performance a bit better than query of the "script 2" (that use ROW_NUMBER), we can see the reduction in Bytes sent and received, which consequently reduced impact the execution time of the whole process.
Conclusion
We can see that the pagination of a SELECT statement is simple to set up and can be considered an excellent resource for large amounts of data.
In the data optimization query by T-SQL has a considerable gain both using the ROW_NUMBER as use of OFFSET / FETCH, but taking into account the analysis of these data we can see that the "script 3" query (using OFFSET / FETCH ) may be the best option for large volumes of rows, if you use a SQL server 2012 version.
References
See Also
- Wiki: Portal of TechNet Wiki Portals
- How to insert Serial Numbers in repeater and gridview in ASP.NET
- SQL Server: How to Find the First Available Timeslot for Scheduling
- Transact-SQL Portal
Other Languages
This article was awarded the silver medal in the TechNet Guru of March 2014