SQL Pagination for bulk data transfer with Logic Apps
Introduction
Data transfer is one of the key tasks for integration developer and if it is a bulk data transfer with rows in millions for single call to database then the developer needs to think about the approach which will suit the solution
In this TechNet wiki, we will go through the method of SQL pagination and how we can implement pagination within Azure Logic Apps .We will do a bulk data extract from Azure SQL database and send to blob storage from were partners can read the files for processing. For those who are new to concept of pagination
“Pagination is the process of dividing the transaction result set into discrete subsets. “.
Pagination helps you to control the result output and provide cleaner structure to the result set. This means if you have a record-set of 100 k then you can divide it into 50 k each or a recordset as per your business requirement. This will be called as page size. The more generalized definition can be found on wiki https://en.wikipedia.org/wiki/Pagination .
The more generalized example of pagination can be seen on Bing or Google where your search result is being divided across multiple pages.
The concept of pagination is cross data platform and you can use the method of pagination within Oracle, DB2 with OFFSET/FETCH clause.
Solution Design
In this solution, we are developing an interface for fictions company Adventure Works which holds the product data within Adventure Works DB. The required data is distributed across multiple tables and Adventure Works partner organization requires these data to be replicated on daily basis to import within their own data warehouse system. The exchange file format should be comma separated format (csv) between both the organizations.
The application will be using Azure Logic Apps and SQL connector to build a simple solution which will be triggered on daily basis and store the product csv data within Azure Blob storage.
The workflow action being listed below
- Logic App is configured with recurrence trigger with recurrence interval set to 1 day.
- Logic App has SQL connector configured to call stored procedure LogicAppPaginationSample.
- The stored procedure set the page count and page size as an input parameter.
- LogicAppPaginationSample stored procedure is looped within Do until loop till the result set is empty.
Prerequisite
To work with the scenario, you need to have a valid subscription to Microsoft Azure. You can register for free subscription at link: Jump https://azure.microsoft.com/en-us/free/
Implementation
For this solution, we have used the sample Adventure Works database. We have created a stored procedure LogicAppPaginationSample which makes a cross join within Product, Product Category, and Product Model. The number of rows returned in single transaction is 1548610
The sample stored procedure is listed below
Create Procedure LogicAppPaginationSample
@pageNumber int,
@pageSize Int
As begin
select
product.[ProductNumber] As [ProductNumber],
product.[Name] As ProductName,
product.[ListPrice] As ListPrice,
ProductCategory.[Name] As CategoryName,
ProductModel.[Name] As ModelName,
ProductDescription.[Description] As [ProductDescription]
FROM [SalesLT].product product
cross join [SalesLT].ProductCategory ProductCategory
cross join [SalesLT].ProductModel ProductModel
left join [SalesLT].ProductDescription ProductDescription on
ProductDescription.ProductDescriptionID=ProductModel.ProductModelID
ORDER BY product.ProductID
OFFSET @PageSize * (@PageNumber - 1) ROWS
FETCH NEXT @PageSize ROWS ONLY
END
To test the Pagination within SQL Management Studio you can execute the stored procedure with page size and page number. The SQL Query will be like below
DECLARE @return_value int
EXEC @return_value = [dbo].[LogicAppPaginationSample]
@pageNumber = 2,
@pageSize = 5000
SELECT 'Return Value' = @return_value
GO
The next step is to build the logic apps and associated blob container to store the product details as csv.
The logic app workflow will be like below
It will contain certain set of actions with recurrence trigger set to 1 hour. The related action are Do-Until Loop, Create CSV Table, and a call to the stored procedure which is being created in earlier step.
There are certain consideration while dealing with large volume transaction data
- Select Proper azure SQL tier which will give you max performance benefit
- Do indexing on the table for fast querying
- Build Logic App with minimum actions.
The Actual run history will look like below where the stored procedure will fetch 50 K records each time until the SQL result set is set to empty.
Once the Logic app workflow is completed, you can see the csv files being created at the blob storage. With this basic workflow, you can implement complex business case with joining event grid or artificial intelligence of the data extracted. You can de-batch the record set to initiate another workflow which can do crud operation with other data contracts.
For this sample solution navigate to the blob storage and you will multiple product files generated through logic apps and each file will hold 50 K records which with Header value set. The conversion of JSON to CSV is being done through a newly added connector Create CSV table which takes JSON array as input and convert the result set into CSV string.
To verify the result set you can download any of the csv files created and open with excel or notepad++ to get the actual row count and result set. In this sample, we have used notepad to open one of the file content.
Conclusion
SQL pagination is very much essential when you are doing data operation with the large set of records. It has in memory benefits. With SQL pagination you are controlling the output record set of the stored procedure which might be essential for any downstream system which would not process big record set at one time
See Also
An important place to find a huge amount of BizTalk related articles is the TechNet Wiki itself.
The best entry point is BizTalk Server Resources on the TechNet Wiki. Another important place to find Logic App related articles is the TechNet Wiki itself. The best entry point is Logic App Resources on the TechNet Wiki.