Index Build strategy in SQL Server - Part 3: Offline Serial/Parallel Partitioning

There are two main categories of partitioned index build:

  • Aligned (when base object and in-build index use the same partition schema)
  • Not- Aligned (when heap and index use different partition schemas (including the case when base object is not partitioned at all and in-build index use partitions))

(see Index Build strategy in SQL Server - Introduction (II)

 

Aligned partitioned index build

 

Aligned partitioned serial index build

 

     NL

                / \

             CTS Builder (write data to the in-build index)

                           \

                        [Sort] (order by index key) <-- optional

                             \

                          Scan (read data from source)

 

CTS: Constant Table Scan (the purpose of CTS is to provide partition IDs for index builder)

NL: Nested Loop

 

In case of aligned partition index build Constant Table Scan provides partition IDs to the inner side of Nested Loop so we can build one partition at a time – for each partition ID provided by CTS (outer side of the NL), inner side would build the index for that partition (not the entire index). Sort table gets created for each partition, but as we are doing it one by one and we are building final b-tree for each partition one by one we don’t need to keep this sort table for each partition at the same time. As a result we have only one sort table at any given moment.

How does it affect disc space requirements:

- In case of sorting in user’s database (default setting) we are actually sorting in each filegroup for each corresponding partition, which means we will need the same 2.2*(Size of the partition) for each filegroup. For example: let’s say we have 3 partitions located in filegroups FG1, FG2, FG3 and index data takes 1Gb, 2Gb, and 3Gb respectively. In this case we will need 2.2*1 = 2.2Gb of free space in FG1; 2.2*2 = 4.4Gb of free space in FG2, and 2.2*3 = 6.6Gb of free space in FG3. That means we will need total 9.9Gb of free space on the disk(s).

- In case of using Sort_in_tempdb (SORT_IN_TEMPDB = ON) option we can reuse the same space of tempdb for the sort table and as we are sorting partitions one by one we actually will need only 2.2*(Size of the biggest partition) of free space in tempdb. For example: let’s look at the case described above. We will need only 3.3Gb of free space in tempdb (this size of free space will let us build smaller partitions and then reuse this space for building the biggest partition).

 

Memory consideration:

We will have only one sort tables at the same time and we will need at least 40pages for the sort table to be able to start the index build operation. So, the minimum required memory will be 40pages.

Total memory = minimum required memory + additional memory*.

*additional memory calculated as row size multiplied by the estimated row count provided by Query Optimizer.  

 

Posted by: Lyudmila Fokina

Comments