SQL 2012 Split Partition Problem: A severe error occurred on the current command. The results, if any, should be discarded

We encountered an unexpected exception error a few months ago while attempting to split an existing partition for the next future month. All the details of the problem, troubleshooting, and solution are provided below, but the root cause was determined to be a product limitation in SQL2K12 RTM to which a CU is expected by Feb/Mar 2013. Until then the only workaround is to drop all existing partitioned indexed views, split the partition, then re-create the partitioned indexed views.

Problem: 16TB data warehouse partitioned by month on a datetime key column. A new file group, data file, and partition are created for the next month during the last week of the current month. This monthly maintenance task occurred without issue while DW was hosted in SQL2K8 R2, however, it failed after the DW was migrated to a new server running SQL2K12 RTM. The FG and file created successfully and the NEXT USED command completed successfully. The exact failure point occured at the following command:

ALTER PARTITION FUNCTION PFEventDateTimeRangeByMonth ()

SPLIT RANGE (N'2012-10-01T00:00:00.000') 

The command would run for several seconds before generating a stack dump in the SQL log and returning the error "A severe error occurred on the current command. The results, if any, should be discarded" in the results pane.

 

Troubleshooting: Attempted all of the following to try and resolve ~

0) Verified that all other monthly partition operations in 11 other DW's on the same server were working just fine.

1) Tried splitting the partition using a variety of date ranges, file group, and data file configurations.

2) Thought maybe a recent partition switch/merge operation for old data might have precipitated the problem. Restored the last 2K8 full backup before the switch/merge had occurred and tried split again unsuccessfully (BTW - customers loved having the DW offline for 6 days while I restored the db and reloaded the delta data for no reason)

3) Restored same full backup to Test server and ran full CHECKDB. It surprisingly finished within a faster than anticipated 48 hours, but no corruption was found.

4) Escalated the issue to the product team where they confirmed a product limitation that would need to be resolved via CU. In the meantime, they worked diligently to find a workaround.

 

Resolution:  The product team explained that the bug was related to how the indexed view metadata was unexpectedly blocking the split operation. Not sure that means the metadata was corrupted, but rather it did not change to an expected state as part of the split operation. We see similar issues from time-to-time with trans. replication publication metadata in SQL2K8.  

The workaround identified required us to drop the 2 indexed views, split the partition, then rebuild the indexed views. Not knowing for sure when the CU would be released, we opted to limit risk by creating new partitions for the next 8 months. When recreating the last indexed view (and the largest one) the operation failed and returned the same severe error from the original issue, however, no stack dump or exception was generated in the SQL log. Because the production maintenance window was closing fast, I decided to reduce the number of records in the indexed view by 50% by adjusting the WHERE filter. We were fairly confident the new range would satisfy most business use case scenarios. Thankfully the 2nd attempt with the adjusted filter worked after a few hours build time.

- Jonathan Foster