SQL Server supports table/index partitions that store multiple chunks of row data in different partitions. Starting in 2016 SP1, this is no longer just an enterprise edition-only feature. In a partitioned table/index, data in different partitions can be spread across multiple file groups. The awesome thing about partitioning is it is treated as one single logical structure during queries, pretty transparent to the end-users.
Why do I need to partition?
You have a very large database with tables storing data since the Cretaceous Period. However, your end-users don’t even go that far back and only want reports with revolving 12 months of data. Some analysts surprise you by writing aggregate queries to get the data from the past 3 years. Now is the time to think about Partitions (Dude, it is too late to even utter a word about your data retention policy, but still remember it is better late than never). Let us understand the concept and key benefits of partitions below.
- Loading or removing a large volume of data from a partitioned table is fast (Partition Switch). A partition switch can happen only if indexes are partition aligned as well. Aligned indexes are essentially indexes that are built on the same partition scheme as its base table but do not necessarily have to use the same partition function.
- Loading data from application databases to partitioned warehouse fact tables will be efficient and less time-consuming. Also accessing a small subset of data can be quicker.
- Boom, your analysts will thank you for enhanced query performance (only if they use it right). Using a partitioning key in the query clause, the optimizer will access only relevant partitions (Partition Elimination). When joining partitioned tables, make sure your partitioning columns are the same as the columns on which the tables are joined to get better performance.
- Maintenance on partitioned tables becomes easier to manage like targeting certain partitions at a time to rebuild indexes.
How to create a partitioned table?
Partitions can be created on one filegroup but ideally, you want to create multiple filegroups across different storage tiers to store different partitions. This can help you perform backups on partitions.
- Create File Groups and add Files
- Create a Partition Function – Maps data of a table/index into partitions based on the values of a specified column known as a Partitioning Column. RIGHT includes the lower boundary (< boundary) and LEFT includes the upper boundary (<= boundary).
- Create a Partition Scheme – Maps partitions of a partitioned table/index to filegroups. Adding an additional filegroup [Primary] will help in case where you forget to add new ranges and new filegroups as rows not belonging to any partition ranges will be stored there.
- Create or modify a table/index and specify the partition scheme.
/* Create File Groups and add Files */ /* Add 2 new filegroups */ ALTER DATABASE PartitionDB ADD FILEGROUP Partition_FG1 GO ALTER DATABASE PartitionDB ADD FILEGROUP Partition_FG2 GO -- Add one file for each filegroup ALTER DATABASE PartitionDB ADD FILE (NAME = Partition_F1, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\Partition_F1.ndf', SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB ) TO FILEGROUP Partition_FG1; GO ALTER DATABASE PartitionDB ADD FILE (NAME = Partition_F2, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\Partition_F2.ndf', SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB ) TO FILEGROUP Partition_FG2; GO ----------------------------------------------------------------------------------- /* Create a Partition Function */ CREATE PARTITION FUNCTION PF_PartitionbyMonth (datetime) AS RANGE RIGHT FOR VALUES (N'2021-01-01T00:00:00.000', N'2021-02-01T00:00:00.000'); ----------------------------------------------------------------------------------- /* Create a Partition Scheme */ CREATE PARTITION SCHEME PS_PartitionbyMonth AS PARTITION PF_PartitionbyMonth TO (Partition_FG1, Partition_FG2, Primary) ; ----------------------------------------------------------------------------------- /* Create or modify a table/index and specify the partition scheme. */ CREATE TABLE PartitionTable (PartitioningColumn datetime PRIMARY KEY, ColumnDescription varchar(250)) ON PS_PartitionbyMonth (PartitioningColumn) ;
Automate table partitioning
With initial planning, a maximum range can be estimated and added to the partition function. However, new ranges and even filegroups need to be added, and should not be a manual process. Create a SQL Agent Job to run on a schedule that finds partition functions whose range is coming to an end and adds new ranges and creates new filegroups as needed.
/* Tip: Run the below query for any partitioned table to get details about partitions.*/ SELECT DISTINCT o.name as table_name, rv.value as partition_range, fg.name as file_groupName, p.partition_number, p.rows as number_of_rows FROM sys.partitions p INNER JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id INNER JOIN sys.objects o ON p.object_id = o.object_id INNER JOIN sys.system_internals_allocation_units au ON p.partition_id = au.container_id INNER JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id INNER JOIN sys.partition_functions f ON f.function_id = ps.function_id INNER JOIN sys.destination_data_spaces dds ON dds.partition_scheme_id = ps.data_space_id AND dds.destination_id = p.partition_number INNER JOIN sys.filegroups fg ON dds.data_space_id = fg.data_space_id LEFT OUTER JOIN sys.partition_range_values rv ON f.function_id = rv.function_id AND p.partition_number = rv.boundary_id WHERE o.object_id = OBJECT_ID('PartitionTable');