Problem
When you work on SQL 2000 or some higher version but not Enterprise edition, you often encounter the difficulty for creating table partitioning. You often create some base tables such as Sales2007, Sales2008,… to store yearly data and then create a view to query those data. This approach is sometime called physical partitioned table
However, if you have Enterprise Edition, it is very easy for us to create it and called logical partitioned table.
Implementation
Partitioned table is to improve performance of our queries basically. In this article, I would like to demonstrate how we create a partitioned table. I have 5 million rows of data that contains Sales data from 2010 to 2015. Now, I need to query data of 2010. How can you think about the storage solution to return data in good performance.
The idea of partitioning is to split data into separate data set so we can query exact data set instead we must go to whole data.
Determine partitioning columns
The most important factor is to determine which columns we will use to split data such as a ID column, a Date column. In my example, I determined to use OrderDate (datetime data type) for partitioning table.
Create new file groups
Why do we need new file groups? By default, a database has only a PRIMARY file group. When we want to use a partitioned table, we have to determine how many partitions we want to spread out data? In my example, my Sales data contains 2009 to 2015 so I will split data as below
< 01-01-2010
01-01-2010 to 12-31-2010
01-01-2011 to 12-31-2011
01-01-2012 to 12-31-2012
01-01-2013 to 12-31-2013
> 01-01-2014
USE [master]
GO
IF NOT EXISTS (SELECT name FROM sys.databases WHERE name = N’SalesPartition’)
BEGIN
CREATE DATABASE [SalesPartition]
CONTAINMENT = NONE
ON PRIMARY
( NAME = N’SalesPartition’, FILENAME = N’D:\DataSQL2014\SalesPartition.mdf’ , SIZE = 5120KB , MAXSIZE = UNLIMITED, FILEGROWTH = 10240KB ),
FILEGROUP [SalesPartition2009]
( NAME = N’SalesPartition2009′, FILENAME = N’D:\DataSQL2014\SalesPartition2009.ndf’ , SIZE = 10240KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2010]
( NAME = N’SalesPartition2010′, FILENAME = N’D:\DataSQL2014\SalesPartition2010.ndf’ , SIZE = 10240KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2011]
( NAME = N’SalesPartition2011′, FILENAME = N’D:\DataSQL2014\SalesPartition2011.ndf’ , SIZE = 10240KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2012]
( NAME = N’SalesPartition2012′, FILENAME = N’D:\DataSQL2014\SalesPartition2012.ndf’ , SIZE = 10240KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2013]
( NAME = N’SalesPartition2013′, FILENAME = N’D:\DataSQL2014\SalesPartition2013.ndf’ , SIZE = 73728KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2014]
( NAME = N’SalesPartition2014′, FILENAME = N’D:\DataSQL2014\SalesPartition2014.ndf’ , SIZE = 235520KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2015]
( NAME = N’SalesPartition2015′, FILENAME = N’D:\DataSQL2014\SalesPartition2015.ndf’ , SIZE = 1048576KB , MAXSIZE = 2GB , FILEGROWTH = 1024KB )
LOG ON
( NAME = N’SalesPartition_Retail_log’, FILENAME = N’D:\DataSQL2014\SalesPartition_log.mdf’ , SIZE = 634880KB , MAXSIZE = 2048GB , FILEGROWTH = 10240KB )
ENDGO
ALTER DATABASE [SalesPartition] SET COMPATIBILITY_LEVEL = 120
GOALTER DATABASE [SalesPartition] SET ANSI_NULL_DEFAULT OFF
GOALTER DATABASE [SalesPartition] SET ANSI_NULLS OFF
GOALTER DATABASE [SalesPartition] SET ANSI_PADDING OFF
GOALTER DATABASE [SalesPartition] SET ANSI_WARNINGS OFF
GOALTER DATABASE [SalesPartition] SET ARITHABORT OFF
GOALTER DATABASE [SalesPartition] SET AUTO_CLOSE OFF
GOALTER DATABASE [SalesPartition] SET AUTO_SHRINK OFF
GOALTER DATABASE [SalesPartition] SET AUTO_UPDATE_STATISTICS ON
GOALTER DATABASE [SalesPartition] SET CURSOR_CLOSE_ON_COMMIT OFF
GOALTER DATABASE [SalesPartition] SET CURSOR_DEFAULT GLOBAL
GOALTER DATABASE [SalesPartition] SET CONCAT_NULL_YIELDS_NULL OFF
GOALTER DATABASE [SalesPartition] SET NUMERIC_ROUNDABORT OFF
GOALTER DATABASE [SalesPartition] SET QUOTED_IDENTIFIER OFF
GOALTER DATABASE [SalesPartition] SET RECURSIVE_TRIGGERS OFF
GOALTER DATABASE [SalesPartition] SET ENABLE_BROKER
GOALTER DATABASE [SalesPartition] SET AUTO_UPDATE_STATISTICS_ASYNC OFF
GOALTER DATABASE [SalesPartition] SET DATE_CORRELATION_OPTIMIZATION OFF
GOALTER DATABASE [SalesPartition] SET TRUSTWORTHY OFF
GOALTER DATABASE [SalesPartition] SET ALLOW_SNAPSHOT_ISOLATION OFF
GOALTER DATABASE [SalesPartition] SET PARAMETERIZATION SIMPLE
GOALTER DATABASE [SalesPartition] SET READ_COMMITTED_SNAPSHOT OFF
GOALTER DATABASE [SalesPartition] SET HONOR_BROKER_PRIORITY OFF
GOALTER DATABASE [SalesPartition] SET RECOVERY SIMPLE
GOALTER DATABASE [SalesPartition] SET MULTI_USER
GOALTER DATABASE [SalesPartition] SET PAGE_VERIFY CHECKSUM
GOALTER DATABASE [SalesPartition] SET DB_CHAINING OFF
GOALTER DATABASE [SalesPartition] SET FILESTREAM( NON_TRANSACTED_ACCESS = OFF )
GOALTER DATABASE [SalesPartition] SET TARGET_RECOVERY_TIME = 0 SECONDS
GOALTER DATABASE [SalesPartition] SET DELAYED_DURABILITY = DISABLED
GOALTER DATABASE [SalesPartition] SET READ_WRITE
GO
Create a partition function
Microsoft defined a partition function “A database object that defines how the rows of a table or index are mapped to a set of partitions based on the values of certain column, called a partitioning column. That is, the partition function defines the number of partitions that the table will have and how the boundaries of the partitions are defined. For example, given a table that contains sales order data, you may want to partition the table into twelve (monthly) partitions based on a datetime column such as a sales date.”
Syntax
CREATE PARTITION FUNCTION <function name>(data type of partitioning column) AS RANGE LEFT | RIGHT FOR VALUES (list of values)
All data types are valid except except text, ntext, image, xml, timestamp, varchar(max), nvarchar(max), varbinary(max), alias data types, or CLR user-defined data types.
In my example, I determined to use a date time column
IF NOT EXISTS (SELECT * FROM sys.partition_functions WHERE name = N’SalesRetailRange’)
CREATE PARTITION FUNCTION [SalesRetailRange](datetime) AS RANGE RIGHT FOR VALUES ( N’2010-01-01′,N’2011-01-01′,
N’2012-01-01′,
N’2013-01-01′,
N’2014-01-01′)
It means that I want to have the list of subset data as below. There are 6 partitions
What is difference between LEFT and RIGHT? Let me give you an example
CREATE PARTITION FUNCTION [Testing1](INT) AS RANGE LEFT FOR VALUES(10,20) It means there are 3 partitions
CREATE PARTITION FUNCTION [Testing1](INT) AS RANGE RIGHT FOR VALUES(10,20) It means there are also 3 partitions but values are different on every partition
Create a partition scheme
A database object that maps the partitions of a partition function to a set of filegroups. The primary reason for placing your partitions on separate filegroups is to make sure that you can independently perform backup operations on partitions. This is because you can perform backups on individual filegroups.
IF NOT EXISTS (SELECT * FROM sys.partition_schemes WHERE name = N’SalesRetailScheme’)
CREATE PARTITION SCHEME [SalesRetailScheme] AS PARTITION [SalesRetailRange] TO ([Sales2009], [Sales2010], [Sales2011], [Sales2012], [Sales2013], [Sales2014], [Sales2015])
Create a partitioned table
IF OBJECT_ID(‘Sales’) IS NOT NULL
DROP TABLE SalesGO
CREATE TABLE Sales
(
SalesOrderID int NOT NULL IDENTITY (1, 1) NOT FOR REPLICATION,
OrderDate datetime NOT NULL, — Partion column
DueDate datetime NULL,
ShipDate datetime NULL,
Status tinyint NULL,
SalesOrderNumber nvarchar(50),
CustomerID nvarchar(50),
SalesPersonID nvarchar(50),
TerritoryID nvarchar(50),
BillToAddressID nvarchar(50),
ShipToAddressID nvarchar(50),
ShipMethodID nvarchar(50),
CreditCardID nvarchar(50),
CreditCardApprovalCode nvarchar(15) NULL,
CurrencyRateID nvarchar(50),
SubTotal nvarchar(50),
TaxAmt nvarchar(50),
Freight nvarchar(50),
TotalDue nvarchar(50),
Comment nvarchar(128) NULL,
CarrierTrackingNumber nvarchar(25) NULL,
OrderQty nvarchar(50) NULL,
UnitPrice nvarchar(50) NULL,
UnitPriceDiscount nvarchar(50) NULL,
LineTotal nvarchar(50),
SnapshotDate Datetime
) ON SalesRetailScheme([OrderDate])
GO
The difference is to identify which FILEGROUP. Normally, you must identity which FILEGROUP when you create a table such PRIMARY filegroup. In case of partitioning, we must identify the partition scheme, it will help use control how data will be partitioning?
Import data
This step is to import data into Sales table that is the partitioned table. Please contact me to get source data
Query partitioned table
After inserting data into Sales tables, let’s check the partitions to see how data is partitioned
SELECT o.name objectname, partition_id, partition_number, [rows]
FROM sys.partitions p
INNER JOIN sys.objects o ON o.object_id=p.object_id
WHERE o.name LIKE ‘%Sales%’
Benefit of partitioning and limitation
Click here to understand how a partitioned table improve performance and what limitation?