Problem

When you work on SQL 2000 or some higher version but not Enterprise edition, you often encounter the difficulty for creating table partitioning.  You often create some base tables such as Sales2007, Sales2008,… to store yearly data and then create a view to query those data. This approach is sometime called physical partitioned table

However, if you have Enterprise Edition, it is very easy for us to create it and called logical partitioned table.

Implementation

Partitioned table is to improve performance of our queries basically. In this article, I would like to demonstrate how we create a partitioned table. I have 5 million rows  of data that contains Sales data from 2010 to 2015. Now, I need to query data of 2010. How can you think about the storage solution to return data in good performance.

The idea of partitioning is to split data into separate data set so we can query exact data set instead we must go to whole data.

Determine partitioning columns

The most important factor is to determine which columns we will use to split data such as a ID column, a Date column. In my example, I determined to use OrderDate (datetime data type) for partitioning table.

Create new file groups

Why do we need new file groups? By default, a database has only a PRIMARY file group. When we want to use a partitioned table, we have to determine how many partitions we want to spread out data? In my example, my Sales data contains 2009 to 2015 so I will split data as below

< 01-01-2010

 

01-01-2010 to 12-31-2010

01-01-2011 to 12-31-2011

01-01-2012 to 12-31-2012

01-01-2013 to 12-31-2013

> 01-01-2014

USE [master]
GO
IF NOT EXISTS (SELECT name FROM sys.databases WHERE name = N’SalesPartition’)
BEGIN
CREATE DATABASE [SalesPartition]
CONTAINMENT = NONE
ON PRIMARY
( NAME = N’SalesPartition’, FILENAME = N’D:\DataSQL2014\SalesPartition.mdf’ , SIZE = 5120KB , MAXSIZE = UNLIMITED, FILEGROWTH = 10240KB ),
FILEGROUP [SalesPartition2009]
( NAME = N’SalesPartition2009′, FILENAME = N’D:\DataSQL2014\SalesPartition2009.ndf’ , SIZE = 10240KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2010]
( NAME = N’SalesPartition2010′, FILENAME = N’D:\DataSQL2014\SalesPartition2010.ndf’ , SIZE = 10240KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2011]
( NAME = N’SalesPartition2011′, FILENAME = N’D:\DataSQL2014\SalesPartition2011.ndf’ , SIZE = 10240KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2012]
( NAME = N’SalesPartition2012′, FILENAME = N’D:\DataSQL2014\SalesPartition2012.ndf’ , SIZE = 10240KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2013]
( NAME = N’SalesPartition2013′, FILENAME = N’D:\DataSQL2014\SalesPartition2013.ndf’ , SIZE = 73728KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2014]
( NAME = N’SalesPartition2014′, FILENAME = N’D:\DataSQL2014\SalesPartition2014.ndf’ , SIZE = 235520KB , MAXSIZE = 1048576KB , FILEGROWTH = 1024KB ),
FILEGROUP [SalesPartition2015]
( NAME = N’SalesPartition2015′, FILENAME = N’D:\DataSQL2014\SalesPartition2015.ndf’ , SIZE = 1048576KB , MAXSIZE = 2GB , FILEGROWTH = 1024KB )
LOG ON
( NAME = N’SalesPartition_Retail_log’, FILENAME = N’D:\DataSQL2014\SalesPartition_log.mdf’ , SIZE = 634880KB , MAXSIZE = 2048GB , FILEGROWTH = 10240KB )
END

GO

ALTER DATABASE [SalesPartition] SET COMPATIBILITY_LEVEL = 120
GO

ALTER DATABASE [SalesPartition] SET ANSI_NULL_DEFAULT OFF
GO

ALTER DATABASE [SalesPartition] SET ANSI_NULLS OFF
GO

ALTER DATABASE [SalesPartition] SET ANSI_PADDING OFF
GO

ALTER DATABASE [SalesPartition] SET ANSI_WARNINGS OFF
GO

ALTER DATABASE [SalesPartition] SET ARITHABORT OFF
GO

ALTER DATABASE [SalesPartition] SET AUTO_CLOSE OFF
GO

ALTER DATABASE [SalesPartition] SET AUTO_SHRINK OFF
GO

ALTER DATABASE [SalesPartition] SET AUTO_UPDATE_STATISTICS ON
GO

ALTER DATABASE [SalesPartition] SET CURSOR_CLOSE_ON_COMMIT OFF
GO

ALTER DATABASE [SalesPartition] SET CURSOR_DEFAULT GLOBAL
GO

ALTER DATABASE [SalesPartition] SET CONCAT_NULL_YIELDS_NULL OFF
GO

ALTER DATABASE [SalesPartition] SET NUMERIC_ROUNDABORT OFF
GO

ALTER DATABASE [SalesPartition] SET QUOTED_IDENTIFIER OFF
GO

ALTER DATABASE [SalesPartition] SET RECURSIVE_TRIGGERS OFF
GO

ALTER DATABASE [SalesPartition] SET ENABLE_BROKER
GO

ALTER DATABASE [SalesPartition] SET AUTO_UPDATE_STATISTICS_ASYNC OFF
GO

ALTER DATABASE [SalesPartition] SET DATE_CORRELATION_OPTIMIZATION OFF
GO

ALTER DATABASE [SalesPartition] SET TRUSTWORTHY OFF
GO

ALTER DATABASE [SalesPartition] SET ALLOW_SNAPSHOT_ISOLATION OFF
GO

ALTER DATABASE [SalesPartition] SET PARAMETERIZATION SIMPLE
GO

ALTER DATABASE [SalesPartition] SET READ_COMMITTED_SNAPSHOT OFF
GO

ALTER DATABASE [SalesPartition] SET HONOR_BROKER_PRIORITY OFF
GO

ALTER DATABASE [SalesPartition] SET RECOVERY SIMPLE
GO

ALTER DATABASE [SalesPartition] SET MULTI_USER
GO

ALTER DATABASE [SalesPartition] SET PAGE_VERIFY CHECKSUM
GO

ALTER DATABASE [SalesPartition] SET DB_CHAINING OFF
GO

ALTER DATABASE [SalesPartition] SET FILESTREAM( NON_TRANSACTED_ACCESS = OFF )
GO

ALTER DATABASE [SalesPartition] SET TARGET_RECOVERY_TIME = 0 SECONDS
GO

ALTER DATABASE [SalesPartition] SET DELAYED_DURABILITY = DISABLED
GO

ALTER DATABASE [SalesPartition] SET READ_WRITE
GO

Create a partition function

Microsoft defined a partition function “A database object that defines how the rows of a table or index are mapped to a set of partitions based on the values of certain column, called a partitioning column. That is, the partition function defines the number of partitions that the table will have and how the boundaries of the partitions are defined. For example, given a table that contains sales order data, you may want to partition the table into twelve (monthly) partitions based on a datetime column such as a sales date.

Syntax

CREATE PARTITION FUNCTION <function name>(data type of partitioning column) AS RANGE LEFT | RIGHT FOR VALUES (list of values)

All data types are valid except  except text, ntext, image, xml, timestamp, varchar(max), nvarchar(max), varbinary(max), alias data types, or CLR user-defined data types.

In my example, I determined to use a date time column

IF NOT EXISTS (SELECT * FROM sys.partition_functions WHERE name = N’SalesRetailRange’)
CREATE PARTITION FUNCTION [SalesRetailRange](datetime) AS RANGE RIGHT FOR VALUES ( N’2010-01-01′,

N’2011-01-01′,

N’2012-01-01′,

N’2013-01-01′,

N’2014-01-01′)

It means that I want to have the list of subset data as below. There are 6 partitions

 

What is difference between LEFT and RIGHT?  Let me give you an example

CREATE PARTITION FUNCTION [Testing1](INT) AS RANGE LEFT FOR VALUES(10,20) It means there are 3 partitions

CREATE PARTITION FUNCTION [Testing1](INT) AS RANGE RIGHT FOR VALUES(10,20) It means there are also 3 partitions but values are different on every partition

 

Create a partition scheme

A database object that maps the partitions of a partition function to a set of filegroups. The primary reason for placing your partitions on separate filegroups is to make sure that you can independently perform backup operations on partitions. This is because you can perform backups on individual filegroups.

IF NOT EXISTS (SELECT * FROM sys.partition_schemes WHERE name = N’SalesRetailScheme’)
CREATE PARTITION SCHEME [SalesRetailScheme] AS PARTITION [SalesRetailRange] TO ([Sales2009], [Sales2010], [Sales2011], [Sales2012], [Sales2013], [Sales2014], [Sales2015])

Create a partitioned table

IF OBJECT_ID(‘Sales’) IS NOT NULL
DROP TABLE Sales

GO
CREATE TABLE Sales
(
SalesOrderID int NOT NULL IDENTITY (1, 1) NOT FOR REPLICATION,
OrderDate datetime NOT NULL, — Partion column
DueDate datetime NULL,
ShipDate datetime NULL,
Status tinyint NULL,
SalesOrderNumber nvarchar(50),
CustomerID nvarchar(50),
SalesPersonID nvarchar(50),
TerritoryID nvarchar(50),
BillToAddressID nvarchar(50),
ShipToAddressID nvarchar(50),
ShipMethodID nvarchar(50),
CreditCardID nvarchar(50),
CreditCardApprovalCode nvarchar(15) NULL,
CurrencyRateID nvarchar(50),
SubTotal nvarchar(50),
TaxAmt nvarchar(50),
Freight nvarchar(50),
TotalDue nvarchar(50),
Comment nvarchar(128) NULL,
CarrierTrackingNumber nvarchar(25) NULL,
OrderQty nvarchar(50) NULL,
UnitPrice nvarchar(50) NULL,
UnitPriceDiscount nvarchar(50) NULL,
LineTotal nvarchar(50),
SnapshotDate Datetime
) ON SalesRetailScheme([OrderDate])
GO

The difference is to identify which FILEGROUP. Normally, you must identity which FILEGROUP when you create a table such PRIMARY filegroup. In case of partitioning, we must identify the partition scheme, it will help use control how data will be partitioning?

Import data

This step is to import data into Sales table that is the partitioned table. Please contact me to get source data

Query partitioned table

After inserting data into Sales tables, let’s check the partitions to see how data is partitioned

SELECT o.name objectname, partition_id, partition_number, [rows]
FROM sys.partitions p
INNER JOIN sys.objects o ON o.object_id=p.object_id
WHERE o.name LIKE ‘%Sales%’

Benefit of partitioning and limitation

Click here to understand how a partitioned table improve performance and what limitation?

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s