sql server 自动每日表分区 [英] sql server automated daily table partitioning
问题描述
我需要重新设计一个包含每日日志并有数亿行的表.这个概念是将数据仅保留一个月.该表将每天分区,使用日期时间字段作为分区键.例如,对于三月份,我需要有 31 个文件组和 31 个分区.进入 4 月后,该进程必须插入 4 月 1 日的分区并删除 3 月 1 日的数据和文件组.
I need to re design a table which holds daily logs and has hundreds of million rows. The concept is to keep data only for a one month period. The table will be partitioned per day, using a datetime field as the partition key. So for example for March i need to have 31 filegroups and 31 partitions. Once we get into April the process will have to insert into April 1st's partition and delete March 1st data and filegroup.
我对创建分区和文件组很满意,我正在寻找有关如何创建自动删除过程的建议.也许一个 sql 作业将使用 dateadd(m,-1,getdate())
删除分区?我还有什么遗漏吗?
I am comfortable with the partition and filegroup creation, i am looking for suggestions on how to create the automated deletion process. Maybe an sql job which will drop the partition using dateadd(m,-1,getdate())
? Is there something else i am missing?
附加问题:我将在 31-12-2020 之前创建 year_month_day_partitions 和相应的文件组.有没有办法自动创建这些脚本,或者我们必须在适当的时候手动运行脚本?
Additional question: I will create year_month_day_partitions and the corresponding filegroups until 31-12-2020. Is there any way to create those automaticalluy or do we have to run the scripts manually when it is time?
推荐答案
滑动窗口分区维护通常使用计划脚本或存储过程(SQL Server 代理作业或其他计划系统)来完成.应该计划维护以避免在 SPLIT
和 MERGE
期间进行昂贵的数据移动,因为这需要大约 4 倍于正常 DML 操作的日志记录.为此,在MERGE
之前确保包含边界值的分区为空,并且SPLIT
时没有行大于指定的边界.我建议创建一些额外的未来分区作为缓冲区,以避免数据移动,以防维护未按计划运行.
Sliding window partition maintenance is usually done using a scheduled script or stored procedure (SQL Server Agent job or other scheduling system). Maintenance should be planned as to avoid costly data movement during SPLIT
and MERGE
because that requires about 4 times the logging as normal DML operations. To this end, make sure the partition that includes the boundary value is empty before MERGE
and no rows are greater than the specified boundary when SPLIT
. I suggest creating a few extra future partitions as a buffer to avoid data movement in case the maintenance isn't run as planned.
以下是每日滑动窗口维护脚本示例.由于您使用的是 SQL Server 2005 并且分区级 TRUNCATE
已在 SQL Server 2016 中引入,因此它使用类似的分区临时表进行清除.请注意,SQL Server 2005 已不受支持.
Below is an example daily sliding window maintenance script. This uses a similarly partitioned staging table for the purge since you are using SQL Server 2005 and partition-level TRUNCATE
was introduced in SQL Server 2016. Note that SQL Server 2005 is out of support.
我从您的评论中看到,您认为每个分区一个单独的文件组/文件可能有助于删除分区,但事实并非如此.此示例对所有分区使用单个文件组.
I see from your comment that you believe a separate filegroup/file per partition may be useful to drop partitions but that is not the case. This example uses a single filegroup for all partitions.
--example setup
CREATE PARTITION FUNCTION PF_Date (datetime) AS
RANGE RIGHT FOR VALUES();
CREATE PARTITION SCHEME PS_LogTable AS
PARTITION PF_Date ALL TO ([PRIMARY]);
DECLARE @PartitionBoundaryDate datetime = DATEADD(day, -31, DATEADD(day, DATEDIFF(day, '', GETDATE()), ''));
WHILE @PartitionBoundaryDate < DATEADD(day, 1, GETDATE())
BEGIN
ALTER PARTITION SCHEME PS_LogTable NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION PF_Date() SPLIT RANGE(@PartitionBoundaryDate);
SET @PartitionBoundaryDate = DATEADD(day, 1, @PartitionBoundaryDate)
END;
CREATE TABLE dbo.LogTable(DateColumn datetime INDEX cdx CLUSTERED) ON PS_LogTable(DateColumn);
CREATE TABLE dbo.LogTable_Staging(DateColumn datetime INDEX cdx CLUSTERED) ON PS_LogTable(DateColumn);
GO
--example partition maintenance scheduled nightly after midnight
BEGIN TRY
SET NOCOUNT ON;
SET XACT_ABORT ON;
DECLARE @RetentionDays int = 31;
DECLARE @FutureDays int = 7;
DECLARE @OldestRetainedDate datetime = DATEADD(day, -@RetentionDays, DATEADD(day, DATEDIFF(day, '', GETDATE()), ''));
DECLARE @LatestRetainedDate datetime = DATEADD(day, DATEDIFF(day, '', GETDATE()), '');
DECLARE @LatestFutureBoundaryDate datetime = DATEADD(day, @FutureDays, @LatestRetainedDate);
DECLARE @PartitionBoundaryDate datetime;
DECLARE @Message nvarchar(2048);
--make sure staging table is empty
TRUNCATE TABLE dbo.LogTable_Staging;
BEGIN TRAN;
--aquire exclusive table lock to avoid deadlocking during maintenance
SELECT TOP(0) @PartitionBoundaryDate = DateColumn FROM dbo.LogTable WITH(TABLOCKX);
--purge partition 1 in case data older than the first boundary was inserted
SET @Message = 'Purging partition 1';
PRINT @Message;
ALTER TABLE dbo.LogTable SWITCH
PARTITION 1 TO
dbo.LogTable_Staging PARTITION 1;
TRUNCATE TABLE dbo.LogTable_Staging;
--purge and remove expired partitions
DECLARE @PartitionBoundaries TABLE(PartitionBoundaryDate datetime NOT NULL PRIMARY KEY);
INSERT INTO @PartitionBoundaries(PartitionBoundaryDate)
SELECT CAST(prv.value AS datetime)
FROM sys.partition_functions AS pf
JOIN sys.partition_range_values AS prv ON prv.function_id = pf.function_id
WHERE
pf.name = N'PF_Date'
AND CAST(prv.value AS datetime) < @OldestRetainedDate;
DECLARE ExpiredPartitionBoundaries CURSOR LOCAL FAST_FORWARD FOR
SELECT PartitionBoundaryDate
FROM @PartitionBoundaries;
OPEN ExpiredPartitionBoundaries;
WHILE 1 = 1
BEGIN
FETCH NEXT FROM ExpiredPartitionBoundaries INTO @PartitionBoundaryDate;
IF @@FETCH_STATUS = -1 BREAK;
SET @Message = 'Purging data for ' + CONVERT(char(10), @PartitionBoundaryDate, 120);
PRINT @Message;
ALTER TABLE dbo.LogTable SWITCH
PARTITION $PARTITION.PF_Date(@PartitionBoundaryDate) TO
dbo.LogTable_Staging PARTITION $PARTITION.PF_Date(@PartitionBoundaryDate);
TRUNCATE TABLE dbo.LogTable_Staging;
ALTER PARTITION FUNCTION PF_Date() MERGE RANGE(@PartitionBoundaryDate);
END;
CLOSE ExpiredPartitionBoundaries;
DEALLOCATE ExpiredPartitionBoundaries;
--create partitions for future days
SET @PartitionBoundaryDate = DATEADD(day, 1, @LatestRetainedDate);
WHILE @PartitionBoundaryDate < = @LatestFutureBoundaryDate
BEGIN
IF NOT EXISTS(SELECT 1
FROM sys.partition_functions AS pf
JOIN sys.partition_range_values AS prv ON prv.function_id = pf.function_id
WHERE
pf.name = N'PF_Date'
AND CAST(prv.value AS datetime) = @PartitionBoundaryDate
)
BEGIN
SET @Message = 'Creating partition for ' + CONVERT(char(10), @PartitionBoundaryDate, 120);
PRINT @Message;
ALTER PARTITION SCHEME PS_LogTable NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION PF_Date() SPLIT RANGE(@PartitionBoundaryDate);
END;
SET @PartitionBoundaryDate = DATEADD(day, 1, @PartitionBoundaryDate);
END;
COMMIT;
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0 ROLLBACK;
--better to use THROW in SQL 2012 and later
DECLARE
@ErrorNumber int
,@ErrorMessage nvarchar(2048)
,@ErrorSeverity int
,@ErrorState int
,@ErrorLine int;
SELECT
@ErrorNumber =ERROR_NUMBER()
,@ErrorMessage =ERROR_MESSAGE()
,@ErrorSeverity = ERROR_SEVERITY()
,@ErrorState =ERROR_STATE()
,@ErrorLine =ERROR_LINE();
RAISERROR('Error %d caught at line %d: %s'
,@ErrorSeverity
,@ErrorState
,@ErrorNumber
,@ErrorLine
,@ErrorMessage);
END CATCH;
GO
这篇关于sql server 自动每日表分区的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!