sql server 自动每日表分区 [英] sql server automated daily table partitioning

查看:49
本文介绍了sql server 自动每日表分区的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要重新设计一个包含每日日志并有数亿行的表.这个概念是将数据仅保留一个月.该表将每天分区,使用日期时间字段作为分区键.例如,对于三月份,我需要有 31 个文件组和 31 个分区.进入 4 月后,该进程必须插入​​ 4 月 1 日的分区并删除 3 月 1 日的数据和文件组.

I need to re design a table which holds daily logs and has hundreds of million rows. The concept is to keep data only for a one month period. The table will be partitioned per day, using a datetime field as the partition key. So for example for March i need to have 31 filegroups and 31 partitions. Once we get into April the process will have to insert into April 1st's partition and delete March 1st data and filegroup.

我对创建分区和文件组很满意,我正在寻找有关如何创建自动删除过程的建议.也许一个 sql 作业将使用 dateadd(m,-1,getdate()) 删除分区?我还有什么遗漏吗?

I am comfortable with the partition and filegroup creation, i am looking for suggestions on how to create the automated deletion process. Maybe an sql job which will drop the partition using dateadd(m,-1,getdate())? Is there something else i am missing?

附加问题:我将在 31-12-2020 之前创建 year_month_day_partitions 和相应的文件组.有没有办法自动创建这些脚本,或者我们必须在适当的时候手动运行脚本?

Additional question: I will create year_month_day_partitions and the corresponding filegroups until 31-12-2020. Is there any way to create those automaticalluy or do we have to run the scripts manually when it is time?

推荐答案

滑动窗口分区维护通常使用计划脚本或存储过程(SQL Server 代理作业或其他计划系统)来完成.应该计划维护以避免在 SPLITMERGE 期间进行昂贵的数据移动,因为这需要大约 4 倍于正常 DML 操作的日志记录.为此,在MERGE之前确保包含边界值的分区为空,并且SPLIT时没有行大于指定的边界.我建议创建一些额外的未来分区作为缓冲区,以避免数据移动,以防维护未按计划运行.

Sliding window partition maintenance is usually done using a scheduled script or stored procedure (SQL Server Agent job or other scheduling system). Maintenance should be planned as to avoid costly data movement during SPLIT and MERGE because that requires about 4 times the logging as normal DML operations. To this end, make sure the partition that includes the boundary value is empty before MERGE and no rows are greater than the specified boundary when SPLIT. I suggest creating a few extra future partitions as a buffer to avoid data movement in case the maintenance isn't run as planned.

以下是每日滑动窗口维护脚本示例.由于您使用的是 SQL Server 2005 并且分区级 TRUNCATE 已在 SQL Server 2016 中引入,因此它使用类似的分区临时表进行清除.请注意,SQL Server 2005 已不受支持.

Below is an example daily sliding window maintenance script. This uses a similarly partitioned staging table for the purge since you are using SQL Server 2005 and partition-level TRUNCATE was introduced in SQL Server 2016. Note that SQL Server 2005 is out of support.

我从您的评论中看到,您认为每个分区一个单独的文件组/文件可能有助于删除分区,但事实并非如此.此示例对所有分区使用单个文件组.

I see from your comment that you believe a separate filegroup/file per partition may be useful to drop partitions but that is not the case. This example uses a single filegroup for all partitions.

--example setup
CREATE PARTITION FUNCTION PF_Date (datetime) AS
    RANGE RIGHT FOR VALUES();
CREATE PARTITION SCHEME PS_LogTable AS
    PARTITION PF_Date ALL TO ([PRIMARY]);
DECLARE @PartitionBoundaryDate datetime = DATEADD(day, -31, DATEADD(day, DATEDIFF(day, '', GETDATE()), ''));

WHILE @PartitionBoundaryDate < DATEADD(day, 1, GETDATE())
BEGIN
    ALTER PARTITION SCHEME PS_LogTable NEXT USED [PRIMARY];
    ALTER PARTITION FUNCTION PF_Date() SPLIT RANGE(@PartitionBoundaryDate);
    SET @PartitionBoundaryDate = DATEADD(day, 1, @PartitionBoundaryDate)
END;
CREATE TABLE dbo.LogTable(DateColumn datetime INDEX cdx CLUSTERED) ON PS_LogTable(DateColumn);
CREATE TABLE dbo.LogTable_Staging(DateColumn datetime INDEX cdx CLUSTERED) ON PS_LogTable(DateColumn);
GO

--example partition maintenance scheduled nightly after midnight
BEGIN TRY
    SET NOCOUNT ON;
    SET XACT_ABORT ON;
    DECLARE @RetentionDays int = 31;
    DECLARE @FutureDays int = 7;
    DECLARE @OldestRetainedDate datetime = DATEADD(day, -@RetentionDays, DATEADD(day, DATEDIFF(day, '', GETDATE()), ''));
    DECLARE @LatestRetainedDate datetime = DATEADD(day, DATEDIFF(day, '', GETDATE()), '');
    DECLARE @LatestFutureBoundaryDate datetime = DATEADD(day, @FutureDays, @LatestRetainedDate);
    DECLARE @PartitionBoundaryDate datetime;
    DECLARE @Message nvarchar(2048);

    --make sure staging table is empty
    TRUNCATE TABLE dbo.LogTable_Staging;

    BEGIN TRAN;
    --aquire exclusive table lock to avoid deadlocking during maintenance
    SELECT TOP(0) @PartitionBoundaryDate = DateColumn FROM dbo.LogTable WITH(TABLOCKX);

    --purge partition 1 in case data older than the first boundary was inserted
    SET @Message = 'Purging partition 1';
    PRINT @Message;
    ALTER TABLE dbo.LogTable SWITCH
        PARTITION 1 TO
        dbo.LogTable_Staging PARTITION 1;
    TRUNCATE TABLE dbo.LogTable_Staging;

    --purge and remove expired partitions
    DECLARE @PartitionBoundaries TABLE(PartitionBoundaryDate datetime NOT NULL PRIMARY KEY);
    INSERT INTO @PartitionBoundaries(PartitionBoundaryDate)
        SELECT CAST(prv.value AS datetime)
        FROM sys.partition_functions AS pf
        JOIN sys.partition_range_values AS prv ON prv.function_id = pf.function_id
        WHERE
            pf.name = N'PF_Date'
            AND CAST(prv.value AS datetime) < @OldestRetainedDate;
    DECLARE ExpiredPartitionBoundaries CURSOR LOCAL FAST_FORWARD FOR
        SELECT PartitionBoundaryDate
        FROM @PartitionBoundaries;

    OPEN ExpiredPartitionBoundaries;
    WHILE 1 = 1
    BEGIN
        FETCH NEXT FROM ExpiredPartitionBoundaries INTO @PartitionBoundaryDate;
        IF @@FETCH_STATUS = -1 BREAK;
        SET @Message = 'Purging data for ' + CONVERT(char(10), @PartitionBoundaryDate, 120);
        PRINT @Message;
        ALTER TABLE dbo.LogTable SWITCH
            PARTITION $PARTITION.PF_Date(@PartitionBoundaryDate) TO
            dbo.LogTable_Staging PARTITION $PARTITION.PF_Date(@PartitionBoundaryDate);
        TRUNCATE TABLE dbo.LogTable_Staging;
        ALTER PARTITION FUNCTION PF_Date() MERGE RANGE(@PartitionBoundaryDate);
    END;
    CLOSE ExpiredPartitionBoundaries;
    DEALLOCATE ExpiredPartitionBoundaries;

    --create partitions for future days
    SET @PartitionBoundaryDate = DATEADD(day, 1, @LatestRetainedDate);
    WHILE @PartitionBoundaryDate < = @LatestFutureBoundaryDate
    BEGIN
        IF NOT EXISTS(SELECT 1
            FROM sys.partition_functions AS pf
            JOIN sys.partition_range_values AS prv ON prv.function_id = pf.function_id
            WHERE
                pf.name = N'PF_Date'
                AND CAST(prv.value AS datetime) = @PartitionBoundaryDate
        )
        BEGIN
            SET @Message = 'Creating partition for ' + CONVERT(char(10), @PartitionBoundaryDate, 120);
            PRINT @Message;
            ALTER PARTITION SCHEME PS_LogTable NEXT USED [PRIMARY];
            ALTER PARTITION FUNCTION PF_Date() SPLIT RANGE(@PartitionBoundaryDate);
        END;
        SET @PartitionBoundaryDate = DATEADD(day, 1, @PartitionBoundaryDate);
    END;

    COMMIT;
END TRY
BEGIN CATCH

    IF @@TRANCOUNT > 0 ROLLBACK;

    --better to use THROW in SQL 2012 and later
    DECLARE
         @ErrorNumber int
        ,@ErrorMessage nvarchar(2048)
        ,@ErrorSeverity int
        ,@ErrorState int
        ,@ErrorLine int;

    SELECT
        @ErrorNumber =ERROR_NUMBER()
        ,@ErrorMessage =ERROR_MESSAGE()
        ,@ErrorSeverity = ERROR_SEVERITY()
        ,@ErrorState =ERROR_STATE()
        ,@ErrorLine =ERROR_LINE();

    RAISERROR('Error %d caught at line %d: %s'
        ,@ErrorSeverity
        ,@ErrorState
        ,@ErrorNumber
        ,@ErrorLine
        ,@ErrorMessage);

END CATCH;
GO

这篇关于sql server 自动每日表分区的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆