分区以提高 SQL Server 2008 中的查询性能 [英] Partitioning for query performance in SQL Server 2008

查看:35
本文介绍了分区以提高 SQL Server 2008 中的查询性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个场景,其中有大量关于项目的状态数据.物品的状态每分钟都在更新,在不久的将来会有大约50,000个物品.这样,在一个月内,将有大约 2,232,000,000 行数据.在归档旧数据之前,我必须在主表中保留至少 3 个月.

I have a scenario in which there's a huge amount of status data about an item. The item's status is updated from minute to minute, and there will be about 50,000 items in the near future. So that, in one month, there will be about 2,232,000,000 rows of data. I must keep at least 3 months in the main table, before archieving older data.

我必须计划实现基于特定项目(其 ID)和数据范围(通常最多一个月范围)的快速查询 - 例如从表中选择 A、B、C,其中 ItemID = 3000 并且日期在 '2010-10-01' 和 '2010-10-31 23:59:59.999' 之间

I must plan to achieve quick queries, based on a specific item (its ID) and a data range (usually, up to one month range) - e.g. select A, B, C from Table where ItemID = 3000 and Date between '2010-10-01' and '2010-10-31 23:59:59.999'

所以我的问题是如何设计一个分区结构来实现这一点?

So my question is how to design a partitioning structure to achieve that?

目前,我根据项目的唯一标识符"(一个整数)mod分区数"进行分区,以便所有分区均等分布.但是它的缺点是在表上保留一个额外的列作为分区函数的分区列,因此,将行映射到它的分区.所有这些都增加了一点额外的存储空间.此外,每个分区都映射到不同的文件组.

Currently, I'm partitioning based on the "item's unique identifier" (an int) mod "the number of partitions", so that all partitions are equally distributed. But it has the drawback of keeping one additional column on the table to act as the partition column to the partition function, therefore, mapping the row to its partition. All that add a little bit of extra storage. Also, each partition is mapped to a different filegroup.

推荐答案

分区从来都不是为了查询性能.使用分区后,性能总是更差,您所能希望的最好的情况是没有大的回归,但永远不会有所改善.

Partitioning is never done for query performance. With partitioning the performance will always be worse, the best you can hope for is no big regression, but never improvement.

对于查询性能,分区可以做的任何事情,索引都可以做得更好,这应该是你的答案:适当的索引.

For query performance, anything a partition can do, and index can do better, and that should be your answer: index appropriately.

分区对于 IO 路径控制情况(分布在存档/当前卷上)或 ETL 负载中的快速切入切出场景很有用.所以我理解如果你有一个滑动窗口并按日期分区,这样你就可以快速切换出不再需要保留的数据.

Partitioning is useful for IO path control cases (distribute on archive/current volumes) or for fast switch-in switch-out scenarios in ETL loads. So I would understand if you had a sliding window and partition by date so you can quickly switch out the data that is no longer needed to be retained.

分区的另一个狭窄情况是最后一页插入闩锁争用,如解决高并发 INSERT 工作负载上的 PAGELATCH 争用.

Another narrow case for partitioning is last page insert latch contention, like described in Resolving PAGELATCH Contention on Highly Concurrent INSERT Workloads.

您的分区方案和用例似乎不适合它会受益的任何场景(也许是最后一个场景,但从描述中不清楚),所以很可能会受到伤害性能.

Your partition scheme and use case does not seem to fit any of the scenarios in which it would benefit (maybe is the last scenario, but is not clear from description), so most likely it hurts performance.

这篇关于分区以提高 SQL Server 2008 中的查询性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆