列存储索引 - 段创建 [英] Columnstore indexes - segment creation

查看:85
本文介绍了列存储索引 - 段创建的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

对细分和词典有一些疑问。

Have some questions regarding segments and dictionaries.

1。当我为具有17496061条记录的事实表创建列存储索引时,为每列创建了18个段。段0到15具有相同的行数。但是段16和17具有不同的数字或行,段16的行比段17少了b $ b。这种行为的指针是什么?

1. When I created a columnstore index for a fact table with 17496061 records, 18 segments were created for each column. Segments 0 to 15 were having the same number of rows. However segments 16 and 17 are having different number or rows with segment 16 having less rows than segment 17. Any pointers for this behaviour?

2。列的行如何分布在每个段中?
中的最小和最大ids列 sys column_store_segments
表似乎有某种模式用于排列。

2. How the rows of the column are distributed in each of the segments? The min and max ids columns in the sys.column_store_segments table seems to have some kind of pattern for arrangement.

3. 怎么样e在查询执行期间访问的列的字典中的唯一
值。
中的数据指针列
sys column_store_dictionaries
表格指定?

3. How are the unique values in the dictionary for a column accessed during query execution. What does the data pointer column in the sys.column_store_dictionaries table specify?

推荐答案

段在多个线程上并行构建。当线程到达表的末尾时,它们可能没有完整的段值,因此它们可以构建一个小于完整段的段。

Segments are built in parallel on multiple threads. When threads hit the end of the table they might not have a full segment worth so they could build a segment that is smaller than the full ones.

如果表有一个聚簇B树,columnstore index build使用与B-tree相同的顺序来排序进入段的行。这样可以在查询执行期间使用分段最小值和最大值来改进分段消除,例如根据日期排序,
。如果输入数据是堆,则不会维持特定的顺序。

If the table has a clustered B-tree, columnstore index build uses the same order as the B-tree to order rows going into the segments. This is good to allow improved segment elimination using segment min and max values during query execution, say based on ordering on date. If the input data is a heap, then no particular order is maintained.

您的问题3是关于与调整无关的内部细节,所以我不打算回答它,其他而不是说在执行期间字典被固定在内存中,并且数据值ID用于查找字典以根据需要获得实际值

Your question 3 is about an internal detail that's not relevant for tuning so I'm not going to answer it, other than to say that dictionaries are pinned in memory during execution and data value IDs are used to look up into the dictionary to get actual values as needed.


这篇关于列存储索引 - 段创建的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆