SSAS表格模型中的分区 [英] Partitioning in SSAS tabular Model

查看:155
本文介绍了SSAS表格模型中的分区的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们使用SSAS 2016建立了表格模型。


模型就像单个事实[大约300个字段]和30个维度[100个字段]。我们使用视图从每个表中获取表格模型的数据。


事实上保存的数据接近350 GB [约5000万个数据]。在表格模型中,我们通过事实视图派生了大约400个字段的额外列。我们没有在SSAS中使用任何派生列。


总体而言我们可以说表格模型中的800个字段。我们有大约100 GB的RAM。表格模型DB文件大小在处理后在物理驱动器中显示为4 GB。


我几乎没有问题。


1)通过SSAS表格导出列模特是最好的 或者通过sql查询[在视图中]派生列是最好的?


2)我们计划应用分区。分区是否有助于调整表格数据加载?或者当用户将查询命名为表格模型时向用户提供快速结果会有所帮助。


3)有多少并行用户可以访问此表格模型而没有内存问题?如果50个用户访问它,我们会得到任何内存问题吗?


4)表格模型db文件[在我的情况下是4 GB文件]是否在物理位置保存压缩格式的所有数据? / p>

请澄清。

解决方案


1)派生列通过SSAS表格模型是最好的 或者通过SQL查询[在视图中]派生列是最好的?


如果您打算进行分区,那么您只需要处理最近的分区,然后通过SQL视图派生列更好。这是因为每次
进程时,计算列都会重新计算表中每个分区的所有行。如果视图中的列将直接加载到正在处理的分区中,则其他分区不会受到影响。


2)我们计划应用分区。分区是否有助于调整表格数据加载?或者在用户查询表格模型时向用户提供快速结果会有所帮助。


分区仅影响数据加载。它对查询速度几乎没有影响,除非你创建了很多小分区,在这种情况下它实际上会损害性能


3)有多少并行用户可以访问此表格模型没有内存问题?如果50个用户访问它,我们会得到任何内存问题吗?


很难说它取决于如何优化您的模型以及运行什么类型的查询。对于具有100Gb RAM的服务器上的4Gb型号,50个用户应该不是问题。但与此同时,如果有人试图在所有50M记录和数百列中进行详细的数据提取,那么服务器必须尝试在将内存中的整个结果集发送到客户端之前实现,这可能需要超过100GB内存。


4)表格模型db文件[在我的情况下是4 GB文件]是否在物理位置保存压缩格式的所有数据?


如果您正在谈论.db 文件夹下的数据目录,然后是,这是所有压缩数据。服务器始终将所有这些数据保存在内存中,磁盘上的文件夹仅在因任何原因重新启动表格服务时使用。
此时服务器会将模型从.db文件夹重新加载到内存中。





We built tabular model using SSAS 2016.

The model is like single fact[around 300 fields] and 30 dimensions[100 fields]. we used views to fetch data from each table for tabular model.

Fact holds data nearly 350 GB[around 50 million data]. In tabular model we derived extra columns around 400 fields through fact view. We are haven't used any derived column in SSAS.

overall we can say 800 fields in tabular model. We have around 100 GB RAM. Tabular model DB file size showing as 4 GB in physical drive after processing it.

I have few questions.

1) Deriving column through SSAS tabular model is best  or Deriving column through sql query[in view] is best ?

2) We are planning to apply partitioning. Will the partitioning helps to tune tabular data load ? or will it be helpful to give fast result to the users when users hit the queries to tabular model.

3) How many parallel users can access this tabular model without memory issue? if 50 users access it will we get any memory issue?

4) Is Tabular model db file[4 GB file in my case] holds all data in compressed format in the physical location ?

Please clarify.

解决方案

1) Deriving column through SSAS tabular model is best  or Deriving column through sql query[in view] is best ?

If you are planning on doing partitioning so that you only have to process recent partitions, then deriving columns through a SQL view is better. This is because calculated columns re-calculate over all rows in every partition in the table every time you process. Where as columns from a view will be loaded directly into the partition being processed, other partitions will not be affected.

2) We are planning to apply partitioning. Will the partitioning helps to tune tabular data load ? or will it be helpful to give fast result to the users when users hit the queries to tabular model.

Partitioning only affects the data load. It has little impact on query speed unless you create lots of small partitions in which case it can actually hurt performance

3) How many parallel users can access this tabular model without memory issue? if 50 users access it will we get any memory issue?

It's really hard to say it depends on how well optimized your model is and what sort of queries are being run. 50 users should not be a problem for a 4Gb model on a server with 100Gb RAM. But at the same time if someone tries to do a detailed data extraction of all 50M records and hundreds of columns the server will have to try to materialize the entire resultset in memory before sending it to the client which could require more than 100GB of memory.

4) Is Tabular model db file[4 GB file in my case] holds all data in compressed format in the physical location ?

if you are talking about the .db folder under the data directory, then yes this is all the compressed data. The server always holds all this data in memory, the folder on disk is only used if the Tabular service is restarted for any reason. At which point the server will reload the model into memory from the .db folder.


这篇关于SSAS表格模型中的分区的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆