加入两个SCD Type2表,其中1:M关系与复合连接和& SSAS表格模型中的过滤条件 [英] Joining two SCD Type2 tables with 1:M relation with complex join & filter condition in SSAS Tabular model

查看:116
本文介绍了加入两个SCD Type2表,其中1:M关系与复合连接和& SSAS表格模型中的过滤条件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述


假设我在下面有两个表Department和Employee,我在那里存储不同客户(租户)的数据,并且这两个表之间存在一对多(1:M)的关系表(即一个部门可以有一个或多个员工)



但是,假设这两个表都是SCD Type 2,即存储有效和终止日期的历史记录。在数据库级别上没有在这些表上创建约束,索引等。

  Department表:
cust_id dept_id dept_name efctv_dt trmntn_dt Dept_key
1001 D1 IT 12-01 -2018 12-31-9999 1001D1
1001 D2 HR 01-01-2019 12- 31-9999 1001D2
1002 D3 Admin 02-01-2019 02-28-2019 1002D3
1002 D3 HR + Admin 03-01-2019 12-31-9999 1002D3
1002 D4 Finance 02 -01-2019 12-31-9999 1002D4

员工表:
cust_id emp_id emp_name dept_id efctv_dt trmntn_dt Emp_key
1001 E1 XYZ D1 01-01-2019 01-31-2019 1001D1
1001 E1 XYZ-A D1 02-01-2019 12-31-9999 1001D1
1001 E2 ABC D2 02-01-2019 12-31-9999 1001D2
1002 E3 AXBYCZ D3 03 -01-2019 03-31-2019 1002D3
1002 E3 AXBYCZ D4 04-01-2019 12-31-9999 1002D4
1002 E4 DEFG D4 04-01-2019 12-31-9999 1002D4



列cust_id& dept_id可以作为关键字段在两个表中作为单独的列连接在一起,并用作两个表之间的连接。


< p style ="padding-right:0px;行高:继承; font-family:Arial,'Helvetica Neue',Helvetica,sans-serif;字体大小:15像素;垂直对齐:基线;明确:两者;颜色:#242729">
Department Key = Concatenate(department [cust_id],department [dept_id])



员工密钥=连接(员工[cust_id],员工[dept_id])



示例键输出值= 1001D1,1001D2,1002D3,1002D4



现在让我们说我们有以下报告要求,即



过滤日期范围(在可视化中) - 假设有另一个日期维度表,其中包含所有日期和层次结构



1)未选择特定日期范围或过滤器时 - 显示所有当前活动的员工和部门名称(其中,trmntn_dt = 12-31-9999)

因此,预期输出为:



Emp Name Dept Name

XYZ-A IT

ABC HR

AXBYCZ财务



2)报告特定月份示例时 - 2019年1月 - 显示截至该月的所有员工和部门名称。因此,预期输出为:



Emp Name Dept Name

XYZ IT



3)报告特定季度示例 - Q1-2019时 - 显示截至该季度的所有员工和部门名称。因此,预期输出为:



Emp Name Dept Name

XYZ-A IT

ABC HR

AXBYCZ HR + Admin



但是,AS表格模型中的连接条件与这两个表之间的1:M关系会失败,因为在Department表中的行不是唯一的(D3的行),这是在关系的一方。



如果在两个表中的连接连接条件中也包括efctv_dt或trmntn_dt作为加入的键,即


Department Key = Concatenate(department [cust_id],department [dept_id])& Concatenate(department [efctv_dt],""))



员工密钥=连接(员工[cust_id],员工[dept_id])& Concatenate(员工[efctv_dt],""))



示例键输出值= 1001D112-01-2018,1001D201-01-2019 ...



但是,虽然现在这些行是唯一的,因为我们不希望在同一天两次同一行(除非某些ETL问题,如进程在同一天运行两次等)。



AS表格模型不允许创建复杂的连接/条件(如在SAP BO Universe中),这样我们可以在加入这两个SCD类型2表时添加以下条件,如下所示可能有帮助解决一些要求 - 如下面的SQL的Where
子句:

  dept.cust_id = emp.cust_id 
and dept.dept_id = emp.dept_id
And(calendar_date介于efctv_dt和trmntn_dt

trmntn_dt =' 12-31-9999'



我认为创建/计算任何测量值仍然可以使用DAX上的大量在线实例来过滤日期,但是只有尺寸属性呢?



这是正确的方法吗?如何处理这些?并且,也没有使用代理键在每个SCD类型2表中生成唯一值并将其作为FK /引用键从父级引用 - > Child(1:M)ie Dept-> Emp table。



请建议。



谢谢!

解决方案



这是正确的方法吗?如何处理这些?并且,也没有使用代理键在每个SCD类型2表中生成唯一值并将其作为FK /引用键从父级引用 - >儿童(1:M)即Dept-> Emp表。


您还没有解释为什么您不想使用代理键,但是使用它们绝对是你所概述的方案的正确方法。你可以通过将部门和员工一起折叠成一个对象
使用视图来摆脱一层连接,但你仍然会面临多列连接回到事实表。使用代理键是SCD2维度的最佳方法,特别是对于表格模型。


Assuming, I have below two tables Department and Employee where I am storing data for different customers(tenants) and there’s one-to-many (1:M) relationship between these two tables (i.e. one department can have 1 or more employees)

However, let’s say both the tables are SCD Type 2 i.e. storing history with effective and termination dates. There are no constraints, indexes etc. created on these tables at database level.

Department table:
cust_id dept_id dept_name   efctv_dt    trmntn_dt   Dept_key
1001    D1      IT        12-01-2018    12-31-9999  1001D1
1001    D2      HR        01-01-2019    12-31-9999  1001D2
1002    D3      Admin     02-01-2019    02-28-2019  1002D3
1002    D3      HR+Admin  03-01-2019    12-31-9999  1002D3
1002    D4      Finance   02-01-2019    12-31-9999  1002D4

Employee table:
cust_id emp_id  emp_name    dept_id efctv_dt    trmntn_dt   Emp_key
1001    E1      XYZ          D1    01-01-2019   01-31-2019  1001D1
1001    E1      XYZ-A        D1    02-01-2019   12-31-9999  1001D1
1001    E2      ABC          D2    02-01-2019   12-31-9999  1001D2
1002    E3      AXBYCZ       D3    03-01-2019   03-31-2019  1002D3
1002    E3      AXBYCZ       D4    04-01-2019   12-31-9999  1002D4
1002    E4      DEFG         D4    04-01-2019   12-31-9999  1002D4

Columns cust_id & dept_id can be concatenated together as a separate column in both the tables as a key field and used as a join between both the tables.

Department Key=Concatenate(department[cust_id], department[dept_id] )

Employee Key=Concatenate(employee[cust_id], employee[dept_id] )

Example key output values= 1001D1, 1001D2, 1002D3, 1002D4

Now let’s say we have following reporting requirements, i.e.

To filter on Date Ranges (in visualization) - assuming there's another date dimension table with all dates & hierarchy

1) When no specific date range or filter selected - show all current active employee & departments names (where, trmntn_dt = 12-31-9999)
So, expected output is:

Emp Name Dept Name
XYZ-A IT
ABC HR
AXBYCZ Finance

2) When reporting for a specific month example - Jan-2019 - show all employees & department names active as of that month. So, expected output is:

Emp Name Dept Name
XYZ IT

3) When reporting for a specific Quarter example - Q1-2019 - show all employees & department names active as of that quarter. So, expected output is:

Emp Name Dept Name
XYZ-A IT
ABC HR
AXBYCZ HR+Admin

However, the join condition in AS Tabular model with 1:M relationship between these two tables would fail because the rows are not unique in the Department table (rows for D3), which is on the one side of the relationship.

If you include efctv_dt or trmntn_dt also in the concatenated join condition in both the tables as key for joining i.e.

Department Key=Concatenate(department[cust_id], department[dept_id] ) & Concatenate(department[efctv_dt],""))

Employee Key=Concatenate(employee[cust_id], employee[dept_id] ) & Concatenate(employee[efctv_dt],""))

Example key output values= 1001D112-01-2018, 1001D201-01-2019…

However, though now the rows would be unique since we don’t expect same row twice on the same day (unless some ETL issues like process ran twice on the same day etc.)

AS tabular model doesn’t allow to create complex join/condition (like in SAP BO Universe) so that we could add below condition when joining these two SCD type 2 tables as below which might help solve some of the requirements - something like below in the Where clause of the SQL:

    dept.cust_id = emp.cust_id
And dept.dept_id = emp.dept_id
And ( calendar_date is between efctv_dt and trmntn_dt 
        Or 
       trmntn_dt = ’12-31-9999’
    )

I think for creating/calculating any measure value is still doable with lots of examples available online on DAX to filter on the dates, but what about with just the dimensional attributes ?

Is this the right approach ? How to handle these? And, that too without using surrogate keys for generating unique values in each SCD type 2 tables and referencing it as FK/reference key from Parent->Child (1:M) i.e. Dept->Emp table.

Please suggest.

Thanks!

解决方案

Is this the right approach ? How to handle these? And, that too without using surrogate keys for generating unique values in each SCD type 2 tables and referencing it as FK/reference key from Parent->Child (1:M) i.e. Dept->Emp table.

You have not explained why you don't want to use surrogate keys, but using them is definitely the correct approach for the scenario you have outlined. You could get rid of one layer of joins by folding the department and employee together into a single object using a view, but you would still be faced with a multi-column join back to your fact table. Using surrogate keys is far and away the best approach for SCD2 dimensions, particularly with tabular models.


这篇关于加入两个SCD Type2表,其中1:M关系与复合连接和&amp; SSAS表格模型中的过滤条件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆