DAX度量:仅动态考虑每个组的优先级,然后整体返回其计数 [英] DAX Measure: dynamically consider only first for each group and and overall returns their count

查看:125
本文介绍了DAX度量:仅动态考虑每个组的优先级,然后整体返回其计数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们正在努力解决可能比我想象的简单的问题。我只是被困住了。



问题在于定义一个动态 DAX度量在每个患者期间,首次访问次数。在此期间不是第一的访问必须是分配的BLANK()值。
请在讨论目标之前让我介绍一下情况。


  1. 场景:

    我们有一个由一个事实表组成的模型( F_Visits )和两个维度( D_Customer ; D_Calendar )。
    F_Visits 有1M记录,其中包含客户对服装店的所有访问。它的字段是:

    访问ID :递增的唯一数字,它是表的自然键

    Customer Fk :事实表中维D_Customer的外键

    Calendar Fk :维D_Calendar中的维事实表

    数量:拜访时购买的布匹数量

    金额:访问中花费的金额

    卖方:为客户提供服务的员工姓名


  2. 目标:

    创建一个衡量标准:




    • 动态地(基于时间段)选择)

    • 每位客户的每次访问(如果该值是第一个,则返回值) 1 所选期间(该期间的最低访问ID)中的患者,
      忽略了不是每个时间段内每个客户在所选时间段内都不是首次访问的访问

    • 整体(以t otal)。此度量需要返回所有患者在选定时间段内进行的首次就诊总数


最近两周我一直在在线社区中进行搜索,但没有发现像我这样的问题。
类似(但不同)的是以下内容:



编辑::基于评论的版本,询问我们是否可以执行 COUNTROWS 版本而不是 DISTINCTCOUNT 。 VAR是相同的,所以我只显示 RETURN 之后的计数。

  ... 
返回
//计算与
中的一对患者访问对相关的访问ID // PatientFirstVisits
COUNTROWS(
INTERSECT(
GROUPBY('F_visits','F_visits'[客户FK],'F_visits'[访问ID]),
PatientFirstVisits


在这里,我们只是将VAR表和事实中的值相交。该相交是1行,对于明细行则为空,它是总数,而没有被 F_Visits [Visit Id]过滤。


We are struggling in trying to solve a problem that might simpler than I think. I am just stuck.

The problem consists of defining a DAX Measure that dynamically counts only first visits in the period for each patient. Visits that are not first in the period must be assigned BLANK() value. Please before discussing the goal let me introduce the scenario.

  1. Scenario:
    We have a model composed of one fact table (F_Visits) and two dimensions (D_Customer; D_Calendar). F_Visits has 1M records and contains all visits made by customers into the Clothing shop. Its fields are:
    Visit Id: Incremental unique number which is the natural key of the table
    Customer Fk: Foreign Key of dimension D_Customer in the fact table
    Calendar Fk: Foreign Key of dimension D_Calendar in the fact table
    Quantity: Number of cloth pieces purchased in the visit
    Amount: Amount of dollars spent in the visit
    Seller: Employee name who served the customer

  2. Goal:
    Create a Measure that:

    • dynamically (based on the time period selection)
    • for each visit within each customer return the value 1 if it is the first for that patient in the selected period (minimum Visit Id in the period), ignoring visits that are not the first in the selected time period for each customer
    • overall (in total) this measure needs to return count of total number of first visits made by all patients in the selected time period

I have been searching on online communities for the last two weeks but found no problem like mine. A similar (but different) one is the following: Sum distinct values for first occurance in Power BI The main difference compared to that question is the measure for this problem needs to be BLANK() for visits that are not the first of that patient in the selected period.
3. Example:

+----------+-------------+-------------+----------+--------+--------+
| Visit Id | Customer FK | Calendar FK | Quantity | Amount | Seller |
+----------+-------------+-------------+----------+--------+--------+
|        1 | John        |    20170101 |        1 |     10 | Rick   |
|        2 | John        |    20180101 |        2 |     15 | Morty  |
|        3 | John        |    20180101 |        3 |     17 | Eric   |
|        4 | John        |    20190101 |        2 |     17 | Eric   |
|        5 | Mark        |    20170101 |        1 |     17 | Eric   |
|        6 | Mark        |    20180101 |        3 |     12 | Eric   |
|        7 | Jack        |    20190101 |        0 |      0 | Rick   |
+----------+-------------+-------------+----------+--------+--------+

  1. Solution: The user now filter for Calendar FK = 2018. This will exclude some visit and the measure is dynamically calculated to count only first vists in the selected period for each patient:

Detail Table:

+----------+-------------+-------------+----------+--------+--------+---------+
| Visit Id | Customer FK | Calendar FK | Quantity | Amount | Seller | Measure |
+----------+-------------+-------------+----------+--------+--------+---------+
|        2 | John        |    20180101 |        2 |     15 | Morty  | 1       |
|        3 | John        |    20180101 |        3 |     17 | Eric   | BLANK() |
|        6 | Mark        |    20180101 |        3 |     12 | Eric   | 1       |
+----------+-------------+-------------+----------+--------+--------+---------+

Total:

+-------+
| Total |
+-------+
|     2 |
+-------+

Total by Seller:

+--------+---------+
| Seller | Measure |
+--------+---------+
| Morty  | 1       |
| Eric   | 1       |
| Rick   | BLANK() |
+--------+---------+

There is no specific requirements about the approach except it needs to be dynamically calculated on front end.

Update

please refer to question DAX: avoid ALL() causing incorrect show of dimensional attribute in matrix

解决方案

First Visit =
// Make a table of all patients in context and their first visit id in context
VAR PatientFirstVisits =
  ADDCOLUMNS (
    VALUES ( 'F_Visits'[Customer Fk] ),
    "MinVisitId", CALCULATE (
        MIN ( 'F_Visits'[Visit Id] ),
        ALL ( 'F_Visits'[Visit Id], 'F_Visits'[Seller] ), // note: fragile
        ALLSELECTED ( 'D_Calendar' )
    )
  )
RETURN
  // Count the Visit Ids that are associated with one of the patient-visit pairs in
  // PatientFirstVisits
  CALCULATE ( 
    DISTINCTCOUNT ( 'F_Visits'[Visit Id] ),
    KEEPFILTERS ( TREATAS ( PatientFirstVisits, 'F_Visits'[Customer Fk], 'F_Visits'[Visit Id] ) )
  )

Note on the fragile part, it is a best practice to construct fact tables of only FKs and aggregateable facts, i.e. do not have any descriptive attributes in the fact table. The reason the commented part is fragile is that you will need to add to the ALL any new descriptive columns you add to the fact that may end up in a visual with this measure. If you pull seller out to a dim, you can refactor as below:

First Visit = 
VAR PatientFirstVisits =
  ADDCOLUMNS (
    VALUES ( 'F_Visits'[Customer Fk] ),
    "MinVisitId", CALCULATE (
        MIN ( 'F_Visits'[Visit Id] ),
        ALL ( 'F_Visits'[Visit Id] ),
        ALLSELECTED ( 'D_Calendar' ),
        ALL ( 'Seller' )
    )
  )
RETURN
  CALCULATE ( 
    DISTINCTCOUNT ( 'F_Visits'[Visit Id] ),
    KEEPFILTERS ( TREATAS ( PatientFirstVisits, 'F_Visits'[Customer Fk], 'F_Visits'[Visit Id] ) )
  )

Based on my experience, columns within a table are about an order of magnitude more volatile than tables within a model. What I mean by this is that I typically see much more churn in columns (adding, removing, refactoring in some way) than churn in tables. By pulling out a 'Seller' dim, you can just do ALL ( 'Seller' ) and you don't have to worry about tracking columns.

And here's the measure in action:

Edit: A version based on the comment asking if we can do a COUNTROWS version rather than DISTINCTCOUNT. The VAR would be the same, so I'm just showing the count after the RETURN.

...
RETURN
  // Count the Visit Ids that are associated with one of the patient-visit pairs in
  // PatientFirstVisits
  COUNTROWS ( 
    INTERSECT (
      GROUPBY( 'F_visits', 'F_visits'[Customer FK], 'F_visits'[Visit Id] ),
      PatientFirstVisits
    )
  )

Here we just intersect the VAR table and the values in context from the fact. That intersect is 1 row or empty for detail rows, and it is the total without being filtered by 'F_Visits'[Visit Id].

这篇关于DAX度量:仅动态考虑每个组的优先级,然后整体返回其计数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆