DAX Measure:仅首先动态考虑每个组,然后总体返回其计数 [英] DAX Measure: dynamically consider only first for each group and and overall returns their count

查看:15
本文介绍了DAX Measure:仅首先动态考虑每个组,然后总体返回其计数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们正在努力解决一个可能比我想象的更简单的问题.我只是卡住了.

问题在于定义一个DAX Measure动态只计算每个患者在期间第一次访问>.在此期间不是第一次的访问必须分配 BLANK() 值.请在讨论目标之前让我介绍一下场景.

  1. 场景:
    我们有一个由一个事实表 (F_Visits) 和两个维度 (D_Customer; D_Calendar) 组成的模型.F_Visits 有 1M 条记录,包含客户对服装店的所有访问.它的字段是:
    Visit Id:增量唯一编号,是表的自然键
    Customer Fk:事实表中维度 D_Customer 的外键
    Calendar Fk:事实表中维度 D_Calendar 的外键
    数量:访问中购买的布片数量
    金额:访问中花费的金额
    Seller:为客户服务的员工姓名

  2. 目标:
    创建一个度量:

    • 动态(基于时间段选择)
    • 对于每个客户的每次就诊如果是该患者在所选时间段内的第一次就诊,则返回值 1(该时间段内的最小就诊 ID),忽略每个客户在所选时间段内不是第一次的访问
    • 总体(总计)此度量需要返回所选时间段内所有患者首次就诊的总数

过去两周我一直在搜索在线社区,但没有发现像我这样的问题.一个类似(但不同)的如下:

一个版本基于询问我们是否可以做一个 COUNTROWS 版本而不是 DISTINCTCOUNT 的评论.VAR 将是相同的,所以我只是在 RETURN 之后显示计数.

<代码>...返回//计算与其中一个患者就诊对相关联的就诊 ID//患者首次就诊乡村(相交 (GROUPBY('F_visits','F_visits'[客户 FK],'F_visits'[访问 ID]),患者至上))

这里我们只是将 VAR 表和上下文中的值与事实相交.该相交为 1 行或详细信息行为空,它是未经F_Visits"[Visit Id] 过滤的总数.

We are struggling in trying to solve a problem that might simpler than I think. I am just stuck.

The problem consists of defining a DAX Measure that dynamically counts only first visits in the period for each patient. Visits that are not first in the period must be assigned BLANK() value. Please before discussing the goal let me introduce the scenario.

  1. Scenario:
    We have a model composed of one fact table (F_Visits) and two dimensions (D_Customer; D_Calendar). F_Visits has 1M records and contains all visits made by customers into the Clothing shop. Its fields are:
    Visit Id: Incremental unique number which is the natural key of the table
    Customer Fk: Foreign Key of dimension D_Customer in the fact table
    Calendar Fk: Foreign Key of dimension D_Calendar in the fact table
    Quantity: Number of cloth pieces purchased in the visit
    Amount: Amount of dollars spent in the visit
    Seller: Employee name who served the customer

  2. Goal:
    Create a Measure that:

    • dynamically (based on the time period selection)
    • for each visit within each customer return the value 1 if it is the first for that patient in the selected period (minimum Visit Id in the period), ignoring visits that are not the first in the selected time period for each customer
    • overall (in total) this measure needs to return count of total number of first visits made by all patients in the selected time period

I have been searching on online communities for the last two weeks but found no problem like mine. A similar (but different) one is the following: Sum distinct values for first occurance in Power BI The main difference compared to that question is the measure for this problem needs to be BLANK() for visits that are not the first of that patient in the selected period.
3. Example:

+----------+-------------+-------------+----------+--------+--------+
| Visit Id | Customer FK | Calendar FK | Quantity | Amount | Seller |
+----------+-------------+-------------+----------+--------+--------+
|        1 | John        |    20170101 |        1 |     10 | Rick   |
|        2 | John        |    20180101 |        2 |     15 | Morty  |
|        3 | John        |    20180101 |        3 |     17 | Eric   |
|        4 | John        |    20190101 |        2 |     17 | Eric   |
|        5 | Mark        |    20170101 |        1 |     17 | Eric   |
|        6 | Mark        |    20180101 |        3 |     12 | Eric   |
|        7 | Jack        |    20190101 |        0 |      0 | Rick   |
+----------+-------------+-------------+----------+--------+--------+

  1. Solution: The user now filter for Calendar FK = 2018. This will exclude some visit and the measure is dynamically calculated to count only first vists in the selected period for each patient:

Detail Table:

+----------+-------------+-------------+----------+--------+--------+---------+
| Visit Id | Customer FK | Calendar FK | Quantity | Amount | Seller | Measure |
+----------+-------------+-------------+----------+--------+--------+---------+
|        2 | John        |    20180101 |        2 |     15 | Morty  | 1       |
|        3 | John        |    20180101 |        3 |     17 | Eric   | BLANK() |
|        6 | Mark        |    20180101 |        3 |     12 | Eric   | 1       |
+----------+-------------+-------------+----------+--------+--------+---------+

Total:

+-------+
| Total |
+-------+
|     2 |
+-------+

Total by Seller:

+--------+---------+
| Seller | Measure |
+--------+---------+
| Morty  | 1       |
| Eric   | 1       |
| Rick   | BLANK() |
+--------+---------+

There is no specific requirements about the approach except it needs to be dynamically calculated on front end.

Update

please refer to question DAX: avoid ALL() causing incorrect show of dimensional attribute in matrix

解决方案

First Visit =
// Make a table of all patients in context and their first visit id in context
VAR PatientFirstVisits =
  ADDCOLUMNS (
    VALUES ( 'F_Visits'[Customer Fk] ),
    "MinVisitId", CALCULATE (
        MIN ( 'F_Visits'[Visit Id] ),
        ALL ( 'F_Visits'[Visit Id], 'F_Visits'[Seller] ), // note: fragile
        ALLSELECTED ( 'D_Calendar' )
    )
  )
RETURN
  // Count the Visit Ids that are associated with one of the patient-visit pairs in
  // PatientFirstVisits
  CALCULATE ( 
    DISTINCTCOUNT ( 'F_Visits'[Visit Id] ),
    KEEPFILTERS ( TREATAS ( PatientFirstVisits, 'F_Visits'[Customer Fk], 'F_Visits'[Visit Id] ) )
  )

Note on the fragile part, it is a best practice to construct fact tables of only FKs and aggregateable facts, i.e. do not have any descriptive attributes in the fact table. The reason the commented part is fragile is that you will need to add to the ALL any new descriptive columns you add to the fact that may end up in a visual with this measure. If you pull seller out to a dim, you can refactor as below:

First Visit = 
VAR PatientFirstVisits =
  ADDCOLUMNS (
    VALUES ( 'F_Visits'[Customer Fk] ),
    "MinVisitId", CALCULATE (
        MIN ( 'F_Visits'[Visit Id] ),
        ALL ( 'F_Visits'[Visit Id] ),
        ALLSELECTED ( 'D_Calendar' ),
        ALL ( 'Seller' )
    )
  )
RETURN
  CALCULATE ( 
    DISTINCTCOUNT ( 'F_Visits'[Visit Id] ),
    KEEPFILTERS ( TREATAS ( PatientFirstVisits, 'F_Visits'[Customer Fk], 'F_Visits'[Visit Id] ) )
  )

Based on my experience, columns within a table are about an order of magnitude more volatile than tables within a model. What I mean by this is that I typically see much more churn in columns (adding, removing, refactoring in some way) than churn in tables. By pulling out a 'Seller' dim, you can just do ALL ( 'Seller' ) and you don't have to worry about tracking columns.

And here's the measure in action:

Edit: A version based on the comment asking if we can do a COUNTROWS version rather than DISTINCTCOUNT. The VAR would be the same, so I'm just showing the count after the RETURN.

...
RETURN
  // Count the Visit Ids that are associated with one of the patient-visit pairs in
  // PatientFirstVisits
  COUNTROWS ( 
    INTERSECT (
      GROUPBY( 'F_visits', 'F_visits'[Customer FK], 'F_visits'[Visit Id] ),
      PatientFirstVisits
    )
  )

Here we just intersect the VAR table and the values in context from the fact. That intersect is 1 row or empty for detail rows, and it is the total without being filtered by 'F_Visits'[Visit Id].

这篇关于DAX Measure:仅首先动态考虑每个组,然后总体返回其计数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆