关联独立事件的顺序-计算时间相交 [英] Correlate Sequences of Independent Events - Calculate Time Intersection
问题描述
我们正在构建PowerBI报告解决方案,我(以及Stack)解决了 ,企业提出了新的报告思路。由于我对PowerBI知之甚少,因此尚不确定最好的方法。
We are building a PowerBI reporting solution and I (well Stack) solved one problem and the business came up with a new reporting idea. Not sure of the best way to approach it as I know very little about PowerBI and the business seems to want quite complex reports.
我们有来自不同数据的两个事件序列资料来源。它们都包含车辆发生的独立事件。一个描述车辆在什么位置,另一个描述具有事件原因代码的事件。该企业希望报告由于各种原因在每个位置花费的时间。车辆可以完全独立于发生的事件事件来更改位置-事件实际上是日期时间,并且在整个一天的随机时间发生。每种类型的事件都有一个Startime / Endtime和一个VehicleID。
We have two sequences of events from separate data sources. They both contain independent events occurring to vehicles. One describes what location a vehicle is within - the other describes incident events which have a reason code for the incident. The business wants to report on time spent in each location for each reason. Vehicles can change location totally independent of the incident events occurring - and events actually are datetime and occur at random points throughtout day. Each type of event has a startime/endtime and a vehicleID.
车辆位置事件
+------------------+-----------+------------+-----------------+----------------+
| LocationDetailID | VehicleID | LocationID | StartDateTime | EndDateTime |
+------------------+-----------+------------+-----------------+----------------+
| 1 | 1 | 1 | 2012-1-1 | 2016-1-1 |
| 2 | 1 | 2 | 2016-1-1 | 2016-4-1 |
| 3 | 1 | 1 | 2016-4-1 | 2016-11-1 |
| 4 | 2 | 1 | 2011-1-1 | 2016-11-1 |
+------------------+-----------+------------+-----------------+----------------+
车辆状态事件
+---------+---------------+-------------+-----------+--------------+
| EventID | StartDateTime | EndDateTime | VehicleID | ReasonCodeID |
+---------+---------------+-------------+-----------+--------------+
| 1 | 2012-1-1 | 2013-1-1 | 1 | 1 |
| 2 | 2013-1-1 | 2015-1-1 | 1 | 3 |
| 3 | 2015-1-1 | 2016-5-1 | 1 | 4 |
| 4 | 2016-5-1 | 2016-11-1 | 1 | 2 |
| 5 | 2015-9-1 | 2016-2-1 | 2 | 1 |
+---------+---------------+-------------+-----------+--------------+
无论如何,我可以将两个流关联在一起,并根据每个位置的ReasonCode计算每辆车的总时间吗?这似乎要求我能够关联这两个事件-因此,在给定的ReasonCode中可能会发生位置更改。
Is there anyway I can correlate the two streams together and calculate total time per Vehicle per ReasonCode per location? This would seem to require me to be able to relate the two events - so a change of location may occur part way through a given ReasonCode.
计算示例ReasonCodeID 4
- VehicleID 1的位置ID为2012-1-1至2016-1-1和
2016-4-1至2016-11-1 - VehicleID 1位于位置ID 2中,从2016-1-1
至2016-4-1 - VehcileID 1具有从2015-1-1到
2016-5-1的ReasonCodeID 4
因此,位置1中的第一个期间与365天的ReasonCodeID 4相交(2015-1-1至2016-1-1)。位置1中的第二个周期与30天(2016-4-1至2016-5-1)相交。
在位置2与91天的ReasonCodeID 4相交(2016-1-1至2016-4-1
Therefore first Period in location 1 intersects with 365 days of ReasonCodeID 4 (2015-1-1 to 2016-1-1). 2nd period in location 1 intersects with 30 days (2016-4-1 to 2016-5-1). In location 2 intersects with 91 days of ReasonCodeID 4(2016-1-1 to 2016-4-1
期望的输出如下)。 p>
Desired output would be the below.
+-----------+--------------+------------+------------+
| VehicleID | ReasonCodeID | LocationID | Total Days |
+-----------+--------------+------------+------------+
| 1 | 1 | 1 | 366 |
| 1 | 3 | 1 | 730 |
| 1 | 4 | 1 | 395 |
| 1 | 4 | 2 | 91 |
| 1 | 2 | 1 | 184 |
| 2 | 1 | 1 | 154 |
+-----------+--------------+------------+------------+
我创建了一个SQL提琴,该提琴显示了结构在这里
I have created a SQL fiddle that shows the structure here
车辆有相关表格,我敢肯定生意会nt按车辆类别等对它们进行分组,但是如果我能理解这种情况下的交点,那将为我提供其余报告的基础。
Vehicles have related tables and I'm sure the business will want them grouped by vehicle class etc but if I can understand how to calculate the intersection points in this case that would give me the basis for rest of reporting.
推荐答案
我认为此解决方案需要 CROSS JOIN
实现。两个表之间的关系是多对多
,这意味着创建了第三个表,它们桥接了 LocationEvents
和 VehicleStatusEvents
表,所以我认为在表达式中指定关系会更容易。
I think this solution requires a CROSS JOIN
implementation. The relationship between both tables is Many to Many
which implies the creation of a third table that bridges LocationEvents
and VehicleStatusEvents
tables so I think specifying the relationship in the expression could be easier.
我在两个表之间都使用了CROSS JOIN,然后仅过滤结果以获取两个表中VehicleID列相同的行。我还过滤了VehicleStatusEvents范围日期与LocationEvents范围日期相交的行。
I use a CROSS JOIN between both tables, then filter the results only to get those rows which VehicleID columns are the same in both tables. I am also filtering the rows that VehicleStatusEvents range dates intersects LocationEvents range dates.
一旦过滤完成,我将添加一列以计算每个交叉点之间的天数。最后,该度量汇总每个VehicleID,ReasonCodeID和LocationID的天数。
Once the filtering is done I am adding a column to calculate the count of days between each intersection. Finally, the measure sums up the days for each VehicleID, ReasonCodeID and LocationID.
要实施交叉加入,您必须重命名两个表中任何一个上的VehicleID
, StartDateTime
和 EndDateTime
。为了避免歧义的列名错误,这是必要的。
In order to implement the CROSS JOIN you will have to rename the VehicleID
, StartDateTime
and EndDateTime
on any of both tables. It is necessary for avoiding ambigous column names errors.
我按以下方式重命名列:
I rename the columns as follows:
VehicleID
: LocationVehicleID
和 StatusVehicleID
StartDateTime
: LocationStartDateTime
和 StatusStartDateTime
EndDateTime
: LocationEndDateTime
和 StatusEndDateTime
VehicleID
: LocationVehicleID
and StatusVehicleID
StartDateTime
: LocationStartDateTime
and StatusStartDateTime
EndDateTime
: LocationEndDateTime
and StatusEndDateTime
之后,您可以在总天数
度量中使用CROSSJOIN:
After this you can use CROSSJOIN in the Total Days
measure:
Total Days =
SUMX (
FILTER (
ADDCOLUMNS (
FILTER (
CROSSJOIN ( LocationEvents, VehicleStatusEvents ),
LocationEvents[LocationVehicleID] = VehicleStatusEvents[StatusVehicleID]
&& LocationEvents[LocationStartDateTime] <= VehicleStatusEvents[StatusEndDateTime]
&& LocationEvents[LocationEndDateTime] >= VehicleStatusEvents[StatusStartDateTime]
),
"CountOfDays", IF (
[LocationStartDateTime] <= [StatusStartDateTime]
&& [LocationEndDateTime] >= [StatusEndDateTime],
DATEDIFF ( [StatusStartDateTime], [StatusEndDateTime], DAY ),
IF (
[LocationStartDateTime] > [StatusStartDateTime]
&& [LocationEndDateTime] >= [StatusEndDateTime],
DATEDIFF ( [LocationStartDateTime], [StatusEndDateTime], DAY ),
IF (
[LocationStartDateTime] <= [StatusStartDateTime]
&& [LocationEndDateTime] <= [StatusEndDateTime],
DATEDIFF ( [StatusStartDateTime], [LocationEndDateTime], DAY ),
IF (
[LocationStartDateTime] >= [StatusStartDateTime]
&& [LocationEndDateTime] <= [StatusEndDateTime],
DATEDIFF ( [LocationStartDateTime], [LocationEndDateTime], DAY ),
BLANK ()
)
)
)
)
),
LocationEvents[LocationID] = [LocationID]
&& VehicleStatusEvents[ReasonCodeID] = [ReasonCodeID]
),
[CountOfDays]
)
然后在Power BI中,您可以使用此度量来构建矩阵(或任何其他可视化对象):
Then in Power BI you can build a matrix (or any other visualization) using this measure:
如果您不完全理解度量表达式,则为T-SQL翻译:
If you don't understand completely the measure expression, here is the T-SQL translation:
SELECT
dt.VehicleID,
dt.ReasonCodeID,
dt.LocationID,
SUM(dt.Diff) [Total Days]
FROM
(
SELECT
CASE
WHEN a.StartDateTime <= b.StartDateTime AND a.EndDateTime >= b.EndDateTime -- Inside range
THEN DATEDIFF(DAY, b.StartDateTime, b.EndDateTime)
WHEN a.StartDateTime > b.StartDateTime AND a.EndDateTime >= b.EndDateTime -- |-----|*****|....|
THEN DATEDIFF(DAY, a.StartDateTime, b.EndDateTime)
WHEN a.StartDateTime <= b.StartDateTime AND a.EndDateTime <= b.EndDateTime -- |...|****|-----|
THEN DATEDIFF(DAY, b.StartDateTime, a.EndDateTime)
WHEN a.StartDateTime >= b.StartDateTime AND a.EndDateTime <= b.EndDateTime -- |---|****|-----
THEN DATEDIFF(DAY, a.StartDateTime, a.EndDateTime)
END Diff,
a.VehicleID,
b.ReasonCodeID,
a.LocationID --a.StartDateTime, a.EndDateTime, b.StartDateTime, b.EndDateTime
FROM LocationEvents a
CROSS JOIN VehicleStatusEvents b
WHERE a.VehicleID = b.VehicleID
AND
(
(a.StartDateTime <= b.EndDateTime)
AND (a.EndDateTime >= b.StartDateTime)
)
) dt
GROUP BY dt.VehicleID,
dt.ReasonCodeID,
dt.LocationID
在T-SQL中请注意,您也可以使用 INNER JOIN
运算符。
Note in T-SQL you could use an INNER JOIN
operator too.
让我知道如果有帮助。
这篇关于关联独立事件的顺序-计算时间相交的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!