How to evaluate Application Insights requests' "own" duration, without considering the duration of dependencies?
Problem description
I'm trying to produce a Kusto query to measure the "own" duration of requests (subtracting the durations of their dependencies). However, I can't figure out how to do this with a pure Kusto query.
To better illustrate the expected result, here is a sample case:
High-level view (where R is the request and Dx are the dependencies):
R    =============================== (31ms)
D1     *******                       (7ms)
D2          ********                 (8ms)
D3                        ******     (6ms)
D4                          **       (2ms)
D5         ****                      (4ms)
Proj ==*************======******====
- D1 overlaps D2 for 2ms
- D5 and D4 shouldn't be taken into account, as they are completely overlapped by other dependencies
- Proj is a projection of a potential intermediate step where only meaningful dependency segments are shown
Given the following testbed dataset:
let reqs = datatable (timestamp: datetime, id:string, duration: real)
[
datetime("2020-12-15T08:00:00.000Z"), "r1", 31 // R
];
let deps = datatable (timestamp: datetime, operation_ParentId:string, duration: real)
[
datetime("2020-12-15T08:00:00.002Z"), "r1", 7, // D1
datetime("2020-12-15T08:00:00.007Z"), "r1", 8, // D2
datetime("2020-12-15T08:00:00.021Z"), "r1", 6, // D3
datetime("2020-12-15T08:00:00.023Z"), "r1", 2, // D4
datetime("2020-12-15T08:00:00.006Z"), "r1", 4, // D5
];
In this particular case, the Kusto query, joining the two data tables, should be able to retrieve 12 (the duration of the request with all dependencies removed), i.e.:
Expected total duration = 31 - (7 + 8 - 2) - (6) = 12
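The expected value can be cross-checked outside Kusto. The following is a minimal Python sketch, not a Kusto solution: it rebuilds the Proj line by marking every millisecond of the request that is covered by at least one dependency, then counts the uncovered ones. The names `REQUEST_MS`, `DEPS`, and `projection` are made up for this illustration, and the offsets are the millisecond parts of the timestamps in the testbed dataset.

```python
# Offsets (ms from the request start) and durations (ms), copied from the testbed dataset.
REQUEST_MS = 31
DEPS = {"D1": (2, 7), "D2": (7, 8), "D3": (21, 6), "D4": (23, 2), "D5": (6, 4)}


def projection(request_ms, deps):
    """Return one character per millisecond: '*' if any dependency is running, '=' otherwise."""
    covered = [False] * request_ms
    for start, duration in deps.values():
        # Clamp to the request length so a late dependency cannot index out of range.
        for ms in range(start, min(start + duration, request_ms)):
            covered[ms] = True
    return "".join("*" if c else "=" for c in covered)


proj = projection(REQUEST_MS, DEPS)
own_ms = proj.count("=")  # milliseconds not covered by any dependency
print(proj)    # ==*************======******====
print(own_ms)  # 12
```

This brute-force grid only works at a fixed resolution (1ms here), which is why an interval-based approach is what the query needs.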
Any help to move this forward would be greatly appreciated <3
Recommended answer
I managed to solve this using row_window_session(). This is a window function; you can read more about it in the Window functions overview.
The solution is:
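// Note: the request id column is named operation_ParentId here (not id, as in the question)
// so that the join key has the same name in both tables.
let reqs = datatable (timestamp: datetime, operation_ParentId:string, duration: real)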
[
datetime("2020-12-15T08:00:00.000Z"), "r1", 31 // R
];
let deps = datatable (timestamp: datetime, operation_ParentId:string, duration: real)
[
datetime("2020-12-15T08:00:00.002Z"), "r1", 7, // D1
datetime("2020-12-15T08:00:00.007Z"), "r1", 8, // D2
datetime("2020-12-15T08:00:00.021Z"), "r1", 6, // D3
datetime("2020-12-15T08:00:00.006Z"), "r1", 4, // D5
datetime("2020-12-15T08:00:00.023Z"), "r1", 2, // D4
];
deps
| extend endTime = timestamp + totimespan(duration * 10000)
| sort by timestamp asc
| serialize | extend SessionStarted = row_window_session(timestamp, 1h, 1h, timestamp > prev(endTime))
| summarize max(endTime) by operation_ParentId, SessionStarted
| extend diff = max_endTime - SessionStarted
| summarize todouble(sum(diff)) by operation_ParentId
| join reqs on operation_ParentId
| extend diff = duration - sum_diff / 10000
| project diff
The idea here is to sort the entries by start time and, as long as the previous entry's end time is later than the current entry's start time, not to open a new session. Let's go through each line of the query to see how this is done:
- Calculate the endTime based on the duration. To normalize the data, I multiply the duration (in milliseconds) by 10000, which gives 100-nanosecond ticks:
| extend endTime = timestamp + totimespan(duration * 10000)
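- Sort by timestamp, so that prev() refers to the dependency that started just before the current one:
| sort by timestamp asc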
- Build the sessions with row_window_session(). The session start is calculated on the timestamp column. The next two parameters are limits on when to start a new bucket. Since we don't want to seal a bucket based on elapsed time, I provided 1 hour, which this input never reaches. The fourth argument lets us start a new session based on the data: a row for which timestamp > prev(endTime) is true opens a new session, and all rows up to the next such row share the same session start time:
| serialize | extend SessionStarted = row_window_session(timestamp, 1h, 1h, timestamp > prev(endTime))
- Take the latest end time of each session, grouping by the session start and by operation_ParentId to later on join on that key:
| summarize max(endTime) by operation_ParentId, SessionStarted
| extend diff = max_endTime - SessionStarted
| summarize todouble(sum(diff)) by operation_ParentId
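- Join with reqs to get the total duration of the request, and subtract the time covered by dependencies (converted back from ticks to milliseconds):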
| join reqs on operation_ParentId
| extend diff = duration - sum_diff / 10000
| project diff
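To make the session logic easier to follow, here is a rough Python sketch that mirrors the steps above on the testbed data. It is an illustration under assumptions, not a translation of how Kusto executes the query: the function name `own_duration_prev_based` is hypothetical, times are plain millisecond offsets rather than datetimes and ticks, and the two 1h limits of row_window_session() are ignored because they are never reached with this input.

```python
REQUEST_MS = 31
# (start offset in ms, duration in ms) for D1, D2, D3, D5, D4, in the order used in deps.
DEPS = [(2, 7), (7, 8), (21, 6), (6, 4), (23, 2)]


def own_duration_prev_based(request_ms, deps):
    # extend endTime, then sort by timestamp asc
    rows = sorted((start, start + duration) for start, duration in deps)
    session_end = {}      # SessionStarted -> max(endTime)
    session_start = None
    prev_end = None
    for start, end in rows:
        # Restart condition of the query: timestamp > prev(endTime).
        if prev_end is None or start > prev_end:
            session_start = start
        # summarize max(endTime) by SessionStarted
        session_end[session_start] = max(session_end.get(session_start, end), end)
        prev_end = end
    # extend diff = max_endTime - SessionStarted, then sum(diff)
    covered = sum(end - start for start, end in session_end.items())
    # join with the request and subtract
    return request_ms - covered


print(own_duration_prev_based(REQUEST_MS, DEPS))  # 12
```

With this data the sketch finds two sessions, 2ms to 15ms and 21ms to 27ms, which cover 19ms of the 31ms request.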
You can find this query running at the Kusto Samples open database.
Having said that, please note that this is a linear operation: each row is only compared with the row immediately before it. This means that if two consecutive segments belong to the same session but do not intersect each other, the query will fail. For example, adding the following to deps:
datetime("2020-12-15T08:00:00.026Z"), "r1", 1, // D6
This row should not add anything to the calculation, yet it causes the query to misbehave. This is because D4 is the previous row, and it has no point of contact with D6, although D3 covers them both.
To solve that, you need to repeat the same logic of steps 3-5 (the row_window_session, summarize max(endTime), and extend diff lines) on the result. Unfortunately, Kusto does not have recursion, so you cannot solve this for every kind of input. But assuming such cases are never nested deeply enough to break this logic, I think it is good enough.
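The D6 case can be reproduced with a small Python sketch. It compares the start of each row either with the end of the previous row only (as the query does) or with the largest end time seen so far. The running-maximum variant is a standard interval-merging technique added here purely for comparison; it is not part of the query above, and this sketch makes no claim about how to express it in Kusto. The names `covered_ms` and `use_running_max` are hypothetical, and times are millisecond offsets from the request start.

```python
REQUEST_MS = 31
DEPS = [(2, 7), (7, 8), (21, 6), (6, 4), (23, 2)]  # D1, D2, D3, D5, D4
D6 = (26, 1)  # fully covered by D3, but does not touch D4


def covered_ms(deps, use_running_max):
    """Sum the session lengths; a row opens a new session when it starts after `boundary`."""
    rows = sorted((start, start + duration) for start, duration in deps)
    total = 0
    session_start = session_end = None
    boundary = None  # end time the next row's start is compared against
    for start, end in rows:
        if boundary is None or start > boundary:
            if session_start is not None:
                total += session_end - session_start  # close the previous session
            session_start, session_end = start, end
        else:
            session_end = max(session_end, end)
        if use_running_max and boundary is not None:
            boundary = max(boundary, end)  # largest end time seen so far
        else:
            boundary = end                 # previous row only, like prev(endTime)
    if session_start is not None:
        total += session_end - session_start
    return total


print(REQUEST_MS - covered_ms(DEPS, use_running_max=False))         # 12
print(REQUEST_MS - covered_ms(DEPS + [D6], use_running_max=False))  # 11, D6 is counted twice
print(REQUEST_MS - covered_ms(DEPS + [D6], use_running_max=True))   # 12
```

With the previous-row comparison, D6 opens a third session from 26ms to 27ms even though D3 already covers that millisecond, so the covered time grows from 19ms to 20ms.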