Azure AKS Monitoring - custom dashboard resources


Question

I am trying to create a custom dashboard with specific data for an AKS cluster. I would like to assemble a dashboard with graphs of RAM and CPU usage for selected controllers and nodes and, if possible, the number of restarts per pod. How can I create custom graphs of the controllers' average resource usage?

Recommended answer

Click the "Logs" link on the left of your AKS cluster blade in the Azure portal. First make sure Insights is enabled by clicking "Insights": if it is, you'll see charts close to what you want; otherwise, you'll see onboarding instructions.

Use the following query to chart CPU utilization (95th percentile) for all containers in a given controller:

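// Time window to chart
let endDateTime = now();
let startDateTime = ago(14d);
// Width of each time bucket on the chart
let trendBinSize = 1d;
// Perf counters: the container's CPU limit and its actual CPU usage
let capacityCounterName = 'cpuLimitNanoCores';
let usageCounterName = 'cpuUsageNanoCores';
// Replace with your own cluster and controller names
let clusterName = 'coin-test-i';
let controllerName = 'kube-svc-redirect';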
KubePodInventory
| where TimeGenerated < endDateTime
| where TimeGenerated >= startDateTime
| where ClusterName == clusterName
| where ControllerName == controllerName
| extend InstanceName = strcat(ClusterId, '/', ContainerName), 
         ContainerName = strcat(controllerName, '/', tostring(split(ContainerName, '/')[1]))
| distinct Computer, InstanceName, ContainerName
| join hint.strategy=shuffle (
    Perf
    | where TimeGenerated < endDateTime
    | where TimeGenerated >= startDateTime
    | where ObjectName == 'K8SContainer'
    | where CounterName == capacityCounterName
    | summarize LimitValue = max(CounterValue) by Computer, InstanceName, bin(TimeGenerated, trendBinSize)
    | project Computer, InstanceName, LimitStartTime = TimeGenerated, LimitEndTime = TimeGenerated + trendBinSize, LimitValue
) on Computer, InstanceName
| join kind=inner hint.strategy=shuffle (
    Perf
    | where TimeGenerated < endDateTime + trendBinSize
    | where TimeGenerated >= startDateTime - trendBinSize
    | where ObjectName == 'K8SContainer'
    | where CounterName == usageCounterName
    | project Computer, InstanceName, UsageValue = CounterValue, TimeGenerated
) on Computer, InstanceName
| where TimeGenerated >= LimitStartTime and TimeGenerated < LimitEndTime
| project Computer, ContainerName, TimeGenerated, UsagePercent = UsageValue * 100.0 / LimitValue
| summarize P95 = percentile(UsagePercent, 95) by bin(TimeGenerated, trendBinSize) , ContainerName
| render timechart

Replace the cluster name and controller name with your own. You can also adjust the start/end time parameters and the bin size, or use max/min/avg in place of the 95th percentile.
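
If you are not sure which names to plug in, a small lookup query can list what the workspace has actually recorded. This is only a sketch: it assumes Container insights is populating KubePodInventory with the ClusterName, ControllerKind and ControllerName columns, and the one-day window is an arbitrary choice.

```kusto
// Sketch: list the cluster and controller names seen in the last day.
// The 1d lookback is an assumption; widen it if the result is empty.
KubePodInventory
| where TimeGenerated >= ago(1d)
| where isnotempty(ControllerName)
| distinct ClusterName, ControllerKind, ControllerName
| order by ClusterName asc, ControllerName asc
```

Copy a ClusterName and ControllerName pair from the result into the `clusterName` and `controllerName` variables of the query above.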

For memory metrics, replace the counter names with:

let capacityCounterName = 'memoryLimitBytes';
let usageCounterName = 'memoryRssBytes';
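
The question also asks about restarts per pod, which the queries above do not cover. One possible approach, offered as a sketch rather than a tested recipe, is to read the restart counter that Container insights records in KubePodInventory. It assumes the PodRestartCount, Name and Namespace columns are present in your workspace, and it reuses the example cluster name from above.

```kusto
// Sketch: highest restart count reported for each pod in the time window.
// Assumes KubePodInventory exposes PodRestartCount (a running total per pod).
let startDateTime = ago(14d);
let clusterName = 'coin-test-i';   // example name from above; replace with yours
KubePodInventory
| where TimeGenerated >= startDateTime
| where ClusterName == clusterName
| summarize Restarts = max(PodRestartCount) by Name, Namespace
| order by Restarts desc
```

Since the counter is treated here as a running total, `max` gives the latest value per pod; to see when restarts happened, you could add `bin(TimeGenerated, 1d)` to the `by` clause instead.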
