Pod CPU Throttling

Problem description

I'm experiencing a strange issue when using CPU Requests/Limits in Kubernetes. Prior to setting any CPU Requests/Limits at all, all my services performed very well. I recently started placing some Resource Quotas to avoid future resource starvation. These values were set based on the actual usage of those services, but to my surprise, after they were added, some services started to increase their response time drastically. My first guess was that I might have placed the wrong Requests/Limits, but looking at the metrics revealed that in fact none of the services facing this issue were near those values. In fact, some of them were closer to the Requests than the Limits.
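For reference, this is the kind of change involved (a minimal sketch using the 50m/250m values from the screenshots below; the pod, container, and image names are hypothetical):

```
# Hypothetical pod spec with the request/limit values from the metrics below
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: my-service            # hypothetical name
spec:
  containers:
  - name: app                 # hypothetical name
    image: my-service:latest  # hypothetical image
    resources:
      requests:
        cpu: 50m              # scheduling weight / guaranteed share
      limits:
        cpu: 250m             # hard ceiling enforced by the CFS quota
EOF
```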

Then I started looking at CPU throttling metrics and found that all my pods are being throttled. I then increased the limits for one of the services to 1000m (from 250m) and I saw less throttling in that pod, but I don't understand why I should set that higher limit if the pod wasn't reaching its old limit (250m).
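One way to confirm what those dashboards show (a sketch assuming cgroup v1 and cAdvisor-backed Prometheus metrics) is to read the CFS throttling counters directly:

```
# From inside the container (cgroup v1 path):
cat /sys/fs/cgroup/cpu/cpu.stat
# nr_periods     -- elapsed CFS enforcement periods
# nr_throttled   -- periods in which the container hit its quota
# throttled_time -- total time spent throttled, in nanoseconds

# Or as a ratio via the cAdvisor metrics in Prometheus:
#   rate(container_cpu_cfs_throttled_periods_total[5m])
#     / rate(container_cpu_cfs_periods_total[5m])
```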

So my question is: why are my pods being throttled if they are not reaching the CPU limit? And why does my response time increase if the pods are not using their full capacity?

Here are some screenshots of my metrics (CPU Request: 50m, CPU Limit: 250m):

CPU usage (here we can see that this pod's CPU never reached its 250m limit):

CPU throttling:

After setting this pod's limit to 1000m, we can observe less throttling:

kubectl top

P.S: Before setting these Requests/Limits there wasn't throttling at all (as expected)

P.S 2: None of my nodes are facing high usage. In fact, none of them are using more than 50% of CPU at any time.

Thanks in advance!

Answer

If you look at the documentation, you'll see that when you issue a Request for CPUs it actually uses the --cpu-shares option in Docker, which in turn sets the cpu.shares attribute of the cpu,cpuacct cgroup on Linux. So a value of 50m comes out to about --cpu-shares=51, based on one full CPU (1000m) being 1024 shares. 1024 represents 100% of the shares, so 51 would be 4-5% of them. That's pretty low to begin with. But the important factor here is that this is relative to how many pods/containers you have on your node and what cpu-shares those have (are they using the default?).
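That conversion is easy to verify (a sketch; the cgroup path assumes cgroup v1, and <container-id> is a placeholder):

```
# The kubelet's conversion: millicores * 1024 / 1000
echo $((50 * 1024 / 1000))    # prints 51

# What Docker was actually told:
docker inspect --format '{{.HostConfig.CpuShares}}' <container-id>

# What the cgroup ended up with (cgroup v1):
cat /sys/fs/cgroup/cpu/cpu.shares
```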

So let's say that on your node you have another pod/container with 1024 shares (the default) and this pod/container with 4-5 shares. Under contention, this container will get about 0.5% of the CPU, while the other pod/container will get about 99.5% of it (if it has no limits). So again it all depends on how many pods/containers you have on the node and what their shares are.
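You can reproduce that contention with plain Docker (a sketch; pinning both containers to the same core forces them to compete):

```
# Two busy-loops on the same core with very different weights:
docker run -d --name heavy --cpuset-cpus=0 --cpu-shares=1024 \
  busybox sh -c 'while :; do :; done'
docker run -d --name light --cpuset-cpus=0 --cpu-shares=51 \
  busybox sh -c 'while :; do :; done'

# Under contention the core splits roughly 95%/5% (1024 vs 51 shares).
# If 'heavy' goes idle, 'light' may use the whole core: shares weight,
# they do not cap.
docker stats heavy light

docker rm -f heavy light    # cleanup
```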

Also, not very well documented in the Kubernetes docs, but if you use a Limit on a pod it's basically using two flags in Docker: --cpu-period and --cpu-quota, which actually set the cpu.cfs_period_us and cpu.cfs_quota_us attributes of the cpu,cpuacct cgroup on Linux. This was introduced because cpu.shares didn't provide a limit, so you'd end up with cases where containers would grab most of the CPU.
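With the defaults, a 250m limit works out to a quota of 25000µs per 100000µs period: the container may run for at most 25ms of CPU time in each 100ms window, after which it is paused until the next period begins. Again easy to check (a sketch assuming cgroup v1):

```
# From inside the container (cgroup v1 paths):
cat /sys/fs/cgroup/cpu/cpu.cfs_period_us   # 100000 (100ms, the default)
cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us    # 25000  (250m -> 250 * 100us)

# Every time the quota is exhausted before the period ends, nr_throttled
# in cpu.stat goes up and the container waits for the next 100ms window.
```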

So, as far as this limit is concerned, you will never hit it if there are other containers on the same node that have no limits (or higher limits) but higher cpu.shares, because the scheduler will favor them and hand them the idle CPU. This could be what you are seeing, but again it depends on your specific case.

All of the above is explained here.
