Docker daemon/container real-time scheduling with Ubuntu (Linux) host
Question
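Before I start: I was torn between asking this on Super User or Stack Overflow, so apologies in advance if it is in the wrong place.
I have a Docker container (containing a C/C++ executable) that performs audio/video processing. I would therefore like to test the benefit of running the container with RT scheduling constraints. Searching online, I found various pieces of information, but I am struggling to fit them all together.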
System environment:
- Host: Ubuntu (stock) Zesty 17.04 (no RT kernel patch, kernel: 4.10.0-35-generic)
- Docker version: 17.05.0-ce
- Docker image OS: Ubuntu Zesty 17.04
In the executable nested within the Docker image/container, the following code is executed to change the scheduler from 'SCHED_OTHER' to 'SCHED_FIFO' (see docs):
// Needs <sched.h>, <cerrno>, <cstring>, <iostream> and <boost/algorithm/clamp.hpp>.
struct sched_param sched = {};
const int nMin = sched_get_priority_min(SCHED_FIFO);
const int nMax = sched_get_priority_max(SCHED_FIFO);
const int nHlf = (nMax - nMin) / 2;
// Aim just above the midpoint of the SCHED_FIFO range, clamped to [nMin, nMax].
const int nPriority = nMin + nHlf + 1;
sched.sched_priority = boost::algorithm::clamp(nPriority, nMin, nMax);
// A pid of 0 applies the policy to the calling thread.
if (sched_setscheduler(0, SCHED_FIFO, &sched) < 0)
    std::cerr << "SETSCHEDULER failed - err = " << strerror(errno) << std::endl;
else
    std::cout << "Priority set to \"" << sched.sched_priority << "\"" << std::endl;
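I have been reading various Docker docs on using the real-time scheduler. One interesting page states:
Verify that CONFIG_RT_GROUP_SCHED is enabled in the Linux kernel by running zcat /proc/config.gz | grep CONFIG_RT_GROUP_SCHED or by checking for the existence of the file /sys/fs/cgroup/cpu.rt_runtime_us. For guidance on configuring the kernel real-time scheduler, consult the documentation for your operating system.
Following this advice, the stock Ubuntu Zesty 17.04 OS appears to fail these checks.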
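First question: Can I not use the RT scheduler? What is 'CONFIG_RT_GROUP_SCHED'? One thing that confuses me is that there are older posts online, from around 2010-2012, about patching the kernel with the RT patch. Since then, there appears to have been some soft-RT-related work in the Linux kernel itself.
The quote here prompted my question:
However, starting with kernel version 2.6.18, Linux has gradually been equipped with real-time capabilities, most of which derive from the former real-time preemption patches developed by Ingo Molnar, Thomas Gleixner, Steven Rostedt, and others. Until the patches have been completely merged into the mainline kernel (expected around kernel version 2.6.30), the patches must be installed to achieve optimal real-time performance. These patches are named:
Continuing...
After reading further, I noticed that setting ulimits is important. I changed /etc/security/limits.conf: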
#* soft core 0
#root hard core 100000
#* hard rss 10000
# NEW ADDITION
gavin hard rtprio 99
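Second question: Is the above required for the docker daemon to run with RT? The daemon appears to be controlled through systemd.
I continued investigating and found the following snippet on the same Docker documentation page:
To run containers using the real-time scheduler, run the Docker daemon with the --cpu-rt-runtime flag set to the maximum number of microseconds reserved for real-time tasks per runtime period. For instance, with the default period of 1000000 microseconds (1 second), setting --cpu-rt-runtime=950000 ensures that containers using the real-time scheduler can run for 950000 microseconds for every 1000000-microsecond period, leaving at least 50000 microseconds available for non-real-time tasks. To make this configuration permanent on systems which use systemd, see Control and configure Docker with systemd.
Following this page, I found two daemon parameters of interest: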
--cpu-rt-period int Limit the CPU real-time period in microseconds
--cpu-rt-runtime int Limit the CPU real-time runtime in microseconds
The same page indicates that docker daemon parameters can be specified via '/etc/docker/daemon.json', so I tried:
{
"cpu-rt-period": 92500,
"cpu-rt-runtime": 100000
}
Note: the documentation does not list the above options among the "allowed configuration options on Linux". I wanted to try them anyway.
Output of the Docker daemon on restart:
-- Logs begin at Wed 2017-10-04 09:58:38 BST, end at Wed 2017-10-04 10:01:32 BST. --
Oct 04 09:58:47 gavin systemd[1]: Starting Docker Application Container Engine...
Oct 04 09:58:47 gavin dockerd[1501]: time="2017-10-04T09:58:47.885882588+01:00" level=info msg="libcontainerd: new containerd process, pid: 1531"
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.053986072+01:00" level=warning msg="failed to rename /var/lib/docker/tmp for background deletion: %!s(<nil>).
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.161303803+01:00" level=info msg="[graphdriver] using prior storage driver: aufs"
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.303409053+01:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.304002725+01:00" level=warning msg="Your kernel does not support swap memory limit"
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.304078792+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.304201239+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.305534113+01:00" level=info msg="Loading containers: start."
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.730193030+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemo
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.784938130+01:00" level=info msg="Loading containers: done."
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.888035017+01:00" level=info msg="Daemon has completed initialization"
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.888104120+01:00" level=info msg="Docker daemon" commit=89658be graphdriver=aufs version=17.05.0-ce
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.903280645+01:00" level=info msg="API listen on /var/run/docker.sock"
Oct 04 09:58:48 gavin systemd[1]: Started Docker Application Container Engine.
Specific lines of interest:
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.304078792+01:00" level=warning msg="Your kernel does not support cgroup rt period"
Oct 04 09:58:48 gavin dockerd[1501]: time="2017-10-04T09:58:48.304201239+01:00" level=warning msg="Your kernel does not support cgroup rt runtime"
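Given my earlier findings, this is not surprising.
Final question: When this eventually works, how can I determine that my container is really running with RT scheduling? Is something like 'top' sufficient?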
EDIT: I ran the kernel diagnostic script from Moby on GitHub. This is the output:
warning: /proc/config.gz does not exist, searching other paths for kernel config ...
info: reading kernel config from /boot/config-4.10.0-35-generic ...
Generally Necessary:
- cgroup hierarchy: properly mounted [/sys/fs/cgroup]
- apparmor: enabled and tools installed
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: enabled
- CONFIG_KEYS: enabled
- CONFIG_VETH: enabled (as module)
- CONFIG_BRIDGE: enabled (as module)
- CONFIG_BRIDGE_NETFILTER: enabled (as module)
- CONFIG_NF_NAT_IPV4: enabled (as module)
- CONFIG_IP_NF_FILTER: enabled (as module)
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_IPVS: enabled (as module)
- CONFIG_IP_NF_NAT: enabled (as module)
- CONFIG_NF_NAT: enabled (as module)
- CONFIG_NF_NAT_NEEDED: enabled
- CONFIG_POSIX_MQUEUE: enabled
Optional Features:
- CONFIG_USER_NS: enabled
- CONFIG_SECCOMP: enabled
- CONFIG_CGROUP_PIDS: enabled
- CONFIG_MEMCG_SWAP: enabled
- CONFIG_MEMCG_SWAP_ENABLED: missing
(cgroup swap accounting is currently not enabled, you can enable it by setting boot option "swapaccount=1")
- CONFIG_LEGACY_VSYSCALL_EMULATE: enabled
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_IOSCHED_CFQ: enabled
- CONFIG_CFQ_GROUP_IOSCHED: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: enabled
- CONFIG_NET_CLS_CGROUP: enabled (as module)
- CONFIG_CGROUP_NET_PRIO: enabled
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: missing
- CONFIG_IP_VS: enabled (as module)
- CONFIG_IP_VS_NFCT: enabled
- CONFIG_IP_VS_RR: enabled (as module)
- CONFIG_EXT4_FS: enabled
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
- "overlay":
- CONFIG_VXLAN: enabled (as module)
Optional (for encrypted networks):
- CONFIG_CRYPTO: enabled
- CONFIG_CRYPTO_AEAD: enabled
- CONFIG_CRYPTO_GCM: enabled (as module)
- CONFIG_CRYPTO_SEQIV: enabled
- CONFIG_CRYPTO_GHASH: enabled (as module)
- CONFIG_XFRM: enabled
- CONFIG_XFRM_USER: enabled (as module)
- CONFIG_XFRM_ALGO: enabled (as module)
- CONFIG_INET_ESP: enabled (as module)
- CONFIG_INET_XFRM_MODE_TRANSPORT: enabled (as module)
- "ipvlan":
- CONFIG_IPVLAN: enabled (as module)
- "macvlan":
- CONFIG_MACVLAN: enabled (as module)
- CONFIG_DUMMY: enabled (as module)
- "ftp,tftp client in container":
- CONFIG_NF_NAT_FTP: enabled (as module)
- CONFIG_NF_CONNTRACK_FTP: enabled (as module)
- CONFIG_NF_NAT_TFTP: enabled (as module)
- CONFIG_NF_CONNTRACK_TFTP: enabled (as module)
- Storage Drivers:
- "aufs":
- CONFIG_AUFS_FS: enabled (as module)
- "btrfs":
- CONFIG_BTRFS_FS: enabled (as module)
- CONFIG_BTRFS_FS_POSIX_ACL: enabled
- "devicemapper":
- CONFIG_BLK_DEV_DM: enabled
- CONFIG_DM_THIN_PROVISIONING: enabled (as module)
- "overlay":
- CONFIG_OVERLAY_FS: enabled (as module)
- "zfs":
- /dev/zfs: missing
- zfs command: missing
- zpool command: missing
Limits:
- /proc/sys/kernel/keys/root_maxkeys: 1000000
Important line:
- CONFIG_RT_GROUP_SCHED: missing
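Recommended answer
There are two options for making RT scheduling calls in a container:
1. Add the SYS_NICE capability:
docker run --cap-add SYS_NICE ...
2. Use privileged mode with the --privileged flag:
docker run --privileged ...
Privileged mode is considered insecure, so option 1 is preferable: add only the capability you need.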
If you are running as root (the default in Docker containers), you may also have to enable real-time scheduling via sysctl:
sysctl -w kernel.sched_rt_runtime_us=-1
To make it permanent (update your image):
echo 'kernel.sched_rt_runtime_us=-1' >> /etc/sysctl.conf
https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities