Real time Linux: disable local timer interrupts


Question

TL;DR: Using a real-time Linux kernel with NO_HZ_FULL, I need to isolate a process in order to get deterministic results, but /proc/interrupts tells me there are still local timer interrupts (among others). How can I disable them?

Long version:

I want to make sure my program is not being interrupted, so I am trying a real-time Linux kernel. I'm using the real-time version of Arch Linux (linux-rt from the AUR), and I modified the kernel configuration to select the following options:

CONFIG_NO_HZ_FULL=y
CONFIG_NO_HZ_FULL_ALL=y
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_ALL=y

Then I rebooted my computer to boot this real-time kernel with the following options:

nmi_watchdog=0
rcu_nocbs=1
nohz_full=1
isolcpus=1

I also disabled the following options in the BIOS:

C-states
Intel SpeedStep
Turbo mode
VT-x
VT-d
Hyper-Threading

My CPU (i7-6700, 3.40 GHz) has 4 cores (8 logical CPUs with Hyper-Threading, which I disabled), and I can see CPU0, CPU1, CPU2, and CPU3 in the /proc/interrupts file.

CPU1 is isolated by the isolcpus kernel parameter, and I want to disable the local timer interrupts on this CPU. I thought a real-time kernel with CONFIG_NO_HZ_FULL and CPU isolation (isolcpus) would be enough to achieve that, and I tried to check by running these commands:

cat /proc/interrupts | grep LOC > ~/tmp/log/overload_cpu1
taskset -c 1 ./overload
cat /proc/interrupts | grep LOC >> ~/tmp/log/overload_cpu1

where the overload process is:

overload.c:
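
/* a pure CPU-bound busy loop with no syscalls; compile without
   optimization, otherwise the compiler may remove the empty loops */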
int main()
{
  for(int i=0;i<100;++i)
    for(int j=0;j<100000000;++j);
}

The file overload_cpu1 contains the result:

LOC:     234328        488      12091      11299   Local timer interrupts
LOC:     239072        651      12215      11323   Local timer interrupts

meaning 651 - 488 = 163 interrupts from the local timer, and not 0...

For comparison, I did the same experiment but changed the core where my overload process runs (I kept watching the interrupts on CPU1):

taskset -c 0 :   8 interrupts
taskset -c 1 : 163 interrupts
taskset -c 2 :   7 interrupts
taskset -c 3 :   8 interrupts

One of my questions is: why are there not 0 interrupts? Why is the number of interrupts bigger when my process runs on CPU1? (I mean, I thought NO_HZ_FULL would prevent interrupts if my process were alone: "The CONFIG_NO_HZ_FULL=y Kconfig option causes the kernel to avoid sending scheduling-clock interrupts to CPUs with a single runnable task" — https://www.kernel.org/doc/Documentation/timers/NO_HZ.txt.)

Maybe an explanation is that there are other processes running on CPU1. I checked using the ps command:

CLS CPUID RTPRIO PRI  NI CMD                           PID
TS      1      -  19   0 [cpuhp/1]                      18
FF      1     99 139   - [migration/1]                  20
TS      1      -  19   0 [rcuc/1]                       21
FF      1      1  41   - [ktimersoftd/1]                22
TS      1      -  19   0 [ksoftirqd/1]                  23
TS      1      -  19   0 [kworker/1:0]                  24
TS      1      -  39 -20 [kworker/1:0H]                 25
FF      1      1  41   - [posixcputmr/1]                28
TS      1      -  19   0 [kworker/1:1]                 247
TS      1      -  39 -20 [kworker/1:1H]                501

As you can see, there are threads on CPU1. Is it possible to disable these processes? I guess it must be, because otherwise NO_HZ_FULL would never work correctly, right?

Tasks of class TS don't disturb me, because they have no priority against SCHED_FIFO and I can set that policy for my program. The same goes for tasks of class FF with a priority of less than 99.
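
For reference, a minimal sketch of how the policy can be set from inside the program (the priority value 98 is an arbitrary choice, just below the 99 of migration/1; this call needs root or CAP_SYS_NICE):

#include <sched.h>
#include <stdio.h>

int main(void)
{
    /* request SCHED_FIFO with priority 98 (arbitrary, below migration's 99) */
    struct sched_param sp = { .sched_priority = 98 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler");
    /* ... CPU-bound workload as in overload.c ... */
    return 0;
}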

However, you can see migration/1, which is in SCHED_FIFO with priority 99. Maybe these processes cause interrupts when they run. That would explain the few interrupts when my process is on CPU0, CPU2, or CPU3 (8, 7, and 8 interrupts respectively), but it also means these processes do not run very often, and it then doesn't explain why there are so many interrupts when my process runs on CPU1 (163 interrupts).

I also ran the same experiment, but with my overload process under SCHED_FIFO, and I got:

taskset -c 0 : 1
taskset -c 1 : 4063
taskset -c 2 : 1
taskset -c 3 : 0

In this configuration there are more interrupts when my process uses the SCHED_FIFO policy on CPU1, and fewer on the other CPUs. Do you know why?

Solution

The thing is that a full-tickless CPU (a.k.a. adaptive-ticks, configured with nohz_full=) still receives some ticks.

Most notably, the scheduler requires a timer on an isolated, full-tickless CPU for updating some state every second or so.

This is a documented limitation (as of 2019):

Some process-handling operations still require the occasional scheduling-clock tick. These operations include calculating CPU load, maintaining sched average, computing CFS entity vruntime, computing avenrun, and carrying out load balancing. They are currently accommodated by scheduling-clock ticks every second or so. On-going work will eliminate the need even for these infrequent scheduling-clock ticks.

(source: Documentation/timers/NO_HZ.txt, cf. the LWN article (Nearly) full tickless operation in 3.10 from 2013 for some background)

A more accurate method to measure the local timer interrupts (LOC row in /proc/interrupts) is to use perf. For example:

$ perf stat -a -A -e irq_vectors:local_timer_entry ./my_binary

where my_binary has threads pinned to the isolated CPUs that utilize the CPU non-stop without invoking syscalls, for, say, 2 minutes.
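
A minimal sketch of such a test program, assuming CPU 1 is the isolated CPU (the iteration count is arbitrary; tune it so the loop runs for a couple of minutes, and compile with -pthread):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

int main(void)
{
    /* pin the calling thread to CPU 1 (the isolated CPU in this setup) */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(1, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    /* spin without any syscalls; volatile keeps the compiler
       from optimizing the empty loop away */
    for (volatile unsigned long i = 0; i < 100000000000UL; ++i)
        ;
    return 0;
}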

There are other sources of additional local timer ticks (when there is just 1 runnable task).

For example, the collection of VM stats: by default they are collected every second. Thus, I can decrease my LOC interrupts by setting a higher interval, e.g.:

# sysctl vm.stat_interval=60

Another source is the periodic check of whether the TSCs on the different CPUs drift; you can disable those checks with the following kernel option:

tsc=reliable

(Only apply this option if you really know that your TSCs don't drift.)

You might find other sources by recording traces with ftrace (while your test binary is running).

Since it came up in the comments: Yes, the SMI is fully transparent to the kernel. It doesn't show up as NMI. You can only detect an SMI indirectly.
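
One indirect way is to read Intel's SMI counter MSR (MSR_SMI_COUNT, address 0x34 on recent Intel CPUs) before and after a test run, e.g. through the msr driver. A sketch, assuming the msr module is loaded (modprobe msr) and the program runs as root:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* MSR_SMI_COUNT (0x34) counts SMIs since boot on recent Intel CPUs */
    int fd = open("/dev/cpu/0/msr", O_RDONLY);
    if (fd < 0) { perror("open /dev/cpu/0/msr"); return 1; }

    uint64_t smi_count;
    /* for the msr device, the read offset selects the MSR address */
    if (pread(fd, &smi_count, sizeof(smi_count), 0x34) != sizeof(smi_count)) {
        perror("pread");
        return 1;
    }
    printf("SMIs since boot: %llu\n", (unsigned long long)smi_count);
    return 0;
}

Alternatively, turbostat reports the same counter.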
