Does Linux use x86 CPU's PCID feature for TLB? If not, why?


Question

I wrote a kernel module to check CR4.PCIDE, and it is not set. Why doesn't Linux use this feature to reduce the performance cost of TLB invalidation and cache pollution?

Recommended answer

Update: This changed around the 4.15 timeframe, in response to the Meltdown and Spectre attacks of late 2017 and early 2018. See the other answer for details.

Note: I'm not a Linux developer.

Intel's "Process Context Identifiers" (PCIDs) are limited to 4096 IDs, because the PCID is a 12-bit field. This means that once there are more than 4096 processes you need to manage the IDs (e.g. with a "least recently used" scheme, so that when a process that currently has no ID needs to run, an ID is taken from some other process and reused).
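The "least recently used" idea can be sketched as a toy allocator. This is only an illustration under simplifying assumptions, not how any real kernel does it: the class `PcidAllocator`, its `acquire` method and the `needs_flush` result are hypothetical names, and the model ignores multiple CPUs entirely.

```python
from collections import OrderedDict

MAX_PCIDS = 4096  # a 12-bit PCID gives 4096 distinct IDs


class PcidAllocator:
    """Toy LRU allocator (hypothetical): recycles the least recently used PCID."""

    def __init__(self, capacity=MAX_PCIDS):
        self.by_process = OrderedDict()  # process -> PCID, least recently used first
        self.free = list(range(capacity - 1, -1, -1))  # pop() yields 0, 1, 2, ...

    def acquire(self, process):
        """Return (pcid, needs_flush) for the process that is about to run."""
        if process in self.by_process:
            # Already has an ID: mark it most recently used, keep its TLB entries.
            self.by_process.move_to_end(process)
            return self.by_process[process], False
        if self.free:
            # An ID that was never handed out has no stale entries to flush.
            pcid, needs_flush = self.free.pop(), False
        else:
            # No free ID: steal the one used least recently. Entries tagged with
            # it belong to the victim, so they must be flushed before reuse.
            _victim, pcid = self.by_process.popitem(last=False)
            needs_flush = True
        self.by_process[process] = pcid
        return pcid, needs_flush
```

With a capacity of 2, running processes A, B, A, C in that order makes C steal B's ID, since B is the least recently used, and the allocator reports that a flush is needed before the ID is reused.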

The other thing that comes into it is "TLB shootdown" on multi-CPU systems. Shootdowns can be somewhat expensive, so kernels use tricks to avoid them. For example, if a process has only one thread then it can only be running on one CPU, and you know there's no need to send an IPI to the other CPUs (interrupting them and asking them to do the "TLB shootdown"). Once you start using PCIDs you can't be sure that other CPUs don't still hold TLB entries for that process, so you can't use these tricks to avoid "TLB shootdown". It also means that (in theory, for badly implemented PCID support) the performance you gain from PCID may be less than the performance you lose to the extra TLB shootdowns and the ID management overhead, resulting in a net loss.
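The shootdown argument can be made concrete with a small model of which CPUs might still hold a process's TLB entries. This is a sketch under stated assumptions: `TlbTracker` and its methods are hypothetical, global pages are ignored, and with PCID enabled the model assumes entries are never flushed on a context switch.

```python
class TlbTracker:
    """Toy model (hypothetical) of which CPUs need a shootdown IPI for a process."""

    def __init__(self, use_pcid):
        self.use_pcid = use_pcid
        self.running_on = {}  # cpu -> process currently running there
        self.may_cache = {}   # process -> CPUs that may still hold its TLB entries

    def switch_to(self, cpu, process):
        prev = self.running_on.get(cpu)
        if prev is not None and prev != process and not self.use_pcid:
            # Without PCID, the CR3 load flushes the previous process's entries,
            # so this CPU can no longer hold anything stale for it.
            self.may_cache[prev].discard(cpu)
        self.running_on[cpu] = process
        self.may_cache.setdefault(process, set()).add(cpu)

    def shootdown_targets(self, process, initiating_cpu):
        """Other CPUs that must be interrupted when the process's mappings change."""
        return self.may_cache.get(process, set()) - {initiating_cpu}
```

Take a single-threaded process A that runs on CPU 0, is replaced there by B, and later runs on CPU 1. Without PCID the model reports no remote targets for A; with PCID it reports CPU 0, because entries tagged with A's ID may have survived the switch to B.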

Mostly what I'm saying is that adding support for PCID is a little complicated (it's not as if you can just set a flag in CR4 and forget about it). You'd have to do some research (experiments, prototypes, benchmarking) to determine the most effective way of implementing it. For a large, complex, old kernel like Linux it'd be even more complicated, as you'd have to be careful not to break something else by accident. The other thing is that this feature is relatively new (it has only existed for a few years, if I remember correctly) and isn't supported by a lot of CPUs (e.g. anything a little older, and anything from AMD).
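One reason a CR4 flag alone is not enough: once CR4.PCIDE (bit 17) is set, every value written to CR3 also carries a PCID in its low 12 bits, and bit 63 of that value tells the CPU whether to keep the TLB entries tagged with that PCID. The sketch below only does the bit arithmetic; the helper names `pcide_enabled` and `make_cr3` are hypothetical, and real code would write the result to CR3 from ring 0.

```python
CR4_PCIDE = 1 << 17    # CR4.PCIDE: enables process-context identifiers
CR3_PCID_MASK = 0xFFF  # low 12 bits of CR3 hold the PCID when PCIDE is set
CR3_NOFLUSH = 1 << 63  # bit 63 of the written value: keep this PCID's TLB entries


def pcide_enabled(cr4):
    """True if the given CR4 value has the PCIDE bit set."""
    return bool(cr4 & CR4_PCIDE)


def make_cr3(pgd_phys, pcid, noflush):
    """Compose the value to write to CR3 for a page-table root and a PCID."""
    if pgd_phys & CR3_PCID_MASK:
        raise ValueError("page-table root must be 4 KiB aligned")
    if not 0 <= pcid <= CR3_PCID_MASK:
        raise ValueError("PCID must fit in 12 bits")
    value = pgd_phys | pcid
    if noflush:
        value |= CR3_NOFLUSH
    return value
```

So every context switch has to decide both which PCID to load and whether its cached translations are still valid, which is the bookkeeping the answer refers to.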

Basically, I'd assume that it comes down to "time vs. benefits" (or, not enough time for a small performance improvement on a limited number of CPUs).
