Does clflush also remove TLB entries?


Question



Does clflush [1] also flush associated TLB entries? I would assume not, since clflush operates at cache-line granularity while TLB entries exist at the (much larger) page granularity, but I am prepared to be surprised.


[1] ... or clflushopt, although one would reasonably assume their behaviors are the same.

Answer

I think it's safe to assume no; baking invlpg into clflush sounds like an insane design decision that I don't think anyone would make. You often want to invalidate multiple lines in a page. There's also no apparent benefit; flushing the TLB as well doesn't make it any easier to implement data-cache flushing.

Even just dropping the final TLB entry (without necessarily invalidating any page-directory caching) would be weaker than invlpg but still not make sense.

All modern x86s use caches with physical indexing/tagging, not virtual. (VIPT L1d caches are really PIPT with free translation of the index, because the index is taken from address bits that are part of the offset within a page.) And even if caches were virtual, invalidating TLB entries would require invalidating virtual caches, but not the other way around.
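To make the "free translation of the index" point concrete, here is a small sketch of the arithmetic in C. The function names are made up for illustration, and the geometry in the usage note (32 KiB, 8-way, 64-byte lines, 4 KiB pages) is an assumed but common L1d configuration, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of sets = cache size / (associativity * line size). */
static uint32_t cache_sets(uint32_t size_bytes, uint32_t ways, uint32_t line_bytes)
{
    assert(ways != 0 && line_bytes != 0);
    return size_bytes / (ways * line_bytes);
}

/* floor(log2(x)); exact when x is a power of two. */
static unsigned log2_u32(uint32_t x)
{
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/*
 * True when the line-offset bits plus the set-index bits all fall inside
 * the page offset. Those bits are identical in the virtual and physical
 * address, so the set can be selected before the TLB lookup finishes and
 * the cache behaves like a physically indexed one.
 */
static bool index_fits_in_page_offset(uint32_t size_bytes, uint32_t ways,
                                      uint32_t line_bytes, uint32_t page_bytes)
{
    unsigned offset_bits = log2_u32(line_bytes);
    unsigned index_bits = log2_u32(cache_sets(size_bytes, ways, line_bytes));
    return offset_bits + index_bits <= log2_u32(page_bytes);
}
```

With the assumed geometry, cache_sets(32 * 1024, 8, 64) is 64, so there are 6 line-offset bits and 6 index bits: exactly the 12 bits of a 4 KiB page offset. A hypothetical 64 KiB, 8-way cache would need a 13th bit that does depend on translation, so the check returns false for it.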


According to IACA, clflush is only 2 uops on HSW-SKL, and 4 uops (including micro-fusion) on NHM-IVB. So it's not even micro-coded on Intel.

IACA doesn't model invlpg, but I assume it's more uops. (And it's privileged so it's not totally trivial to test.) It's remotely possible those extra uops on pre-HSW were for TLB invalidation.

I don't have any info on AMD.


The fact that invlpg is privileged is another reason to expect clflush not to be a superset of it. clflush is unprivileged. Presumably it's only for performance reasons that invlpg is restricted to ring 0 only.

But invlpg won't page-fault, so user-space could use it to invalidate kernel TLB entries, delaying real-time processes and interrupt handlers. (wbinvd is privileged for similar reasons: it's very slow and I think not interruptible.) clflush does fault on illegal addresses so it wouldn't open up that denial-of-service vulnerability. You could clflush the shared VDSO page, though.

Unless there's some reason why a CPU would want to expose invlpg to user-space (by baking it into clflush), I really don't see why any vendor would do it.


With non-volatile DIMMs in the future of computing, it's even less likely that any future CPUs will make it super-slow to loop over a range of memory doing clflush. You'd expect most software using memory mapped NV storage to be using clflushopt, but I'd expect CPU vendors to make clflush as fast as possible, too.
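For reference, a loop over a range of memory doing clflush might look like the sketch below. It is only an illustration: flush_range and CACHE_LINE are hypothetical names, the 64-byte line size is an assumption (real code would query it via CPUID), and the intrinsics are compiled in only when SSE2 is available. Swapping in clflushopt would need a different intrinsic and compiler flag, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __SSE2__
#include <emmintrin.h> /* _mm_clflush, _mm_mfence */
#endif

#define CACHE_LINE 64u /* assumed line size, not queried from the CPU */

/*
 * Flush every cache line that overlaps [addr, addr + len).
 * Returns the number of lines visited. Dirty lines are written back
 * before being evicted, so the data itself is preserved.
 */
static size_t flush_range(const void *addr, size_t len)
{
    if (len == 0)
        return 0;

    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;

    for (; p < end; p += CACHE_LINE) {
#ifdef __SSE2__
        _mm_clflush((const void *)p); /* one instruction per line */
#endif
        lines++;
    }
#ifdef __SSE2__
    _mm_mfence(); /* order the flushes against later loads and stores */
#endif
    return lines;
}
```

A 4 KiB page takes 64 iterations at this line size, which is why any per-instruction TLB invalidation would be pure overhead for this kind of loop.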
