Flush iCache in x86

Question

Is there any way I can flush the iCache on the x86 architecture? Something like WBINVD, which will invalidate and flush all the cache lines in the data cache.

Recommended answer

According to the docs, wbinvd flushes and invalidates all caches, not just data and unified caches. (I'm not sure if that includes TLBs if you ran it with paging enabled.)

What are you trying to test? L1i miss / L2 hit for code-fetch? I don't think it's possible to purposely flush just the I-cache without also flushing all levels of cache.

You could create conflict misses for a specific line by executing code at 8 addresses that alias it, assuming an 8-way 32 KiB L1i cache. (With 64-byte lines, such a cache has 64 sets, so addresses a multiple of 4096 bytes apart map to the same set.) But cache replacement is usually pseudo-LRU, not true LRU, so you might want to jump through a set of more than 8 aliasing lines a couple of times to make sure.
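The aliasing arithmetic can be sketched as follows. This is only an illustration under stated assumptions: the 64-byte line size, the names `l1i_set_index` and `aliasing_addresses`, and the constants are not from the answer, and real hardware should be checked (for example via CPUID) before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed L1i geometry: 32 KiB, 8-way, 64-byte lines (hypothetical constants). */
enum {
    L1I_SIZE     = 32 * 1024,
    L1I_WAYS     = 8,
    L1I_LINE     = 64,
    L1I_SETS     = L1I_SIZE / (L1I_WAYS * L1I_LINE), /* 64 sets */
    ALIAS_STRIDE = L1I_SETS * L1I_LINE               /* 4096 bytes */
};

/* Set index that an address maps to under the assumed geometry. */
unsigned l1i_set_index(uintptr_t addr)
{
    return (unsigned)((addr / L1I_LINE) % L1I_SETS);
}

/* Fill out[0..n-1] with addresses that fall in the same set as target.
 * Executing code at more than L1I_WAYS of them should evict target's line. */
size_t aliasing_addresses(uintptr_t target, uintptr_t *out, size_t n)
{
    assert(out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = target + (uintptr_t)(i + 1) * ALIAS_STRIDE;
    return n;
}
```

Because the stride equals the 4 KiB page size under these assumptions, the set index comes entirely from the page offset, so the same computation holds for virtual and physical addresses. Asking for, say, 12 addresses rather than 8 follows the suggestion above to overshoot the associativity because of pseudo-LRU replacement.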

clflush / clflushopt should do the trick for a specific cache line. They're required to flush the line from all levels of cache in all cores.

I assume they would also evict decoded uops from the (virtually addressed) uop cache.

The CLFLUSH instruction can be used at all privilege levels and is subject to all permission checking and faults associated with a byte load (and in addition, a CLFLUSH instruction is allowed to flush a linear address in an execute-only segment). Like a load, the CLFLUSH instruction sets the A bit but not the D bit in the page tables.
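As a rough sketch of using clflush on code, the snippet below flushes the cache line containing a function's entry point from user space. It assumes an x86 target with SSE2 (for `_mm_clflush` and `_mm_mfence` from `<emmintrin.h>`), a 64-byte line size, and a compiler that allows converting a function pointer to an integer; `flush_range` and `victim` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence (SSE2) */
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size; not guaranteed on every CPU */

/* Flush every cache line overlapping [start, start + len).
 * Returns the number of lines flushed. */
size_t flush_range(const void *start, size_t len)
{
    assert(len > 0);
    uintptr_t p   = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)start + len;
    size_t lines = 0;

    for (; p < end; p += CACHE_LINE) {
        _mm_clflush((const void *)p);  /* needs only read access, like a byte load */
        lines++;
    }
    _mm_mfence();  /* order the flushes before later memory accesses */
    return lines;
}

/* Hypothetical function whose first code line gets flushed. */
int victim(int x)
{
    return x * 2 + 1;
}
```

A call such as `flush_range((const void *)(uintptr_t)victim, 1)` flushes the single line holding the first byte of `victim`; the next call to `victim` then has to fetch that line from memory again, and it still returns the correct result.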


But if you want this for correctness after JIT-compiling something, merely jumping to or calling the newly written instructions is sufficient to avoid stale instruction fetch.

(In fact, current x86 implementations snoop stores to any code address in the pipeline, so you'll never see stale instruction fetch, even when you have the same physical page mapped at different virtual addresses and write through one mapping while executing from the other. See "Observing stale instruction fetching on x86 with self-modifying code".)

You only need to worry about your compiler optimizing away "dead stores" to a buffer you cast to a function pointer. In GNU C / C++, use __builtin___clear_cache on the range of bytes you wrote. It compiles to zero instructions on x86 (unlike on ARM or other ISAs with non-coherent instruction caches), but it is still needed to keep the compiler from optimizing away the stores of instruction bytes. See "How does __builtin___clear_cache work?".
