Intel's CLWB instruction invalidating cache lines


Question

I am trying to find a configuration or memory access pattern for Intel's clwb instruction that does not invalidate the cache line. I am testing on an Intel Xeon Gold 5218 processor with NVDIMMs, running Linux 5.4.0-3-amd64. I tried using Device-DAX mode and mapping the character device directly into the address space. I also tried adding the non-volatile memory as a new NUMA node and binding memory to it with numactl --membind. In both cases, when I use clwb on a cached address, the line is evicted. I observe the evictions with PAPI hardware counters, with the prefetchers disabled.

This is the simple loop I am testing. Both array and tmp are declared volatile, so the loads are actually executed.

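for (int i = 0; i < arr_size; i++) {
    tmp = array[i];                 // first load: brings the line into the cache
    _mm_clwb((void *)&array[i]);    // write the line back (cast drops volatile)
    _mm_mfence();                   // fence so clwb is ordered before the next load
    tmp = array[i];                 // second load: misses if clwb evicted the line
}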

Both reads result in cache misses.

Has anyone else tried to determine whether some configuration or memory access pattern leaves the cache line in the cache?

Recommended answer

clwb behaves like clflushopt on SKX (Skylake Server) and CSL (Cascade Lake): the line is written back and also evicted. The Xeon Gold 5218 in the question is a Cascade Lake part, which is why both reads miss. However, programs that use clwb on these processors will automatically benefit when run on a future processor that supports an optimized implementation of clwb.

clwb retains the cache line on ICL (Ice Lake).

Note that the cpuid leaf 0x7 information from InstLatx64 says that ICL doesn't support clwb, which is incorrect.
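Since third-party cpuid dumps can be wrong, it may be worth querying the feature bit directly on the machine under test. The following is a minimal sketch, assuming GCC or Clang on x86 with the compiler-provided <cpuid.h> header. The function names ebx_has_clwb and cpu_has_clwb are hypothetical helpers introduced for illustration. CLWB support is reported in CPUID leaf 0x7, subleaf 0, EBX bit 24.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* CLWB is reported in CPUID.(EAX=07H, ECX=0):EBX bit 24. */
#define CLWB_EBX_BIT 24u

/* Hypothetical helper: decode the CLWB bit from a leaf-7 EBX value. */
static int ebx_has_clwb(unsigned int ebx)
{
    return (int)((ebx >> CLWB_EBX_BIT) & 1u);
}

/* Hypothetical helper: returns 1 if the running CPU advertises CLWB,
   0 otherwise (including non-x86 builds, where CPUID does not exist). */
static int cpu_has_clwb(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* __get_cpuid_count returns 0 when the requested leaf is unsupported. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return ebx_has_clwb(ebx);
#else
    return 0;
#endif
}
```

Calling cpu_has_clwb() only reports whether the instruction is available. It says nothing about whether a given microarchitecture keeps the line in the cache afterwards, which still has to be measured as in the question.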

clwb is also supported on Zen 2, but I don't know how it behaves on that microarchitecture.
