clwb+sfence, can we remove sfence if writes are cache-line aligned?


Question

As per the information on clwb ordering (link):

"CLWB instruction is ordered only by store-fencing operations. For example, software can use an SFENCE, MFENCE, XCHG, or LOCK-prefixed instructions to ensure that previous stores are included in the write-back. CLWB instruction need not be ordered by another CLWB or CLFLUSHOPT instruction. CLWB is implicitly ordered with older stores executed by the logical processor to the same address."

If the set of operations on Intel x86-64 is as follows, can I remove the sfence and still ensure correctness, given that write(A) and write(B) are cache-line aligned?

I am asking because on Intel, write(A) and write(B) are ordered (TSO), and write(A)->clwb(A) and write(B)->clwb(B) are ordered as per the description of clwb quoted above.

write(A)
clwb(A)
sfence()
write(B)
clwb(B)

I make the following assumptions:

  1. The compiler does not reorder these operations.
  2. The clwb() instruction writes the dirty line back all the way to the persistent domain, so the write(A)->clwb(A) pair ensures that the modified value of A is in the persistent domain.

Can removing the sfence break correctness? If yes, in what scenarios? Thanks.

Recommended answer

For normal stores to WB memory that are both within the same cache line: yes, persistence order matches x86-TSO global-observability order; see "Is clflush or clflushopt atomic when system crash?". Otherwise that's not guaranteed.

It seems you mean A is fully contained within one cache line, and B within a separate one.

Without SFENCE, after a crash it would be possible to see the effect of B but not A. clwb isn't ordered, so the later one could make its store persistent first. That's what the manual is pointing out with clwb's lack of ordering wrt. normal stores.

"So according to TSO, write(B) happened means write(A) happened (maybe it is in the store buffer)."

No, x86-TSO ordering is about the order of commit from the store buffer to L1d, the point of global observability. That's of course totally separate from eventual write-back (via eviction or clwb) to DRAM. Store uops can execute (write their address+data to the store buffer) in any order, but can't commit until after retirement (i.e. when they're non-speculative). Additionally, that commit is restricted to happen in program order, i.e. the order in which store-buffer entries were allocated during issue/rename/allocate.

"Meaning write(A)->write(B) are ordered and write(B)->clwb(B) are ordered, so how can clwb(B) bypass write(B) [thus violating the ordering constraint in the manual] and happen before clwb(A), thus making the effect of clwb(B) visible after a crash but not clwb(A)?"

No, the "implicitly ordered with older stores ... to the same address" rule only guarantees that store + clwb to the same address will write back a version of the line that includes that store data. Otherwise it could write back a copy of the line while the latest store was still in the store buffer or not even executed. It doesn't mean that the whole write-back has to finish before any later stores!

The order of operations that produces B but not A visible after a crash is the following:

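  • Stores A and B execute in some order.
  • A and B commit to L1d cache, becoming globally visible to other cores, once they have MESI exclusive ownership of their respective lines.
  • The clwb instructions execute at some point, requesting write-back of the cache lines to DRAM at some point after the stores commit.
  • Write-back of line A starts at some point after it commits to L1d, and the same goes for line B. They can start in either order because clwb isn't guaranteed to be ordered wrt. other clwb operations on other lines, although in practice they probably start in program order.
  • clwb-B finishes becoming persistent.
  • The machine loses power before the in-flight clwb-A makes it to the persistence domain. You didn't ask for the clwb operations to be ordered wrt. each other, so this is allowed.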
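
The timeline above can be replayed as a toy model. This is only a sketch of the ordering rules as described in this answer, not a simulation of real hardware; the event names (`commit`, `writeback_done`, `crash`) and the function `persisted_at_crash` are hypothetical labels made up for illustration.

```python
# Toy replay of the crash timeline; models the rules, not real hardware.
# All event names here are hypothetical labels for illustration.
TIMELINE = [
    ("commit", "A"),          # store A leaves the store buffer and reaches L1d
    ("commit", "B"),          # store B commits after A (TSO program order)
    ("writeback_done", "B"),  # clwb B finishes: line B is in the persistent domain
    ("crash", None),          # power is lost here
    ("writeback_done", "A"),  # never happens: clwb A was still in flight
]


def persisted_at_crash(timeline):
    """Return the set of lines whose write-back completed before the crash."""
    committed, persisted = set(), set()
    for kind, line in timeline:
        if kind == "crash":
            break
        if kind == "commit":
            committed.add(line)
        elif kind == "writeback_done":
            # A write-back can only carry data that has already committed.
            assert line in committed
            persisted.add(line)
    return persisted


print(persisted_at_crash(TIMELINE))  # {'B'}
```

Only B survives the crash, even though A committed to L1d first, because nothing in the timeline forces the write-back of A to complete before the write-back of B.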

In terms of asm instruction reordering, the following reordering is allowed:

 store A
 store B
 clwb  B
 clwb  A     ; not ordered wrt. store B or clwb B

Of course order of execution vs. reaching the end of the store buffer vs. actual persistent commit are all separate things at least in theory, but if you want to simplify it to all steps of an instruction happening before any effects of another instruction, this reordering is still compatible with all the rules.

I think the key thing you're missing is that clwb A is a separate operation from store A; it doesn't stay stuck to it. That clwb is allowed to "happen" after other later stores. Store B is to a different address, so it doesn't order clwb A.

An SFENCE can prevent this.
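
As a rough check of that claim, the following sketch enumerates every event order the rules permit and collects the persistent states a crash could leave behind. It is a simplified model under stated assumptions, not a description of the hardware: the event names `cA`, `cB` (store commits) and `pA`, `pB` (line reaches the persistence domain) are hypothetical, and the effect of sfence is assumed to be "A is persistent before the later store to B commits".

```python
from itertools import permutations


def crash_states(fence):
    """Enumerate (A persisted, B persisted) pairs reachable at a crash.

    Events: cA/cB = store commits to L1d, pA/pB = line becomes persistent.
    """
    events = ("cA", "cB", "pA", "pB")
    # Rules taken from the discussion: stores commit in program order (TSO),
    # and a write-back cannot complete before its store has committed.
    before = [("cA", "cB"), ("cA", "pA"), ("cB", "pB")]
    if fence:
        # Assumed effect of an sfence placed after clwb A: line A is
        # persistent before the later store to B can commit.
        before.append(("pA", "cB"))
    states = set()
    for order in permutations(events):
        pos = {e: i for i, e in enumerate(order)}
        if any(pos[x] > pos[y] for x, y in before):
            continue  # this order violates one of the rules
        # A crash may happen after any prefix of a legal order.
        for cut in range(len(order) + 1):
            done = order[:cut]
            states.add(("pA" in done, "pB" in done))
    return states


print((False, True) in crash_states(fence=False))  # True: B without A
print((False, True) in crash_states(fence=True))   # False
```

In this model the state "B persistent, A not" is reachable only when the fence constraint is absent, which matches the conclusion above.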
