Does it make any sense to use the LFENCE instruction on x86/x86_64 processors?

Question

On the internet I often find claims that LFENCE makes no sense on x86 processors, i.e. it does nothing, so instead of MFENCE we can painlessly use SFENCE, because MFENCE = SFENCE + LFENCE = SFENCE + NOP = SFENCE.

But if LFENCE makes no sense, then why do we have four approaches to implementing sequential consistency on x86/x86_64:

  1. LOAD (without fence) and STORE + MFENCE
  2. LOAD (without fence) and LOCK XCHG
  3. MFENCE + LOAD and STORE (without fence)
  4. LOCK XADD (0) and STORE (without fence)

Taken from here: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

The same mappings appear in Herb Sutter's presentation, at the bottom of page 34: https://skydrive.live.com/view.aspx?resid=4E86B0CF20EF15AD!24884&app=WordPdf&wdo=2&authkey=!AMtj_EflYn2507c
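To make concrete what these four mappings have to guarantee, here is a minimal C++ sketch of the classic store-buffering litmus test. It is only an illustration: the names Outcome, run_store_buffering_once and count_forbidden_outcomes are made up for this example. With memory_order_seq_cst on both the stores and the loads, the C++ memory model forbids the outcome in which both threads read 0. On x86 compilers typically achieve this by emitting a fenced or locked store (as in approaches 1 and 2) and a plain load.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Result of one run: what each thread read from the other thread's flag.
struct Outcome {
    int r1;
    int r2;
};

// Store-buffering litmus test: each thread stores 1 to its own flag and then
// loads the other flag. Sequential consistency forbids r1 == 0 && r2 == 0.
Outcome run_store_buffering_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);   // STORE that must not be passed by the load below
        r1 = y.load(std::memory_order_seq_cst);  // LOAD
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();

    // Both threads have finished, so both results must have been written.
    assert(r1 != -1 && r2 != -1);
    return Outcome{r1, r2};
}

// Counts how many of `iterations` runs produced the forbidden (0, 0) outcome.
int count_forbidden_outcomes(int iterations) {
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        const Outcome o = run_store_buffering_once();
        if (o.r1 == 0 && o.r2 == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

Calling count_forbidden_outcomes with any iteration count should return 0. If the stores and loads were relaxed instead, the (0, 0) outcome would be allowed, because StoreLoad is the one reordering x86 permits for ordinary memory accesses.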

If LFENCE did nothing, then approach (3) would reduce to SFENCE + LOAD and STORE (without fence), but there is no point in executing SFENCE before a LOAD. That is, if LFENCE does nothing, approach (3) does not make sense.

Does the LFENCE instruction make any sense on x86/x86_64 processors?

ANSWER:

1. LFENCE is required in the cases described in the accepted answer below.

2. Approach (3) should be viewed not in isolation, but in combination with the preceding instructions. For example, approach (3):

MFENCE
MOV reg, [addr1]  // LOAD-1
MOV [addr2], reg  // STORE-1

MFENCE
MOV reg, [addr1]  // LOAD-2
MOV [addr2], reg  // STORE-2

We can rewrite the code of approach (3) as follows:

SFENCE
MOV reg, [addr1]  // LOAD-1
MOV [addr2], reg  // STORE-1

SFENCE
MOV reg, [addr1]  // LOAD-2
MOV [addr2], reg  // STORE-2

Here SFENCE does make sense: it prevents reordering of STORE-1 and LOAD-2. To do this, SFENCE flushes the store buffer after the STORE-1 instruction.

Recommended answer

Bottom line (TL;DR): LFENCE alone does indeed seem useless for memory ordering; however, that does not make SFENCE a substitute for MFENCE. The "arithmetic" logic in the question is not applicable.

Here is an excerpt from Intel's Software Developer's Manual, volume 3, section 8.2.2 (edition 325384-052US, September 2014), the same one that I used in another answer:

  • Reads are not reordered with other reads.
  • Writes are not reordered with older reads.
  • Writes to memory are not reordered with other writes, with the following exceptions:
    • writes executed with the CLFLUSH instruction;
    • streaming stores (writes) executed with the non-temporal move instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD); and
    • string operations (see Section 8.2.4.1).

From this it follows that:

  • MFENCE is a full memory fence for all operations on all memory types, whether non-temporal or not.
  • SFENCE only prevents reordering of writes (in other terminology, it is a StoreStore barrier), and is only useful together with non-temporal stores and the other instructions listed as exceptions.
  • LFENCE prevents reordering of reads with subsequent reads and writes (i.e. it combines LoadLoad and LoadStore barriers). However, the first two bullets of the excerpt say that LoadLoad and LoadStore barriers are always in place, with no exceptions. Therefore LFENCE alone is useless for memory ordering.
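The one case where SFENCE matters, non-temporal stores, can be sketched in C++ with compiler intrinsics. This is a minimal illustration under stated assumptions, not a definitive implementation: publish_buffer and buffer_matches are hypothetical names, the SSE2 intrinsics _mm_stream_si32 (MOVNTI) and _mm_sfence (SFENCE) are used only when the compiler defines __SSE2__, and ordinary stores are used otherwise.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#if defined(__SSE2__)
#include <immintrin.h>
#endif

// Fills dst[0..n) with `value` and then sets `ready`.
// With SSE2, the fill uses streaming (non-temporal) stores, which are one of
// the exceptions to "writes are not reordered with other writes". The SFENCE
// orders them before the later store to `ready`.
void publish_buffer(int* dst, std::size_t n, int value, std::atomic<bool>& ready) {
    assert(dst != nullptr || n == 0);
#if defined(__SSE2__)
    for (std::size_t i = 0; i < n; ++i) {
        _mm_stream_si32(dst + i, value);  // MOVNTI: weakly ordered store
    }
    _mm_sfence();  // SFENCE: make the streaming stores visible before the flag
#else
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = value;  // ordinary stores; the release store below orders them
    }
#endif
    ready.store(true, std::memory_order_release);
}

// Returns true only if `ready` is set and every element equals `expected`.
bool buffer_matches(const int* src, std::size_t n, int expected,
                    const std::atomic<bool>& ready) {
    if (!ready.load(std::memory_order_acquire)) {
        return false;
    }
    for (std::size_t i = 0; i < n; ++i) {
        if (src[i] != expected) {
            return false;
        }
    }
    return true;
}
```

A producer thread would call publish_buffer and a consumer would poll buffer_matches. The explicit _mm_sfence is needed because a release store to an atomic is usually compiled to a plain MOV on x86, which does not by itself order earlier streaming stores.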

To support the last claim, I looked at every place where LFENCE is mentioned in all 3 volumes of Intel's manual, and found none that says LFENCE is required for memory consistency. Even MOVNTDQA, the only non-temporal load instruction so far, mentions MFENCE but not LFENCE.

Update: see the answers to "Why is (or isn't?) SFENCE + LFENCE equivalent to MFENCE?" for correct answers to the guesswork below.

Whether MFENCE is equivalent to a "sum" of the other two fences is a tricky question. At first glance, among the three fence instructions only MFENCE provides a StoreLoad barrier, i.e. prevents reordering of reads with earlier writes. However, the correct answer requires knowing more than the above rules; namely, it is important that all fence instructions are ordered with respect to each other. This makes the SFENCE LFENCE sequence more powerful than a mere union of the individual effects: this sequence also prevents StoreLoad reordering (because loads cannot pass LFENCE, which cannot pass SFENCE, which cannot pass stores), and thus constitutes a full memory fence (but also see the note (*) below). Note, however, that order matters here: the LFENCE SFENCE sequence does not have the same synergy effect.

However, while one can say that MFENCE ~ SFENCE LFENCE and LFENCE ~ NOP, that does not mean MFENCE ~ SFENCE. I deliberately use equivalence (~) and not equality (=) to stress that arithmetic rules do not apply here. The mutual effect of SFENCE followed by LFENCE makes the difference; even though loads are not reordered with each other, LFENCE is required to prevent reordering of loads with SFENCE.

(*) It still might be correct to say that MFENCE is stronger than the combination of the other two fences. In particular, a note on the CLFLUSH instruction in volume 2 of Intel's manual says that "CLFLUSH is only ordered by the MFENCE instruction. It is not guaranteed to be ordered by any other fencing or serializing instructions or by another CLFLUSH instruction."

(Update: clflush is now defined as strongly ordered (like a normal store, so you only need mfence if you want to block later loads), whereas clflushopt is weakly ordered but can be fenced by sfence.)
