Do locked instructions provide a barrier between weakly-ordered accesses?

Question

On x86, lock-prefixed instructions such as lock cmpxchg provide barrier semantics in addition to their atomic operation: for normal memory access on write-back memory regions, reads and writes are not re-ordered across lock-prefixed instructions, per section 8.2.2 of Volume 3 of the Intel SDM:

Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions.

This section applies only to write-back memory types. In the same list, you find an exception where it notes that weakly ordered stores are not ordered:

  • Reads are not reordered with other reads.
  • Writes are not reordered with older reads.
  • Writes to memory are not reordered with other writes, with the following exceptions:
      • streaming stores (writes) executed with the non-temporal move instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD); and
      • string operations (see Section 8.2.4.1).

Note that there is no exception made for non-temporal instructions in any other items in the list, e.g., in the item referring to lock-prefixed instructions.

In various other sections of the guide, it is mentioned that the mfence and/or sfence instructions can be used to order memory when weakly ordered (non-temporal) instructions are used. These sections generally don't mention lock-prefixed instructions as an alternative.
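For concreteness, here is a minimal sketch in C of the documented pattern: non-temporal stores followed by sfence before a flag is published. The names `payload`, `ready`, and `publish_with_sfence` are made up for illustration, and the sketch assumes an x86-64 target with a GCC- or Clang-style compiler.

```c
#include <immintrin.h>   /* _mm_stream_si32 (MOVNTI), _mm_sfence (SFENCE) */
#include <stdatomic.h>

/* Hypothetical shared data, for illustration only. */
static int payload[16];
static atomic_int ready;

/* Fill the payload with weakly ordered (non-temporal) stores, then
 * publish a flag. The SFENCE sits between the streaming stores and
 * the ordinary store to the flag. */
void publish_with_sfence(int value) {
    for (int i = 0; i < 16; i++)
        _mm_stream_si32(&payload[i], value + i);   /* MOVNTI */
    _mm_sfence();                                  /* order the NT stores */
    atomic_store_explicit(&ready, 1, memory_order_release);
}
```

The question is whether the sfence here could be swapped for a lock-prefixed instruction and still keep the streaming stores ordered before the flag store.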

All that leaves me uncertain: do lock-prefixed instructions provide the same full barrier that mfence provides between weakly ordered (non-temporal) instructions on WB memory? The same question applies again but to any type of access on WC memory.

Recommended answer

On all 64-bit AMD processors, MFENCE is a fully serializing instruction and lock-prefixed instructions are not. However, both serialize all memory accesses according to Section 7.4.2 of Volume 2 of the AMD manual:

All previous loads and stores complete to memory or I/O space before a memory access for an I/O, locked or serializing instruction is issued.

All loads and stores associated with the I/O and locked instructions complete to memory (no buffered stores) before a load or store from a subsequent instruction is issued.

There are no exceptions or errata related to the serialization properties of these instructions.

It's clear from the Intel manual and documents that both serialize all stores, with no exceptions or related errata. MFENCE also serializes all loads, with one erratum documented for most processors based on the Skylake, Kaby Lake, and Coffee Lake microarchitectures, which states that MOVNTDQA from WC memory may pass earlier MFENCE instructions. In addition, many processors based on the Nehalem, Sandy Bridge, Ivy Bridge, Haswell, Broadwell, Skylake, Kaby Lake, Coffee Lake, and Silvermont microarchitectures have an erratum that says that MOVNTDQA from WC memory may pass earlier locked instructions. Processors based on the Core, Westmere, Sunny Cove, and Goldmont microarchitectures don't have this erratum.
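If that reading of the manuals holds, a locked read-modify-write on any location should also keep earlier non-temporal stores ordered before later stores. The following C sketch shows what that would look like. It assumes x86-64 and GCC/Clang extended inline assembly; `buffer`, `lock_target`, `ready_flag`, `locked_barrier`, and `publish_with_locked_op` are hypothetical names. Note that the errata mentioned above concern MOVNTDQA loads from WC memory, not the streaming stores used here.

```c
#include <immintrin.h>   /* _mm_stream_si32 (MOVNTI) */
#include <stdatomic.h>

/* Hypothetical shared data, for illustration only. */
static int buffer[16];
static int lock_target;          /* dummy location for the locked op */
static atomic_int ready_flag;

/* A locked no-op: OR zero into a dummy variable. The value is left
 * unchanged; only the lock prefix matters. The "memory" clobber also
 * stops the compiler from moving memory accesses across it. */
static inline void locked_barrier(void) {
    __asm__ __volatile__("lock; orl $0, %0"
                         : "+m"(lock_target)
                         :
                         : "memory", "cc");
}

/* Same shape as an NT-store-then-fence publish, but with a locked
 * instruction standing in for the fence. */
void publish_with_locked_op(int value) {
    for (int i = 0; i < 16; i++)
        _mm_stream_si32(&buffer[i], value + i);    /* MOVNTI */
    locked_barrier();
    atomic_store_explicit(&ready_flag, 1, memory_order_release);
}
```

Using a dedicated dummy variable is a choice made for readability; any writable location would do, since the operation leaves its value unchanged.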

The quote from Necrolis's answer says that the lock prefix may not serialize load operations that reference weakly ordered memory types on Pentium 4 processors. My understanding is that this looks like a bug in the Pentium 4 processors and that it doesn't apply to any other processors, although it's worth noting that it's not documented in the spec update documents of the Pentium 4 processors.

@PeterCordes's experiments show that, on Skylake, locked instructions don't seem to block ALU instructions from being executed out of order, while mfence does serialize ALU instructions (potentially behaving identically to lfence plus a store-buffer flush, like a locked instruction). However, I think this is an implementation detail.
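To look for this kind of effect yourself, one possible harness pairs a dependent ALU chain with each kind of barrier and times the two loops from the caller. This is only a sketch under assumptions (x86-64, GCC/Clang inline assembly, hypothetical names `scratch`, `alu_chain_with_mfence`, and `alu_chain_with_lock`), not the experiment referenced above.

```c
#include <immintrin.h>   /* _mm_mfence (MFENCE) */
#include <stdint.h>

static int scratch;      /* dummy location for the locked op */

/* Dependent ALU chain with an MFENCE in every iteration. */
uint64_t alu_chain_with_mfence(uint64_t iters) {
    uint64_t x = 1;
    for (uint64_t i = 0; i < iters; i++) {
        _mm_mfence();
        x = x * 3 + 1;               /* ALU work, independent of memory */
    }
    return x;
}

/* Same ALU chain, with a locked no-op in place of the MFENCE. */
uint64_t alu_chain_with_lock(uint64_t iters) {
    uint64_t x = 1;
    for (uint64_t i = 0; i < iters; i++) {
        __asm__ __volatile__("lock; orl $0, %0"
                             : "+m"(scratch)
                             :
                             : "memory", "cc");
        x = x * 3 + 1;               /* ALU work, independent of memory */
    }
    return x;
}
```

Both functions compute the same value, so any difference in run time comes from the barrier and from how much of the ALU chain can overlap with it. Results will vary by microarchitecture and microcode version.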
