How are atomic operations implemented at a hardware level?

Question

I get that, at the assembly language level, instruction set architectures provide compare-and-swap and similar operations. However, I don't understand how the chip is able to provide these guarantees.

As I imagine it, the execution of the instruction must:

  1. Fetch a value from memory
  2. Compare the value
  3. Depending on the comparison, possibly store another value in memory
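
The three steps above can be written out in C to make the race visible. This is only a sketch: `cas_non_atomic` and `cas_atomic` are hypothetical helper names. The first spells out the fetch/compare/store sequence as separate steps, so another core could interleave between them. The second wraps the standard C11 `atomic_compare_exchange_strong`, which asks the hardware to perform all three steps as one indivisible operation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: the three steps as separate, NON-atomic operations.
   Another core could modify *addr between step 1 and step 3. */
static bool cas_non_atomic(int *addr, int expected, int desired)
{
    int current = *addr;          /* 1. fetch the value from memory */
    if (current != expected)      /* 2. compare the value */
        return false;
    *addr = desired;              /* 3. conditionally store the new value */
    return true;
}

/* Same logic, delegated to C11 atomics so that the fetch, compare, and
   store happen as one indivisible step. */
static bool cas_atomic(atomic_int *addr, int expected, int desired)
{
    return atomic_compare_exchange_strong(addr, &expected, desired);
}
```

Run on a single thread, both functions behave identically. The difference only shows up when several cores target the same address at once, which is exactly the situation the question asks about.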

What prevents another core from accessing the memory address after the first has fetched it but before it sets the new value? Does the memory controller manage this?

edit: If the x86 implementation is secret, I'd be happy to hear how any processor family implements it.

Recommended answer

Here is an article over at software.intel.com that sheds a little light on user-level locks:

User level locks involve utilizing the atomic instructions of the processor to atomically update a memory space. The atomic instructions involve utilizing a lock prefix on the instruction and having the destination operand assigned to a memory address. The following instructions can run atomically with a lock prefix on current Intel processors: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. [...] On most instructions a lock prefix must be explicitly used except for the xchg instruction where the lock prefix is implied if the instruction involves a memory address.
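
From C, these lock-prefixed instructions are usually reached through C11 `<stdatomic.h>` rather than written by hand. The sketch below uses only standard C11 calls. The function names `counter_increment` and `flag_test_and_set` are made up for illustration, and the instruction named in each comment is what GCC and Clang typically emit on x86-64, not something the C standard guarantees.

```c
#include <stdatomic.h>

/* Atomically add 1 and return the previous value.
   On x86-64, compilers typically emit `lock xadd` for this. */
static int counter_increment(atomic_int *counter)
{
    return atomic_fetch_add(counter, 1);
}

/* Atomically store 1 and return the previous value.
   On x86-64 this typically becomes `xchg` with a memory operand, which is
   implicitly locked, so no explicit lock prefix appears. */
static int flag_test_and_set(atomic_int *flag)
{
    return atomic_exchange(flag, 1);
}
```

A return value of 0 from `flag_test_and_set` means the caller was the one that flipped the flag, which is the core of a simple user-level spinlock.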

In the days of Intel 486 processors, the lock prefix used to assert a lock on the bus along with a large hit in performance. Starting with the Intel Pentium Pro architecture, the bus lock is transformed into a cache lock. A lock will still be asserted on the bus in the most modern architectures if the lock resides in uncacheable memory or if the lock extends beyond a cache line boundary splitting cache lines. Both of these scenarios are unlikely, so most lock prefixes will be transformed into a cache lock which is much less expensive.
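
Whether a locked access stays within one cache line depends only on its address and size. The sketch below assumes a 64-byte cache line, which is common on current x86 processors but is an assumption here, not something the quoted article states. `CACHE_LINE_SIZE` and `crosses_cache_line` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed line size; real code should query the target CPU instead. */
#define CACHE_LINE_SIZE 64u

/* Returns true if the `size` bytes starting at `addr` span more than one
   cache line, i.e. the case in which a locked access may need a bus lock. */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
    if (size == 0)
        return false;
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line  = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

For example, an 8-byte operand at address 60 covers bytes 60 to 67 and therefore straddles the boundary at 64, whereas a naturally aligned 8-byte operand never does.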

So what prevents another core from accessing the memory address? The cache coherency protocol already manages access rights for cache lines. So if a core has (temporary) exclusive access rights to a cache line, no other core can access that cache line. To access that cache line, the other core has to obtain access rights first, and the protocol to obtain those rights involves the current owner. In effect, the cache coherency protocol prevents other cores from accessing the cache line silently.
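
A toy model can make this ownership rule concrete. The sketch below is not how any real processor is built. It is a deliberately simplified, single-threaded simulation with hypothetical names (`Line`, `acquire_exclusive`, `locked_cas`), assuming a MESI-like protocol reduced to just the "who owns the line" part, plus a flag that the owner holds while its locked operation is in flight.

```c
#include <stdbool.h>

#define NO_OWNER (-1)

/* One cache line in a toy, MESI-like model (illustrative only). */
typedef struct {
    int  value;   /* data stored in the line */
    int  owner;   /* core holding exclusive rights, or NO_OWNER */
    bool locked;  /* true while the owner is executing a locked operation */
} Line;

/* A core asks for exclusive rights. The request goes through the current
   owner, which refuses to give the line up while its locked op is running. */
static bool acquire_exclusive(Line *line, int core)
{
    if (line->owner != core && line->locked)
        return false;        /* current owner defers the request */
    line->owner = core;      /* other copies are considered invalidated */
    return true;
}

/* Compare-and-swap as seen by the model: own the line, hold it for the
   whole read-compare-write sequence, then release the hold.
   Returns false both when the compare fails and when the line could not
   be acquired; a real core would stall and retry in the latter case. */
static bool locked_cas(Line *line, int core, int expected, int desired)
{
    if (!acquire_exclusive(line, core))
        return false;
    line->locked = true;
    bool swapped = (line->value == expected);
    if (swapped)
        line->value = desired;
    line->locked = false;
    return swapped;
}
```

In this model, no second core can observe or change `value` between the compare and the store, because it would first need `acquire_exclusive` to succeed, and that request is refused for as long as `locked` is set.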

If the locked access is not confined to a single cache line, things get more complicated. There are all kinds of nasty corner cases, such as locked accesses that cross page boundaries. Intel does not publish the details, and they probably use all kinds of tricks to make locks faster.
