Why is a store-load barrier considered expensive?

Question

Most CPU architectures will reorder store-load operations, but my question is why? My interpretation of a store-load barrier would look like this:

x = 50;
store_load_barrier;
y = z;

Furthermore, I don't see how this barrier would have much use in lock-free programming in comparison to release and acquire semantics.

Answer

Short Answer: The store-load barrier prevents the processor from speculatively executing LOADs that come after the barrier until all previous stores have completed.

Details:

The reason that a store-load barrier is expensive is that it prevents the reordering of LOAD and STORE operations across the barrier.

Suppose you had an instruction sequence like the following:

...             ;; long latency operation to compute r1
ST r1, [ADDR1]  ;; store value in r1 to memory location referenced by ADDR1
LD r3, [ADDR2]  ;; load r3 with value in memory location ADDR2
...             ;; instructions that use result in r3

When this sequence executes, the value of r1 will be the result of an operation that takes a long time to complete. The instruction ST r1, [ADDR1] will have to stall until r1 is ready. In the meantime, an out-of-order processor can speculatively execute the LD r3, [ADDR2] and other instructions if they are independent of the earlier store. They won't actually commit until the store is committed, but by doing most of the work speculatively, the results can be saved in the reorder buffer and be ready to commit more quickly.

This works for a single-processor system because the CPU can check whether there are dependencies between ADDR1 and ADDR2. But in a multiprocessor system, multiple CPUs can independently execute loads and stores. There might be multiple processors that are performing a ST to ADDR1 and a LD from ADDR2. If the CPUs are able to speculatively execute these instructions that don't appear to have dependencies, then different CPUs might see different results. I think the following blog post gives a good explanation of how this can happen (I don't think it's something I could summarize succinctly in this answer).

Now consider the code sequence that has a store-load barrier:

...             ;; long latency operation to compute r1
ST r1, [ADDR1]  ;; store value in r1 to memory location referenced by ADDR1
ST_LD_BARRIER   ;; store-load barrier
LD r3, [ADDR2]  ;; load r3 with value in memory location ADDR2
...             ;; instructions that use result in r3

This would prevent the LD r3, [ADDR2] instruction and the dependent instructions that follow it from being speculatively executed until the previous store instructions complete. This could reduce CPU performance, because the entire CPU pipeline might have to stall while waiting for the ST instruction to complete, even though within the CPU itself there is no dependency between the LD and the ST.

There are some things that can be done to limit how long the CPU has to stall. But the bottom line is that the store-load barrier creates additional dependencies between loads and stores, and this limits the amount of speculative execution that the CPU can perform.
