What is the impact of SFENCE and LFENCE on the caches of neighboring cores?

Problem description

From Herb Sutter's talk, see the figure on page 2 of the slides: https://skydrive.live.com/view.aspx?resid=4E86B0CF20EF15AD!24884&app=WordPdf&wdo=2&authkey=!AMtj_EflYn2507c

The figure shows the L1 caches and the store buffers (SB) as separate components.

1. On Intel x86 processors, are the L1 cache and the store buffer the same thing?

The next slide shows that on x86 only the following reordering is possible. Before:

MOV eax, [memory1]  // read
MOV [memory2], edx  // write
...                 // MOV, MFENCE, ADD ... any other code

After:

MOV [memory2], edx  // write
MOV eax, [memory1]  // read
...                 // MOV, MFENCE, ADD ... any other code

This is due to out-of-order execution in the processor pipeline.

2. Can you show another example like this one, illustrating how the store buffer affects reordering?

3. And the main question: how do LFENCE and SFENCE affect the caches of neighboring cores?

Is it correct to say that:

  1. SFENCE does a "push", i.e. it flushes the store buffer into L1 and then sends the changes from the Core0 L1/L2 caches to the L1/L2 caches of all other cores (Core1/2/3...)?
  2. LFENCE does a "pull", i.e. it receives the changes from the L1/L2 caches (and store buffers?) of all other cores (Core1/2/3...) into the L1/L2 caches of our core, Core0?

Solution

  1. The store buffer is not a cache; it is an ordering queue. It holds pending stores, whereas the cache can be thought of as a logical part of memory (i.e. everything in any of the caches is visible to all other agents, and the cache must answer snoops correctly).

  2. Stores are not reordered. That would break memory ordering, because a store becomes visible to other cores as soon as it is performed (unlike loads, which only affect internal state).

  3. Fences do not operate on caches and have nothing to do with other cores. Caches are already fully visible and kept in sync. Fences only constrain execution order (in case operations are performed out of order internally), and therefore apply only to the current context.
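The three points above can be illustrated with a deliberately simplified toy model. This is a sketch, not a description of real hardware: the names `Memory`, `Core`, `Result` and `run_litmus` are invented for illustration, coherent caches are folded into a single shared `Memory`, and the model's "fence" simply drains the store buffer, which is roughly what MFENCE guarantees. It models the reordering that x86 documents for ordinary memory: a load may complete while an older store to a different address is still waiting in the store buffer.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <utility>

// Toy model (assumption): all coherent caches plus RAM are one shared map.
using Memory = std::map<int, int>;  // address -> value

struct Core {
    // Pending stores as (address, value), oldest first. Private to this core.
    std::deque<std::pair<int, int>> store_buffer;

    // A store only enters the local queue; other cores cannot see it yet.
    void store(int addr, int value) { store_buffer.emplace_back(addr, value); }

    // A load first checks this core's own pending stores (youngest wins),
    // then falls back to shared memory. It never looks into another core.
    int load(const Memory& mem, int addr) const {
        for (auto it = store_buffer.rbegin(); it != store_buffer.rend(); ++it) {
            if (it->first == addr) return it->second;
        }
        auto found = mem.find(addr);
        return found == mem.end() ? 0 : found->second;
    }

    // The model's "fence": commit every pending store, in order.
    void drain(Memory& mem) {
        for (const auto& pending : store_buffer) mem[pending.first] = pending.second;
        store_buffer.clear();
    }
};

struct Result { int r0; int r1; };

// Core0: store X=1, then load Y.   Core1: store Y=1, then load X.
Result run_litmus(bool fence_after_store) {
    const int X = 0, Y = 1;
    Memory mem;
    mem[X] = 0;
    mem[Y] = 0;
    Core core0, core1;

    core0.store(X, 1);
    core1.store(Y, 1);
    if (fence_after_store) {  // drain before the loads execute
        core0.drain(mem);
        core1.drain(mem);
    }
    Result r;
    r.r0 = core0.load(mem, Y);
    r.r1 = core1.load(mem, X);

    // Eventually the buffers drain on their own; both stores reach memory.
    core0.drain(mem);
    core1.drain(mem);
    assert(mem[X] == 1 && mem[Y] == 1);
    return r;
}
```

In this model, `run_litmus(false)` yields `r0 == 0` and `r1 == 0`: each load ran while the other core's store was still queued, so the program behaves as if each load had moved ahead of the older store. With `run_litmus(true)` both loads read 1. Note that the fence acts only on the issuing core's own queue; nothing is sent to, or fetched from, another core.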

As for whether it is correct to say that:

  1. SFENCE does a "push", i.e. it flushes the store buffer into L1 and then sends the changes from the Core0 L1/L2 caches to the L1/L2 caches of all other cores (Core1/2/3...)?
  2. LFENCE does a "pull", i.e. it receives the changes from the L1/L2 caches (and store buffers?) of all other cores (Core1/2/3...) into the L1/L2 caches of our core, Core0?

SFENCE and MFENCE do flush the store buffer, since they do not allow pending speculative stores to remain (that is why they are fences). However, as I said, once the changes are in L1 they are already observable by everyone; they do not have to be flushed anywhere further away.

In the same sense, LFENCE doesn't "pull" anything. It just stalls the execution of all younger loads until the older ones (and the fence itself) have finished and committed. This affects performance by serializing the loads, but it does not otherwise protect you against any operation on other cores, unless you have another way to make sure that any store you depend on has been performed by then (in which case the load result is updated in time).
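One common "other way" of making sure the store you depend on has been performed is a flag handshake. The following is a minimal sketch in C++ using `std::atomic`; the names `payload`, `ready` and `run_handshake` are hypothetical and not taken from the talk or the answer. The consumer does not pull anything: it simply keeps loading the flag until the producer's store becomes visible through the coherent caches, and only then reads the data.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: publish one int from a producer to a consumer.
int run_handshake() {
    int payload = 0;                 // ordinary data written by the producer
    std::atomic<bool> ready{false};  // flag that publishes the payload
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });
    std::thread consumer([&] {
        // Spin until the producer's flag store is visible to this core.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // ordered after the producer's write to payload
    });

    producer.join();
    consumer.join();
    assert(ready.load(std::memory_order_relaxed));
    return seen;  // 42: the acquire load synchronizes with the release store
}
```

On x86, mainstream compilers typically translate the release store and the acquire load into plain MOV instructions, with no SFENCE or LFENCE, which is consistent with the explanation above: for ordinary memory the caches are already coherent, and the ordering needed here is one the hardware provides by default.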
