Correct way to ensure sharing with std::hardware_constructive_interference_size

Question

What is the correct and portable way to ensure true sharing in a struct small enough to fit in a cache line? Is it enough to ensure that the struct is small enough, or does it also have to be aligned on a cache line boundary?

For example, assuming the cache line size is 64 bytes, is the following enough?

struct A {
  std::uint32_t one;
  std::uint32_t two;
};

Or do I have to do this?

struct alignas(std::hardware_constructive_interference_size) A {
  std::uint32_t one;
  std::uint32_t two;
};

Note: This will always be on the stack, so no over-aligned memory allocations should be required.

A follow-up: is this enough to ensure no false sharing?

struct A {
public:
  alignas(std::hardware_destructive_interference_size) std::uint32_t one;
  alignas(std::hardware_constructive_interference_size) std::uint32_t two;
};

Or does one have to do this (in the case where, say, hardware_constructive_interference_size < hardware_destructive_interference_size)?

struct A {
public:
  alignas(std::hardware_destructive_interference_size) std::uint32_t one;
  alignas(std::hardware_destructive_interference_size) std::uint32_t two;
};


Recommended answer

The second variant (the struct declared with alignas(std::hardware_constructive_interference_size)) is currently the best you can do.

However, there is no 100% portable way to align to cache line sizes. The constants hardware_constructive_interference_size and hardware_destructive_interference_size are just hints. They are best guesses of the compiler. Ultimately you do not know the L1 cache line size at compile time.

But in practice this usually does not matter, since for most architectures there is a typical cache line size, like 64 bytes for x86.
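Since the two constants are only compile-time hints, and not every standard library provides them, one common pattern is to wrap them behind a feature-test check with a fallback. The following is a minimal sketch of that idea; the namespace cache_hint and the fallback value of 64 are assumptions made for illustration, not something the standard prescribes.

```cpp
#include <cstddef>
#include <new>  // std::hardware_*_interference_size and its feature-test macro (C++17)

// Hypothetical namespace and names, used here for illustration only.
namespace cache_hint {
#ifdef __cpp_lib_hardware_interference_size
// The library provides the hints: use the compiler's best guess.
inline constexpr std::size_t constructive =
    std::hardware_constructive_interference_size;
inline constexpr std::size_t destructive =
    std::hardware_destructive_interference_size;
#else
// Assumption: fall back to 64 bytes, the typical line size on x86.
inline constexpr std::size_t constructive = 64;
inline constexpr std::size_t destructive = 64;
#endif
}  // namespace cache_hint

// alignas requires a power of two, so check the value before relying on it.
static_assert((cache_hint::destructive & (cache_hint::destructive - 1)) == 0,
              "destructive interference size must be a power of two");
```

Either way the values remain guesses fixed at compile time; the machine the program eventually runs on may use a different line size.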

Moreover, for small structs like the one in your example, it is always sufficient to naturally align the struct to make sure it lies completely within a cache line. In your concrete example this means that

struct alignas(8) A {
  std::uint32_t one;
  std::uint32_t two;
};

will always ensure true sharing, regardless of the actual L1 cache line size at runtime, provided that the cache line size is 8 bytes or bigger. (If it is smaller, true sharing is trivially impossible.)
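The claim can be checked with a little arithmetic. The sketch below assumes cache line sizes are powers of two; the helpers fits_in_one_line and always_fits are hypothetical names introduced only for this illustration. It verifies at compile time that an 8-byte object at any 8-byte-aligned address stays within one line, and shows that a merely 4-byte-aligned placement (which the plain struct from the question permits) can straddle two lines.

```cpp
#include <cstddef>
#include <cstdint>

struct alignas(8) A {
  std::uint32_t one;
  std::uint32_t two;
};
static_assert(sizeof(A) == 8 && alignof(A) == 8,
              "A occupies exactly one naturally aligned 8-byte unit");

// True if the byte range [addr, addr + size) lies within a single line of
// `line` bytes. Hypothetical helper, for illustration only.
constexpr bool fits_in_one_line(std::size_t addr, std::size_t size,
                                std::size_t line) {
  return addr / line == (addr + size - 1) / line;
}

// Tries every alignof(A)-aligned address in [0, 2 * line) for the
// power-of-two line sizes 8, 16, ..., 256.
constexpr bool always_fits() {
  for (std::size_t line = 8; line <= 256; line *= 2)
    for (std::size_t addr = 0; addr < 2 * line; addr += alignof(A))
      if (!fits_in_one_line(addr, sizeof(A), line)) return false;
  return true;
}
static_assert(always_fits(), "an 8-byte-aligned A never straddles a line");

// Counter-example: an 8-byte object at the 4-byte-aligned address 60
// crosses the boundary between two 64-byte lines.
static_assert(!fits_in_one_line(60, 8, 64), "address 60 straddles two lines");
```

This is why the size of the struct alone is not enough: without alignas, the struct only has to be 4-byte aligned.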

Regarding the follow-up question: the second variant will ensure no false sharing. The first variant may result in false sharing, because the cache line size may really be hardware_destructive_interference_size, in which case one and two can end up on the same cache line (under the assumption that hardware_constructive_interference_size < hardware_destructive_interference_size).
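To make the difference between the two variants concrete, the sketch below uses made-up values: 64 stands in for the constructive size and 128 for the destructive size. These numbers, the template Pair, and the aliases Mixed and Uniform are all hypothetical and chosen only to illustrate the layout; real values come from the compiler.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the struct in the question, parameterized on
// the alignment of each member.
template <std::size_t AlignOne, std::size_t AlignTwo>
struct Pair {
  alignas(AlignOne) std::uint32_t one;
  alignas(AlignTwo) std::uint32_t two;
};

// Assumed values for illustration: constructive = 64, destructive = 128.
using Mixed = Pair<128, 64>;     // first variant from the question
using Uniform = Pair<128, 128>;  // second variant from the question

// Mixed: `two` starts only 64 bytes after `one`. If the real line size is
// 128, both members share one line, so false sharing is possible.
static_assert(offsetof(Mixed, two) - offsetof(Mixed, one) == 64,
              "mixed alignment leaves a 64-byte gap");

// Uniform: the members are a full 128 bytes apart, so they cannot share a
// line of up to 128 bytes.
static_assert(offsetof(Uniform, two) - offsetof(Uniform, one) == 128,
              "uniform alignment leaves a 128-byte gap");
static_assert(sizeof(Uniform) == 256, "each member gets its own 128 bytes");
```

The cost of the second variant is space: under these assumed values each 4-byte member occupies 128 bytes.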

But in practice hardware_destructive_interference_size and hardware_constructive_interference_size will have the same value for most architectures. Having two separate constants is somewhat over-engineered, given that neither provides you with the real L1 cache line size, but just with a compile-time guess.
