C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming?

C++11 introduced a standardized memory model, but what exactly does that mean? And how is it going to affect C++ programming?

Herb Sutter says here that,

The memory model means that C++ code now has a standardized library to call regardless of who made the compiler and on what platform it's running. There's a standard way to control how different threads talk to the processor's memory.

"When you are talking about splitting [code] across different cores that's in the standard, we are talking about the memory model. We are going to optimize it without breaking the following assumptions people are going to make in the code," Sutter said.

Well, I can memorize this and similar paragraphs available online (as I've my own memory model since birth :P) and can even post as answer to questions asked by others, but to be honest, I don't exactly understand this.

So, what I basically want to know is: C++ programmers were developing multi-threaded applications even before C++11, so does it matter whether they use POSIX threads, Windows threads, or C++11 threads? What are the benefits? I want to understand the low-level details.

I also get the feeling that the C++11 memory model is somehow related to C++11 multi-threading support, as I often see the two discussed together. If so, how exactly, and why should they be related?

As I don't know how internals of multi-threading works, and what memory model means in general, please help me understand these concepts. :-)

Solution

First, you have to learn to think like a Language Lawyer.

The C++ specification does not make reference to any particular compiler, operating system, or CPU. It makes reference to an abstract machine that is a generalization of actual systems. In the Language Lawyer world, the job of the programmer is to write code for the abstract machine; the job of the compiler is to actualize that code on a concrete machine. By coding rigidly to the spec, you can be certain that your code will compile and run without modification on any system with a compliant C++ compiler, whether today or 50 years from now.

The abstract machine in the C++98/C++03 specification is fundamentally single-threaded. So it is not possible to write multi-threaded C++ code that is "fully portable" with respect to the spec. The spec does not even say anything about the atomicity of memory loads and stores or the order in which loads and stores might happen, never mind things like mutexes.

Of course, you can write multi-threaded code in practice for particular concrete systems -- like pthreads or Windows. But there is no standard way to write multi-threaded code for C++98/C++03.

The abstract machine in C++11 is multi-threaded by design. It also has a well-defined memory model; that is, it says what the compiler may and may not do when it comes to accessing memory.

Consider the following example, where a pair of global variables are accessed concurrently by two threads:

           Global
           int x, y;

Thread 1            Thread 2
x = 17;             cout << y << " ";
y = 37;             cout << x << endl;

What might Thread 2 output?

Under C++98/C++03, this is not even Undefined Behavior; the question itself is meaningless because the standard does not contemplate anything called a "thread".

Under C++11, the result is Undefined Behavior, because loads and stores need not be atomic in general. Which may not seem like much of an improvement... And by itself, it's not.

But with C++11, you can write this:

           Global
           atomic<int> x, y;

Thread 1                 Thread 2
x.store(17);             cout << y.load() << " ";
y.store(37);             cout << x.load() << endl;

Now things get much more interesting. First of all, the behavior here is defined. Thread 2 could now print 0 0 (if it runs before Thread 1), 37 17 (if it runs after Thread 1), or 0 17 (if it runs after Thread 1 assigns to x but before it assigns to y).

What it cannot print is 37 0, because the default mode for atomic loads/stores in C++11 is to enforce sequential consistency. This just means all loads and stores must be "as if" they happened in the order you wrote them within each thread, while operations among threads can be interleaved however the system likes. So the default behavior of atomics provides both atomicity and ordering for loads and stores.

Now, on a modern CPU, ensuring sequential consistency can be expensive. In particular, the compiler is likely to emit full-blown memory barriers between every access here. But if your algorithm can tolerate out-of-order loads and stores; i.e., if it requires atomicity but not ordering; i.e., if it can tolerate 37 0 as output from this program, then you can write this:

           Global
           atomic<int> x, y;

Thread 1                            Thread 2
x.store(17,memory_order_relaxed);   cout << y.load(memory_order_relaxed) << " ";
y.store(37,memory_order_relaxed);   cout << x.load(memory_order_relaxed) << endl;

The more modern the CPU, the more likely this is to be faster than the previous example.

Finally, if you just need to keep particular loads and stores in order, you can write:

           Global
           atomic<int> x, y;

Thread 1                            Thread 2
x.store(17,memory_order_release);   cout << y.load(memory_order_acquire) << " ";
y.store(37,memory_order_release);   cout << x.load(memory_order_acquire) << endl;

This takes us back to the ordered loads and stores -- so 37 0 is no longer a possible output -- but it does so with minimal overhead. (In this trivial example, the result is the same as full-blown sequential consistency; in a larger program, it would not be.)

Of course, if the only outputs you want to see are 0 0 or 37 17, you can just wrap a mutex around the original code. But if you have read this far, I bet you already know how that works, and this answer is already longer than I intended :-).

So, bottom line. Mutexes are great, and C++11 standardizes them. But sometimes for performance reasons you want lower-level primitives (e.g., the classic double-checked locking pattern). The new standard provides high-level gadgets like mutexes and condition variables, and it also provides low-level gadgets like atomic types and the various flavors of memory barrier. So now you can write sophisticated, high-performance concurrent routines entirely within the language specified by the standard, and you can be certain your code will compile and run unchanged on both today's systems and tomorrow's.

Although to be frank, unless you are an expert and working on some serious low-level code, you should probably stick to mutexes and condition variables. That's what I intend to do.

For more on this stuff, see this blog post.
