Multithreading program stuck in optimized mode but runs normally in -O0


Question

I wrote a simple multithreaded program as follows:

static bool finished = false;

int func()
{
    size_t i = 0;
    while (!finished)
        ++i;
    return i;
}

int main()
{
    auto result=std::async(std::launch::async, func);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    finished=true;
    std::cout<<"result ="<<result.get();
    std::cout<<"\nmain thread id="<<std::this_thread::get_id()<<std::endl;
}

It behaves normally in Debug mode in Visual Studio or with -O0 in gcc, and prints the result after 1 second. But it gets stuck and prints nothing in Release mode or with -O1, -O2, or -O3.

Recommended answer

Two threads accessing a non-atomic, unguarded variable, with at least one of them writing, is a data race and therefore undefined behavior (UB). That is the case for finished here. Making finished a std::atomic<bool> fixes this.

My fix:

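#include <atomic>
#include <chrono>
#include <future>
#include <iostream>
#include <thread>

// Atomic flag: a concurrent read and write from two threads is well-defined.
// Brace initialization also compiles before C++17.
static std::atomic<bool> finished{false};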

int func()
{
    size_t i = 0;
    while (!finished)
        ++i;
    return i;
}

int main()
{
    auto result=std::async(std::launch::async, func);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    finished=true;
    std::cout<<"result ="<<result.get();
    std::cout<<"\nmain thread id="<<std::this_thread::get_id()<<std::endl;
}

Output:

result =1023045342
main thread id=140147660588864

Live Demo on coliru

Somebody may think 'It's a bool – probably one bit. How can this be non-atomic?' (I did when I started with multi-threading myself.)

But note that lack-of-tearing is not the only thing that std::atomic gives you. It also makes concurrent read+write access from multiple threads well-defined, stopping the compiler from assuming that re-reading the variable will always see the same value.

Leaving a bool unguarded and non-atomic can cause additional issues:

  • The compiler might decide to keep the variable in a register, or even CSE (common subexpression elimination) multiple accesses into one and hoist the load out of the loop.
  • The variable might be cached for a CPU core. (In real life, CPUs have coherent caches, so this is not a real problem. However, the C++ standard is loose enough to cover hypothetical C++ implementations on non-coherent shared memory, where atomic<bool> with memory_order_relaxed store/load would work but volatile would not. Using volatile for this would be UB, even though it works in practice on real C++ implementations.)

To prevent this from happening, the compiler must be told explicitly not to do it.

I'm a little surprised by the evolving discussion about the potential relation of volatile to this issue. So I'd like to add my two cents:

  • Is volatile useful with threads
  • Who's afraid of a big bad optimizing compiler?
