Why are multiple shared_future objects needed to synchronize data?


Problem description

A pointer to a data structure is shared with multiple threads via std::promise and std::shared_future. According to the book "C++ Concurrency in Action" by Anthony Williams (pp. 85-86), it seems that data is only correctly synchronized when each receiving thread uses its own copy of the std::shared_future object, as opposed to every thread accessing a single, global std::shared_future.

To illustrate, consider a thread that creates bigdata and passes a pointer to multiple threads that have read-only access. If data synchronization between threads is not handled correctly, memory reordering may lead to undefined behavior (e.g. a worker_thread reading incomplete data).

This (incorrect?) implementation uses a single, global std::shared_future:

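#include <future>
#include <thread>

struct bigdata { ... };

std::shared_future<bigdata *> global_sf;

void worker_thread()
{
    const bigdata *ptr = global_sf.get();  // blocks until the promise is fulfilled
    ...  // ptr read-only access
}

int main()
{
    std::promise<bigdata *> pr;
    global_sf = pr.get_future().share();  // assigned before any thread starts

    std::thread t1{worker_thread};
    std::thread t2{worker_thread};

    pr.set_value(new bigdata);
    ...
    t1.join();  // a std::thread must be joined (or detached) before it is destroyed
    t2.join();
}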

And in this (correct) implementation, each worker_thread gets its own copy of the std::shared_future:

void worker_thread(std::shared_future<bigdata *> sf)
{
    const bigdata *ptr = sf.get();  // each thread calls get() on its own copy
    ...
}

int main()
{
    std::promise<bigdata *> pr;
    auto sf = pr.get_future().share();

    std::thread t1{worker_thread, sf};  // sf is copied into each thread
    std::thread t2{worker_thread, sf};

    pr.set_value(new bigdata);
    ...
    t1.join();
    t2.join();
}

I am wondering why the first version is incorrect.

If std::shared_future::get() were a non-const member function, this would make sense, since accessing a single std::shared_future from multiple threads would then be a data race in itself. But since this member function is declared const, and the global_sf object is synchronized with the threads (it is assigned before they are started), it should be safe to access concurrently from multiple threads.

My question is: why exactly is this only guaranteed to work correctly if each worker_thread receives a copy of the std::shared_future?

Solution

Your implementation using a single, global shared_future is completely fine, if slightly unusual, and the book appears to be mistaken.

[futures.shared_future] ¶2

[ Note: Member functions of shared_future do not synchronize with themselves, but they synchronize with the shared state. — end note ]

Notes are non-normative, so the above is redundantly making explicit a fact which is already implicit in the normative wording.

[intro.races] ¶2

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.

¶6

Certain library calls synchronize with other library calls performed by another thread.

[...Additional paragraphs defining happens before in terms of synchronizes with...]

¶19

Two actions are potentially concurrent if they are performed by different threads... The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other...

[res.on.data.races] ¶3

A C++ standard library function shall not directly or indirectly modify objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s non-const arguments, including this.

So we know that calls to global_sf.get() in different threads are potentially concurrent unless you accompany them with additional synchronization (e.g. a mutex). But we also know that calls to global_sf.get() in different threads do not conflict, because get() is a const member function and is hence forbidden from modifying objects accessible from multiple threads, including *this. So the definition of a data race (potentially concurrent conflicting actions, at least one of them non-atomic, with neither happening before the other) is not satisfied, and the program does not contain a data race.
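To make the argument concrete, here is a minimal sketch of the single-global variant in compilable form. The one-member bigdata, the helper names read_value and sum_from_workers, and the use of a stack object instead of new are assumptions made for illustration; they are not from the question or the book.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for the question's elided bigdata type.
struct bigdata {
    int value = 42;
};

// The single global shared_future that every worker reads.
std::shared_future<const bigdata*> global_sf;

// Calls the const member get() on the shared global object.
int read_value() {
    const bigdata* ptr = global_sf.get();  // blocks until set_value() has run
    assert(ptr != nullptr);
    return ptr->value;
}

// Starts n workers, publishes the data, and returns the sum of what they read.
int sum_from_workers(int n) {
    bigdata data;  // outlives the workers because they are joined below
    std::promise<const bigdata*> pr;
    global_sf = pr.get_future().share();  // the only write, done before any thread starts

    std::vector<int> results(n, 0);
    std::vector<std::thread> workers;
    workers.reserve(n);
    for (int i = 0; i < n; ++i) {
        // Each worker writes only its own element of results.
        workers.emplace_back([&results, i] { results[i] = read_value(); });
    }

    pr.set_value(&data);  // makes the shared state ready and releases the waiting get() calls

    int sum = 0;
    for (auto& t : workers) t.join();
    for (int r : results) sum += r;
    return sum;
}
```

Calling sum_from_workers(4) from main starts four workers that all call get() on the same global object. The only modification of global_sf is the assignment made before the threads are created, so the concurrent get() calls are reads of that object and do not conflict with one another.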

One would usually wish to avoid global variables anyway, but that is a separate issue.

Note that if the book is correct, then it contains a contradiction. The code which it claims is correct still contains a global shared_future which is accessed from multiple threads when they create their local copies:

void worker_thread()
{
    auto local_sf = global_sf; // <-- unsynchronized access of global_sf here

    const bigdata *ptr = local_sf.get();
    ...
}
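The same point can be sketched in compilable form. As before, the one-member bigdata and the helper names copy_then_read and copies_agree are hypothetical. The copy constructor of std::shared_future takes its source by const reference, so making a local copy is itself an unsynchronized read of the global object, of the same kind as calling get() on it directly.

```cpp
#include <cassert>
#include <future>
#include <thread>

// Hypothetical stand-in for the question's elided bigdata type.
struct bigdata {
    int value = 42;
};

std::shared_future<const bigdata*> global_sf;

// Copies the global first, then waits on the local copy.
int copy_then_read() {
    // Copy construction reads global_sf through a const reference.
    std::shared_future<const bigdata*> local_sf = global_sf;
    assert(local_sf.valid());
    return local_sf.get()->value;
}

// Runs two workers that each make a local copy; reports whether both saw the data.
bool copies_agree() {
    bigdata data;  // outlives the workers because they are joined below
    std::promise<const bigdata*> pr;
    global_sf = pr.get_future().share();  // written before the threads start

    int a = 0;
    int b = 0;
    std::thread t1{[&a] { a = copy_then_read(); }};
    std::thread t2{[&b] { b = copy_then_read(); }};

    pr.set_value(&data);
    t1.join();
    t2.join();
    return a == data.value && b == data.value;
}
```

Both local copies refer to the same shared state as global_sf, so each worker observes the pointer passed to set_value.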
