How can barriers be destroyable as soon as pthread_barrier_wait returns?


Question

This question is based on:

When is it safe to destroy a pthread barrier?

and the recent glibc bug report:

http://sourceware.org/bugzilla/show_bug.cgi?id=12674

I'm not sure about the semaphores issue reported in glibc, but presumably it's supposed to be valid to destroy a barrier as soon as pthread_barrier_wait returns, as per the above linked question. (Normally, the thread that got PTHREAD_BARRIER_SERIAL_THREAD, or a "special" thread that already considered itself "responsible" for the barrier object, would be the one to destroy it.) The main use case I can think of is when a barrier is used to synchronize a new thread's use of data on the creating thread's stack, preventing the creating thread from returning until the new thread gets to use the data; other barriers probably have a lifetime equal to that of the whole program, or controlled by some other synchronization object.

In any case, how can an implementation ensure that destruction of the barrier (and possibly even unmapping of the memory it resides in) is safe as soon as pthread_barrier_wait returns in any thread? It seems the other threads that have not yet returned would need to examine at least some part of the barrier object to finish their work and return, much like how, in the glibc bug report cited above, sem_post has to examine the waiters count after having adjusted the semaphore value.

Recommended answer

I'm going to take another crack at this with an example implementation of pthread_barrier_wait() that uses mutex and condition variable functionality as might be provided by a pthreads implementation. Note that this example doesn't try to deal with performance considerations (specifically, when the waiting threads are unblocked, they are all re-serialized when exiting the wait). I think that using something like Linux Futex objects could help with the performance issues, but Futexes are still pretty much out of my experience.

Also, I doubt that this example handles signals or errors correctly (if at all in the case of signals). But I think proper support for those things can be added as an exercise for the reader.

My main fear is that the example may have a race condition or deadlock (the mutex handling is more complex than I like). Also note that it is an example that hasn't even been compiled. Treat it as pseudo-code. Also keep in mind that my experience is mainly in Windows - I'm tackling this more as an educational opportunity than anything else. So the quality of the pseudo-code may well be pretty low.

However, disclaimers aside, I think it may give an idea of how the problem asked in the question could be handled (i.e., how the pthread_barrier_wait() function can allow the pthread_barrier_t object it uses to be destroyed by any of the released threads, without the danger of one or more threads still using the barrier object on their way out).

Here goes:

/* 
 *  Since this is a part of the implementation of the pthread API, it uses
 *  reserved names that start with "__" for internal structures and functions
 *
 *  Functions such as __mutex_lock() and __cond_wait() perform the same function
 *  as the corresponding pthread API.
 */

// struct __barrier_waitdata is intended to hold all the data
//  that `pthread_barrier_wait()` will need after releasing
//  waiting threads.  This will allow the function to avoid
//  touching the passed in pthread_barrier_t object after 
//  the wait is satisfied (since any of the released threads
//   can destroy it)

struct __barrier_waitdata {
    struct __mutex cond_mutex;
    struct __cond cond;

    unsigned waiter_count;
    int wait_complete;
};

struct __barrier {
    unsigned count;

    struct __mutex waitdata_mutex;
    struct __barrier_waitdata* pwaitdata;
};

typedef struct __barrier pthread_barrier_t;



int __barrier_waitdata_init( struct __barrier_waitdata* pwaitdata)
{
    int rc;

    pwaitdata->waiter_count = 0;
    pwaitdata->wait_complete = 0;

    // like their pthread counterparts, the __mutex/__cond functions
    //  return 0 on success and an error number on failure
    rc = __mutex_init( &pwaitdata->cond_mutex, NULL);
    if (rc) {
        return rc;
    }

    rc = __cond_init( &pwaitdata->cond, NULL);
    if (rc) {
        __mutex_destroy( &pwaitdata->cond_mutex);
        return rc;
    }

    return 0;
}




int pthread_barrier_init(pthread_barrier_t *barrier, const pthread_barrierattr_t *attr, unsigned int count)
{
    int rc;

    rc = __mutex_init( &barrier->waitdata_mutex, NULL);
    if (rc) return rc;

    barrier->pwaitdata = NULL;
    barrier->count = count;

    //TODO: deal with attr

    return 0;
}



int pthread_barrier_wait(pthread_barrier_t *barrier)
{
    int rc;
    struct __barrier_waitdata* pwaitdata;
    unsigned target_count;

    // potential waitdata block (only one thread's will actually be used)
    struct __barrier_waitdata waitdata; 

    // nothing to do if we only need to wait for one thread...
    if (barrier->count == 1) return PTHREAD_BARRIER_SERIAL_THREAD;

    rc = __mutex_lock( &barrier->waitdata_mutex);
    if (rc) return rc;

    if (!barrier->pwaitdata) {
        // no other thread has claimed the waitdata block yet - 
        //  we'll use this thread's

        rc = __barrier_waitdata_init( &waitdata);
        if (rc) {
            __mutex_unlock( &barrier->waitdata_mutex);
            return rc;
        }

        barrier->pwaitdata = &waitdata;
    }

    pwaitdata = barrier->pwaitdata;
    target_count = barrier->count;

    //  all data necessary for handling the return from a wait is pointed to
    //  by `pwaitdata`, and `pwaitdata` points to a block of data on the stack of
    //  one of the waiting threads.  We have to make sure that the thread that owns
    //  that block waits until all others have finished with the information
    //  pointed to by `pwaitdata` before it returns.  However, after the 'big' wait
    //  is completed, the `pthread_barrier_t` object that's passed into this 
    //  function isn't used. The last operation done to `*barrier` is to set 
    //  `barrier->pwaitdata = NULL` to satisfy the requirement that this function
    //  leaves `*barrier` in a state as if `pthread_barrier_init()` had been called - and
    //  that operation is done by the thread that signals the wait condition 
    //  completion before the completion is signaled.

    // note: we're still holding  `barrier->waitdata_mutex`;

    rc = __mutex_lock( &pwaitdata->cond_mutex);
    pwaitdata->waiter_count += 1;

    if (pwaitdata->waiter_count < target_count) {
        // need to wait for other threads

        __mutex_unlock( &barrier->waitdata_mutex);
        do {
            // TODO:  handle the return code from `__cond_wait()` to break out of this
            //          if a signal makes that necessary
            __cond_wait( &pwaitdata->cond,  &pwaitdata->cond_mutex);
        } while (!pwaitdata->wait_complete);
    }
    else {
        // this thread satisfies the wait - unblock all the other waiters
        pwaitdata->wait_complete = 1;

        // 'release' our use of the passed in pthread_barrier_t object
        barrier->pwaitdata = NULL;

        // unlock the barrier's waitdata_mutex - the barrier is  
        //  ready for use by another set of threads
        __mutex_unlock( &barrier->waitdata_mutex);

        // finally, unblock the waiting threads
        __cond_broadcast( &pwaitdata->cond);
    }

    // at this point, barrier->waitdata_mutex is unlocked, the 
    //  barrier->pwaitdata pointer has been cleared, and no further 
    //  use of `*barrier` is permitted...

    // however, each thread still has a valid `pwaitdata` pointer - the 
    // thread that owns that block needs to wait until all others have 
    // dropped the pwaitdata->waiter_count

    // also, at this point the `pwaitdata->cond_mutex` is locked, so
    //  we're in a critical section

    rc = 0;
    pwaitdata->waiter_count--;

    if (pwaitdata == &waitdata) {
        // this thread owns the waitdata block - it needs to hang around until 
        //  all other threads are done

        // as a convenience, this thread will be the one that returns 
        //  PTHREAD_BARRIER_SERIAL_THREAD
        rc = PTHREAD_BARRIER_SERIAL_THREAD;

        while (pwaitdata->waiter_count != 0) {
            __cond_wait( &pwaitdata->cond, &pwaitdata->cond_mutex);
        }

        __mutex_unlock( &pwaitdata->cond_mutex);
        __cond_destroy( &pwaitdata->cond);
        __mutex_destroy( &pwaitdata->cond_mutex);
    }
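    else {
        // the last thread out wakes the owner of the waitdata block
        if (pwaitdata->waiter_count == 0) {
            __cond_signal( &pwaitdata->cond);
        }

        // every non-owning thread has to release the mutex on its way out
        __mutex_unlock( &pwaitdata->cond_mutex);
    }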

    return rc;
}
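For experimenting with the idea outside a pthreads implementation, the same scheme can be written against the public pthread mutex and condition variable API. The following is a minimal sketch under that assumption: all `demo_` names and `DEMO_SERIAL_THREAD` are hypothetical, error handling is mostly omitted, and it inherits the performance caveats mentioned above.

```c
#include <pthread.h>
#include <stdlib.h>

#define DEMO_SERIAL_THREAD (-1)   /* stand-in for PTHREAD_BARRIER_SERIAL_THREAD */
#define DEMO_MAX_THREADS 16

struct demo_waitdata {
    pthread_mutex_t cond_mutex;
    pthread_cond_t cond;
    unsigned waiter_count;
    int wait_complete;
};

struct demo_barrier {
    unsigned count;
    pthread_mutex_t waitdata_mutex;
    struct demo_waitdata *pwaitdata;
};

int demo_barrier_wait(struct demo_barrier *b)
{
    struct demo_waitdata local;       /* used only if this thread arrives first */
    struct demo_waitdata *wd;
    unsigned target;
    int rc = 0;

    if (b->count == 1) return DEMO_SERIAL_THREAD;

    pthread_mutex_lock(&b->waitdata_mutex);
    if (!b->pwaitdata) {
        pthread_mutex_init(&local.cond_mutex, NULL);
        pthread_cond_init(&local.cond, NULL);
        local.waiter_count = 0;
        local.wait_complete = 0;
        b->pwaitdata = &local;
    }
    wd = b->pwaitdata;
    target = b->count;

    pthread_mutex_lock(&wd->cond_mutex);
    wd->waiter_count++;
    if (wd->waiter_count < target) {
        pthread_mutex_unlock(&b->waitdata_mutex);
        while (!wd->wait_complete)
            pthread_cond_wait(&wd->cond, &wd->cond_mutex);
    } else {
        wd->wait_complete = 1;
        b->pwaitdata = NULL;
        pthread_mutex_unlock(&b->waitdata_mutex);   /* last use of *b */
        pthread_cond_broadcast(&wd->cond);
    }

    /* cond_mutex is held here and *b is never touched again */
    wd->waiter_count--;
    if (wd == &local) {
        /* the owner of the block leaves last and cleans up */
        rc = DEMO_SERIAL_THREAD;
        while (wd->waiter_count != 0)
            pthread_cond_wait(&wd->cond, &wd->cond_mutex);
        pthread_mutex_unlock(&wd->cond_mutex);
        pthread_cond_destroy(&wd->cond);
        pthread_mutex_destroy(&wd->cond_mutex);
    } else {
        if (wd->waiter_count == 0) pthread_cond_signal(&wd->cond);
        pthread_mutex_unlock(&wd->cond_mutex);
    }
    return rc;
}

static void *demo_worker(void *arg)
{
    /* a non-NULL result marks the thread that got the serial return value */
    return demo_barrier_wait(arg) == DEMO_SERIAL_THREAD ? arg : NULL;
}

/* Runs one round with n threads; returns how many saw the serial result. */
int demo_run_round(unsigned n)
{
    struct demo_barrier b = { n, PTHREAD_MUTEX_INITIALIZER, NULL };
    pthread_t tids[DEMO_MAX_THREADS];
    int serial = 0;
    unsigned i;

    if (n == 0 || n > DEMO_MAX_THREADS) return -1;
    for (i = 0; i < n; i++)
        if (pthread_create(&tids[i], NULL, demo_worker, &b) != 0) abort();
    for (i = 0; i < n; i++) {
        void *res;
        pthread_join(tids[i], &res);
        if (res != NULL) serial++;
    }
    pthread_mutex_destroy(&b.waitdata_mutex);
    return serial;
}
```

Note that the thread completing the round keeps `cond_mutex` locked until after its last access to `*b`, so no waiter can return (and destroy the barrier) while `*b` is still in use.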



17 July 2011: Update in response to a comment/question about process-shared barriers

I forgot completely about the situation with barriers that are shared between processes. And as you mention, the idea I outlined will fail horribly in that case. I don't really have experience with POSIX shared memory use, so any suggestions I make should be tempered with scepticism.

To summarize (for my benefit, if no one else's):

When any of the threads gets control after pthread_barrier_wait() returns, the barrier object needs to be in the 'init' state (however the most recent pthread_barrier_init() on that object set it). Also implied by the API is that once any of the threads returns, one or more of the following things could occur:


  • another call to pthread_barrier_wait() to start a new round of synchronization of threads
  • pthread_barrier_destroy() on the barrier object
  • the memory allocated for the barrier object could be freed or unshared if it's in a shared memory region.

These things mean that before the pthread_barrier_wait() call allows any thread to return, it pretty much needs to ensure that all waiting threads are no longer using the barrier object in the context of that call. My first answer addressed this by creating a 'local' set of synchronization objects (a mutex and an associated condition variable) outside of the barrier object that would block all the threads. These local synchronization objects were allocated on the stack of the thread that happened to call pthread_barrier_wait() first.

I think that something similar would need to be done for barriers that are process-shared. However, in that case simply allocating those sync objects on a thread's stack isn't adequate (since the other processes would have no access). For a process-shared barrier, those objects would have to be allocated in process-shared memory. I think the technique I listed above could be applied similarly:


  • the waitdata_mutex that controls the 'allocation' of the local sync variables (the waitdata block) would already be in process-shared memory by virtue of being in the barrier struct. Of course, when the barrier is set to PTHREAD_PROCESS_SHARED, that attribute would also need to be applied to the waitdata_mutex.
  • when __barrier_waitdata_init() is called to initialize the local mutex & condition variable, it would have to allocate those objects in shared memory instead of simply using the stack-based waitdata variable.
  • when the 'cleanup' thread destroys the mutex and the condition variable in the waitdata block, it would also need to clean up the process-shared memory allocation for the block.
  • in the case where shared memory is used, there needs to be some mechanism to ensure that the shared memory object is opened at least once in each process, and closed the correct number of times in each process (but not closed entirely before every thread in the process is finished using it). I haven't thought through exactly how that would be done...
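The attribute handling in the first two points might look roughly like the sketch below. It is an assumption-laden illustration: `struct shared_waitdata` and the three `shared_waitdata_*` functions are hypothetical names, and it uses an anonymous shared mapping (visible only to processes related by fork()) as a simpler stand-in for the named shared memory object the scheme would really need.

```c
#define _DEFAULT_SOURCE               /* for MAP_ANONYMOUS on glibc */
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical process-shared counterpart of the waitdata block. */
struct shared_waitdata {
    pthread_mutex_t cond_mutex;
    pthread_cond_t cond;
    unsigned waiter_count;
    int wait_complete;
};

/* Initializes *wd, which must already live in memory shared between the
   processes. Returns 0 on success or an error number. */
int shared_waitdata_init(struct shared_waitdata *wd)
{
    pthread_mutexattr_t mattr;
    pthread_condattr_t cattr;
    int rc;

    rc = pthread_mutexattr_init(&mattr);
    if (rc) return rc;
    rc = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
    if (!rc) rc = pthread_mutex_init(&wd->cond_mutex, &mattr);
    pthread_mutexattr_destroy(&mattr);
    if (rc) return rc;

    rc = pthread_condattr_init(&cattr);
    if (!rc) {
        rc = pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
        if (!rc) rc = pthread_cond_init(&wd->cond, &cattr);
        pthread_condattr_destroy(&cattr);
    }
    if (rc) {
        pthread_mutex_destroy(&wd->cond_mutex);
        return rc;
    }

    wd->waiter_count = 0;
    wd->wait_complete = 0;
    return 0;
}

/* Allocates and initializes a block in anonymous shared memory.
   Returns NULL on failure. */
struct shared_waitdata *shared_waitdata_alloc(void)
{
    void *p = mmap(NULL, sizeof(struct shared_waitdata),
                   PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;
    if (shared_waitdata_init(p) != 0) {
        munmap(p, sizeof(struct shared_waitdata));
        return NULL;
    }
    return p;
}

/* The 'cleanup' step: destroy the sync objects, then release the memory. */
void shared_waitdata_free(struct shared_waitdata *wd)
{
    pthread_cond_destroy(&wd->cond);
    pthread_mutex_destroy(&wd->cond_mutex);
    munmap(wd, sizeof(struct shared_waitdata));
}
```

The open/close bookkeeping from the last point is not addressed here; with a named object each process would additionally have to map and unmap the block itself.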

I think these changes would allow the scheme to operate with process-shared barriers. The last bullet point above is a key item to figure out. Another is how to construct a name for the shared memory object that will hold the 'local' process-shared waitdata. There are certain attributes you'd want for that name:


  • you'd want the storage for the name to reside in the struct pthread_barrier_t structure so all processes have access to it; that means a known limit to the length of the name
  • you'd want the name to be unique to each 'instance' of a set of calls to pthread_barrier_wait() because it might be possible for a second round of waiting to start before all threads have gotten all the way out of the first round of waiting (so the process-shared memory block set up for the waitdata might not have been freed yet). So the name probably has to be based on things like process id, thread id, address of the barrier object, and an atomic counter.
  • I don't know whether or not there are security implications to having the name be 'guessable'. If so, some randomization needs to be added - no idea how much. Maybe you'd also need to hash the data mentioned above along with the random bits. Like I said, I really have no idea if this is important or not.
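One way the first two naming requirements could be met is sketched below. Everything here is hypothetical: the function name, the `WAITDATA_NAME_MAX` limit and the name layout are made up for illustration, the cast of `pthread_self()` to an integer is an assumption that holds on Linux/glibc but is not guaranteed by POSIX, and no randomization or hashing is attempted.

```c
#include <inttypes.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical fixed size of the name buffer stored in the barrier. */
#define WAITDATA_NAME_MAX 80

/* Per-process counter; combined with the pid it distinguishes rounds. */
static atomic_uint_fast64_t waitdata_round_counter;

/* Writes "/barrier-<pid>-<tid>-<addr>-<counter>" into buf.
   Returns the length of the name, or -1 if it did not fit. */
int make_waitdata_name(char *buf, size_t buflen, const void *barrier_addr)
{
    uint64_t round = atomic_fetch_add(&waitdata_round_counter, 1);
    /* pthread_t is opaque; treating it as an integer is an assumption */
    uintptr_t tid = (uintptr_t)pthread_self();
    int n = snprintf(buf, buflen,
                     "/barrier-%ld-%" PRIxPTR "-%" PRIxPTR "-%" PRIu64,
                     (long)getpid(), tid, (uintptr_t)barrier_addr, round);

    if (n < 0 || (size_t)n >= buflen) return -1;
    return n;
}
```

With 64-bit values the longest name this layout produces stays below the assumed 80-byte limit, so the buffer could be a fixed-size member of the barrier structure.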
