如何prevent腐败与原子比较和交换实现并发LIFO栈 [英] how to prevent corruption in concurrent lifo stack implemented with atomic compare and swap

查看:104
本文介绍了如何prevent腐败与原子比较和交换实现并发LIFO栈的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

下面是一个简单的C程序演示,我用建于GNU实现并发堆栈有一个问题比较和交换上的Intel CPU。我花了一段时间来了解发生了什么事,但现在,我做我看到,它是深受提供的担保范围内的原子比较和交换。

Below is a simplified C program that demonstrates a problem I am having with a concurrent stack implemented using the GNU built in compare and swap on an intel cpu. It took me a while to understand what was happening, but now that I do I see that it is well within the guarantees provided by atomic compare and swap.

当一个节点被从堆栈,改性弹出,然后放置回栈上,修改的值可能在栈的新头,破坏它。在test_get的评论描述导致此事件发生的顺序。

When a node is popped from the stack, modified, then placed back on the stack, the modified value may become the new head of the stack, corrupting it. The comments in test_get describe the order of events that cause this.

有什么办法能够可靠地使用具有相同的堆栈在同一个节点不止一次?这是一个夸张的例子,但即使返回未修改节点gHead可能泄露其他节点。该数据结构的原始点是重复使用相同的节点。

Is there any way to reliably use the same node with the same stack more than once? This is an exaggerated example but even returning an unmodified node to gHead could leak other nodes. The original point of this data structure was to repeatedly use the same nodes.

typedef struct test_node {
    struct test_node *next;
    void *data;
} *test_node_p;

test_node_p gHead = NULL;
unsigned gThreadsDone = 0;

void test_put( test_node_p inMemory ) {
    test_node_p scratch;

    do {
        scratch = gHead;
        inMemory->next = scratch;
    } while ( !__sync_bool_compare_and_swap( &gHead , scratch , inMemory ) );
}

test_node_p test_get( void ) {
    test_node_p result;
    test_node_p scratch;

    do {
        result = gHead;
        if ( NULL == result ) break;
        //  other thread acquires result, modifies next
        scratch = result->next;     //  scratch is 0xDEFACED...
        //  other thread returns result to gHead
    } while ( !__sync_bool_compare_and_swap( &gHead , result , scratch ) );
    //  this thread corrupts gHead with 0xDEFACED... value

    if ( NULL == result ) {
        result = (test_node_p)malloc( sizeof(struct test_node) );
    }

    return result;
}

void *memory_entry( void *in ) {
    test_node_p node;
    int index , count = 1000;
    for ( index = 0 ; index < count ; ++index ) {
        node = test_get();
        *(uint64_t *)(node) |= 0xDEFACED000000000ULL;
        test_put( node );
    }

    __sync_add_and_fetch(&gThreadsDone,1);

    return NULL;
}

void main() {
    unsigned    index , count = 8;
    pthread_t   thread;

    gThreadsDone = 0;

    for ( index = 0 ; index < count ; ++index ) {
        pthread_create( &thread , NULL , memory_entry , NULL );
        pthread_detach( thread );
    }

    while ( __sync_add_and_fetch(&gThreadsDone,0) < count ) {}
}

我运行这个测试有8个逻辑核心。当我只用4个线程的问题很少出现,但有8个是容易复制。我有一个英特尔酷睿i7一台MacBook。

I am running this test with 8 logical cores. When I use only 4 threads the problem rarely occurs but with 8 it is easily reproducible. I have a MacBook with an Intel Core i7.

我不感兴趣,使用一些库或框架,已经解决了这个问题,我想知道它是如何解决的。谢谢你。

I am not interested in using some library or framework that has solved this, I want to know how it was solved. Thank you.

编辑:

下面是两个解决方案,不以我的情况下工作。

Here are two solutions that do not work in my case.

在某些平台提供的地址,而不是价值进行核试验LL / SC指令对。任何写地址,甚至相同的值,prevents成功

Some architectures provide ll/sc instruction pairs that perform atomic tests on the address instead of the value. Any write to the address, even of the same value, prevents success.

一些体系结构提供了比指针的大小比较和交换。由此,一个单调计数器被成对与被原子地递增每次使用这样的值总是变化时间的指针。某些Intel芯片都支持这一点,但没有GNU包装。

Some architectures provide larger than pointer size compare and swap. With this, a monotonic counter is paired with the pointer which is atomically incremented every time it is used so the value always changes. Some intel chips support this but there is no GNU wrapper.

下面是这个问题的玩一玩。两个线程,A和B,达到 test_get 的地步,他们有结果相同的值,这是不是 NULL 。然后发生下列序列:

Here is a play by play of the problem. Two threads, A and B, reach the point in test_get where they have the same value for result and it is not NULL. Then the following sequence occurs:


  1. 线程A传递比较和交换,并返回结果 test_get

  2. 线程A修改的结果内容

  3. 线程B指针引用结果,得到的任何线程A放在那里

  4. 线程A与结果键,通话 test_put
  5. 完成
  6. 线程A通过比较,并在 test_put 交换把结果返回的 gHead

  7. 线程B到达比较,并在 test_get 交换和传递

  8. 线程B已损坏 gHead 从线程A值

  1. Thread A passes the compare and swap and returns result from test_get
  2. Thread A modifies the content of result
  3. Thread B dereferences result, getting whatever thread A put there
  4. Thread A finishes with result and calls test_put
  5. Thread A passes the compare and swap in test_put putting result back on gHead
  6. Thread B reaches the compare and swap in test_get and passes
  7. Thread B has now corrupted gHead with the value from Thread A

下面是三个线程类似的场景,不需要线程A修改结果

Here is a similar scenario with three threads that does not require Thread A to modify result.


  1. 线程A传递比较和交换,并返回结果 test_get

  2. 发A没有修改结果内容

  3. 线程B指针引用结果从零开始得到一个有效的值

  4. 主题的C调用 test_put 与一个不相关的价值和成功

  5. 线程A调用 test_put 并把结果成功 gHead

  6. 线程B到达比较,并在 test_get 交换和传递

  7. 线程B已损坏 gHead 跑冒滴漏任何线程下加入

  1. Thread A passes the compare and swap and returns result from test_get
  2. Thread A does not modify the content of result
  3. Thread B dereferences result, getting a valid value in scratch
  4. Thread C calls test_put with an unrelated value and succeeds
  5. Thread A calls test_put and succeeds in putting result back on gHead
  6. Thread B reaches the compare and swap in test_get and passes
  7. Thread B has now corrupted gHead by leaking whatever thread C added

在这两种情况下,问题是,线程A传递比较和交换两次,一次用于放然后再拿到之前,线程B到达比较和交换的GET。任何值线程B有划痕应该被丢弃,而不是因为gHead值显示正确。

In either scenario, the problem is that thread A passes the compare and swap twice, once for get then again for put, before thread B reaches the compare and swap for get. Any value thread B has for scratch should be discarded, but is not because the value in gHead appears correct.

任何解决办法,使这个不太可能,而无需实际preventing它只是使这个bug很难追查。

Any solution that makes this less likely without actually preventing it just makes the bug harder to track down.

注意划伤变量只是地方原子指令开始之前被放置结果的提领值的名称。除去名称并除去解除引用之间的时间分片和比较可能会中断。

Note that the scratch variable is just a name for wherever the dereferenced value of result is placed before the atomic instruction begins. Removing the name does remove the time slice between dereference and compare that may be interrupted.

还要注意的是原子意味着它成功或失败作为一个整体。指针对准任意读取是有问题的硬件隐含原子。至于其他线程而言,存在这样的情况,只有一半的指针已被读出没有中断的点

Note also that atomic means it succeeds or fails as a whole. Any aligned read of a pointer is implicitly atomic on the hardware in question. As far as other threads are concerned, there is no interruptible point where only half the pointer has been read.

推荐答案

(我放弃我的previous答案。)

(I am discarding my previous answer.)

问题是,你不必原子读取机制 gHead gHead-&gt;接下来,但这样需要实现自己的锁免费栈。既然你打算在繁忙的循环反正对付比较和交换碰撞,您可以使用自旋锁等价的:

The issue is that you do not have a mechanism to atomically read gHead and gHead->next, but such is needed to achieve your lock free stack. Since you intend to busy loop anyway to deal with compare and swap collisions, you can use the equivalent of a spin lock:

void lock_get () {
    while (!_sync_bool_compare_and_swap(&gGetLock, 0, 1)) {}
}

void unlock_get () {
    unlock_get_success = _sync_bool_compare_and_swap(&gGetLock, 1, 0);
    assert(unlock_get_success);
}

现在,在 test_get()可以通过 lock_get() unlock_get()。 test_get的CAS循环()只是一个线程与百家争鸣test_put()。延执行CAS循环似乎更清洁,但。

Now, the loop in test_get() can be surrounded by lock_get() and unlock_get(). The CAS loop of test_get() is just one thread contending with test_put(). Jens' implementation of the CAS loop seems cleaner though.

lock_get();
result = __sync_val_compare_and_swap(&gHead, 0, 0);
while ((oldval = result)) {
   result = __sync_val_compare_and_swap(&gHead, result, result->next);
   if (oldval == result) break;
}
unlock_get();

此实现的意图,这是只有一个线程应该突然离开头

This implements the intention, which is that only one thread should be popping off the head.

这篇关于如何prevent腐败与原子比较和交换实现并发LIFO栈的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆