无锁队列算法,重复读取保持一致性 [英] Lock-free queue algorithm, repeated reads for consistency

查看:52
本文介绍了无锁队列算法,重复读取保持一致性的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在研究 无锁(en-,de-)迈克尔和斯科特的排队算法.问题是我无法解释/理解(除了代码本身的注释之外,论文也无法解释)几行.

I'm studying the lock-free (en-,de-)queue algorithms of Michael and Scott. The problem is I can't explain/understand (nor the paper does, apart from the comments in the code itself) a couple of lines.

入队:

  enqueue(Q: pointer to queue_t, value: data type)
   E1:   node = new_node()        // Allocate a new node from the free list
   E2:   node->value = value      // Copy enqueued value into node
   E3:   node->next.ptr = NULL    // Set next pointer of node to NULL
   E4:   loop                     // Keep trying until Enqueue is done
   E5:      tail = Q->Tail        // Read Tail.ptr and Tail.count together
   E6:      next = tail.ptr->next // Read next ptr and count fields together
   E7:      if tail == Q->Tail    // Are tail and next consistent?
               // Was Tail pointing to the last node?
   E8:         if next.ptr == NULL
                  // Try to link node at the end of the linked list
   E9:            if CAS(&tail.ptr->next, next, <node, next.count+1>)
  E10:               break        // Enqueue is done.  Exit loop
  E11:            endif
  E12:         else               // Tail was not pointing to the last node
                  // Try to swing Tail to the next node
  E13:            CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)
  E14:         endif
  E15:      endif
  E16:   endloop
         // Enqueue is done.  Try to swing Tail to the inserted node
  E17:   CAS(&Q->Tail, tail, <node, tail.count+1>)

为什么需要 E7?正确性取决于它吗?还是仅仅是一种优化?如果另一个线程成功执行 E17 或下面的 D10(并更改 Q->Tail),而第一个线程已执行 E5 但尚未执行 E7,则此 if 可能会失败.但是如果在第一个线程执行 E7 之后立即执行 E17 呢?

Why is E7 needed? Does correctness depend on it? Or is it merely an optimization? This if can fail if another thread successfully executed E17, or D10 below, (and changed Q->Tail) while the first thread has executed E5 but not yet E7. But what if E17 is executed right after the first thread executes E7?

edit:这最后一句话是否证明 E7 只能是一种优化?我的直觉是它确实,因为我给出的场景是显然"if 语句会做出错误的决定,但算法仍然应该正常工作.但是我们可以用随机条件替换 if 的条件,而不影响正确性.这个论点有什么漏洞吗?

edit: Does this last sentence prove that E7 cannot be more than an optimization? My intuition is that it does, since I give a scenario were "apparently" the if statement would make the wrong decision, yet the algorithm would still be supposed to work correctly. But then we could replace the if's condition with a random condition, without affecting correctness. Any hole in this argument?

出队:

dequeue(Q: pointer to queue_t, pvalue: pointer to data type): boolean
   D1:   loop                          // Keep trying until Dequeue is done
   D2:      head = Q->Head             // Read Head
   D3:      tail = Q->Tail             // Read Tail
   D4:      next = head.ptr->next      // Read Head.ptr->next
   D5:      if head == Q->Head         // Are head, tail, and next consistent?
   D6:         if head.ptr == tail.ptr // Is queue empty or Tail falling behind?
   D7:            if next.ptr == NULL  // Is queue empty?
   D8:               return FALSE      // Queue is empty, couldn't dequeue
   D9:            endif
                  // Tail is falling behind.  Try to advance it
  D10:            CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)
  D11:         else                    // No need to deal with Tail
                  // Read value before CAS
                  // Otherwise, another dequeue might free the next node
  D12:            *pvalue = next.ptr->value
                  // Try to swing Head to the next node
  D13:            if CAS(&Q->Head, head, <next.ptr, head.count+1>)
  D14:               break             // Dequeue is done.  Exit loop
  D15:            endif
  D16:         endif
  D17:      endif
  D18:   endloop
  D19:   free(head.ptr)                // It is safe now to free the old node
  D20:   return TRUE                   // Queue was not empty, dequeue succeeded

再说一次,为什么需要 D5?正确性还是优化?我不确定这些测试给出的一致性"是什么,因为在 if 成功之后,它们似乎会立即变得不一致.

Again, why D5 is needed? Correctness or optimization? I'm not sure what "consistency" these tests give, since it seems they can get inconsistent right after the if succeeds.

这看起来像是一种标准技术.有人能解释一下背后的动机吗?对我来说,似乎目的是避免在少数情况下执行(昂贵的)CAS,可以注意到它肯定会失败,但代价是总是进行额外的读取,这不应该太多本身更便宜(例如,在 Java 中,Q->Tail 将被要求是 volatile,所以我们知道我们不仅仅是读取缓存在寄存器中的副本,而是读取真实的东西,这将是翻译为在阅读前加上某种栅栏),所以我不确定这里到底发生了什么......谢谢.

This looks like a standard technique. Can someone explain the motivation behind it? To me, it seems like the intention is to avoid doing an (expensive) CAS in those few cases it can be noticed that it would definitely fail, but at the cost of always doing an extra read, which is not supposed to be so much cheaper itself (e.g. in Java, Q->Tail would be required to be volatile, so we would know we are not merely reading a copy cached in a register but reading the real thing, which would be translated in prepending the read with a fence of some sort), so I'm not sure what's really going on here... thanks.

edit 这已被移植到 Java,更准确地说是在 ConcurrentLinkedQueue,例如E7 是第 194 行,而 D5 是第 212 行.

edit This has been ported to Java, more precisely in ConcurrentLinkedQueue, e.g. E7 is line 194, while D5 is line 212.

推荐答案

为什么需要 E7?

Why is E7 needed?

更多是为了优化.

考虑两个线程试图同时入队.它们都到达 E5,但在线程 1 到达 E7 之前,线程 2 成功排队.当线程 1 到达 E7 时,它将观察者 t == tail 为假然后重试.这将避免昂贵的 CAS.当然,这不是完全证明,因为 E7 可以在线程 2 入队之前成功并最终使 CAS 失败并且无论如何都必须重试.

Consider two threads trying to enqueue at the same time. They all get to E5 but before thread 1 gets to E7 thread 2 successfully queues. When thread 1 gets to E7 it will observer t == tail to be false then retries. This will avoid a costly CAS. Of course its not full proof because E7 can succeed before thread 2 enqueues and eventually fails the CAS and has to retry anyway.

为什么需要 D5

类似于 D5

同样,没有 E7 和 D5 的两个函数都可以工作.可能正在进行一些基准测试,并发现在适度争用的情况下,双重检查会增加吞吐量(这更多是一种观察,而不是事实).

Again, both functions without E7 and D5 would work. There was probably some benchmarking going on and found that under moderate contention the double check increases throughput (this is more of an observation and less of fact).

我去读了更多关于这个队列的论文.还检查无锁算法的正确性和较少的数据结构状态.

I went and read the paper on this queue a bit more. The check is also there for correctness of a lock free algorithm and less of the data structure's state.

无锁算法是非阻塞的,因为如果有尝试执行操作的非延迟进程队列中,一个操作保证在有限的时间.入队操作仅在条件符合E7 失败,E8 行中的条件失败,或者比较并且在 E9 行中交换失败.出队操作循环仅当 D5 行中的条件失败时,行中的条件D6 持有(且队列不为空),或者比较并且在 D13 行中交换失败.我们通过显示该算法是非阻塞的只有当一个过程循环超过有限次数时另一个进程完成对队列的操作.

The lock-free algorithm is non-blocking because if there are non-delayed processes attempting to perform operations on the queue, an operation is guaranteed to complete within finite time. An enqueue operation loops only if the condition in line E7 fails, the condition in line E8 fails, or the compare and swap in line E9 fails. A dequeue operation loops only if the condition in line D5 fails, the condition in line D6 holds (and the queue is not empty), or the compare and swap in line D13 fails. We show that the algorithm is non-blocking by showing that a process loops beyond a finite number of times only if another process completes an operation on the queue.

http://www.cs.rochester.edu/u/scott/papers/1996_PODC_queues.pdf

这篇关于无锁队列算法,重复读取保持一致性的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆