锁定自由队列 - 单个生产者,多个消费者 [英] Lock Free Queue -- Single Producer, Multiple Consumers

查看:184
本文介绍了锁定自由队列 - 单个生产者,多个消费者的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种方法来实现支持单个生产者和多个消费者的无锁队列数据结构。我看过Maged Michael和Michael Scott(1996)的经典方法,但是他们的版本使用链表。我想要一个实现,使用有界循环缓冲区。使用原子变量的东西?



在旁注中,我不知道为什么这些经典方法是为需要大量动态内存管理的链表设计的。在多线程程序中,所有存储器管理例程都被序列化。我们不是通过将它们与动态数据结构结合使用来击败无锁方法的好处吗?



我想使用pthread库在C / C ++中编码

解决方案



使用循环缓冲区会使锁定变得必要,因为需要阻止以防止磁头通过尾部。但是否则头和尾指针可以容易地被原子地更新。或者在某些情况下,缓冲区可能很大,覆盖不是问题。 (在现实生活中,你会在自动交易系统中看到这种情况,循环缓冲区大小可以容纳X分钟的市场数据。如果你在X分钟后,你的问题比重写缓冲区要糟糕得多。



当我在C ++中实现MS队列时,我使用堆栈构建了一个无锁分配器,这很容易实现。如果我有MSQueue,那么在编译时我知道sizeof(MSQueue :: node)。然后我创建一个所需大小的N个缓冲区的堆栈。 N可以增长,即如果pop()返回null,很容易去请求堆更多的块,并且这些被推送到堆栈。在可能阻塞的调用之外的更多内存,这是一个无锁操作。



请注意,T不能有一个非平凡的dtor。我工作的版本,确实允许非平凡的dtors,实际上工作。但是我发现,只要使T指向我想要的T,生产者释放所有权,消费者获得所有权就更容易了。这当然要求T本身使用lockfree方法分配,但是我使用堆栈做的相同的分配器也在这里工作。



在任何情况下,锁点 - 免费编程不是数据结构本身更慢。关键是:


  1. lock free让我独立于调度器。基于锁的编程取决于调度程序,以确保锁的持有者正在运行,以便他们可以释放锁。这是什么原因导致优先级倒置在Linux上有一些锁属性,以确保这发生

  2. 如果我独立于调度程序,操作系统有更容易的时间管理时间片,我非常少上下文切换

  3. 它更容易编写正确的多线程程序使用lockfree方法,因为我不必担心死锁,活锁,调度,同步化等这是尤其是真正的共享内存实现,其中一个进程可能死在共享内存中持有锁,并且没有办法释放锁

  4. 无锁方法更容易扩展。事实上,我已经通过网络使用消息传递实现无锁方法。这样的分布式锁是一个噩梦

也就是说,有很多情况下基于锁的方法是首选和/当更新昂贵或不可能复制的内容时,p>


  1. 。大多数免锁定方法使用某种版本,即制作对象的副本,更新它,并检查共享版本是否仍然与复制它时相同,然后使当前版本更新版本。 Els再次复制它,应用更新,然后再次检查。继续这样做,直到它工作。这是很好的,当对象很小,但它们很大,或包含文件句柄等,然后不推荐

  2. 大多数类型是不可能以无锁的方式访问,例如。任何STL容器。这些具有需要非原子访问的不变量,例如assert(vector.size()== vector.end() - vector.begin())。因此,如果您要更新/读取共享的向量,您必须将其锁定。


I am looking for a method to implement lock-free queue data structure that supports single producer, and multiple consumers. I have looked at the classic method by Maged Michael and Michael Scott (1996) but their version uses linked lists. I would like an implementation that makes use of bounded circular buffer. Something that uses atomic variables?

On a side note, I am not sure why these classic methods are designed for linked lists that require a lot of dynamic memory management. In a multi-threaded program, all memory management routines are serialized. Aren't we defeating the benefits of lock-free methods by using them in conjunction with dynamic data structures?

I am trying to code this in C/C++ using pthread library on a Intel 64-bit architecture.

Thank you, Shirish

解决方案

The use of a circular buffer makes a lock necessary, since blocking is needed to prevent the head from going past the tail. But otherwise the head and tail pointers can easily be updated atomically. Or in some cases the buffer can be so large that overwriting is not an issue. (in real life you will see this in automated trading systems, with circular buffers sized to hold X minutes of market data. If you are X minutes behind, you have wayyyy worse problems than overwriting your buffer).

When I implemented the MS queue in C++, I built a lock free allocator using a stack, which is very easy to implement. If I have MSQueue then at compile time I know sizeof(MSQueue::node). Then I make a stack of N buffers of the required size. The N can grow, i.e. if pop() returns null, it is easy to go ask the heap for more blocks, and these are pushed onto the stack. Outside of the possibly blocking call for more memory, this is a lock free operation.

Note that the T cannot have a non-trivial dtor. I worked on a version that did allow for non-trivial dtors, that actually worked. But I found that it was easier just to make the T a pointer to the T that I wanted, where the producer released ownership, and the consumer acquired ownership. This of course requires that the T itself is allocated using lockfree methods, but the same allocator I made with the stack works here as well.

In any case the point of lock-free programming is not that the data structures themselves are slower. The points are this:

  1. lock free makes me independent of the scheduler. Lock-based programming depends on the scheduler to make sure that the holders of a lock are running so that they can release the lock. This is what causes "priority inversion" On Linux there are some lock attributes to make sure this happens
  2. If I am independent of the scheduler, the OS has a far easier time managing timeslices, and I get far less context switching
  3. it is easier to write correct multithreaded programs using lockfree methods since I dont have to worry about deadlock , livelock, scheduling, syncronization, etc This is espcially true with shared memory implementations, where a process could die while holding a lock in shared memory, and there is no way to release the lock
  4. lock free methods are far easier to scale. In fact, I have implemented lock free methods using messaging over a network. Distributed locks like this are a nightmare

That said, there are many cases where lock-based methods are preferable and/or required

  1. when updating things that are expensive or impossible to copy. Most lock free methods use some sort of versioning, i.e. make a copy of the object, update it, and check if the shared version is still the same as when you copied it, then make the current version you update version. Els ecopy it again, apply the update, and check again. Keep doing this until it works. This is fine when the objects are small, but it they are large, or contain file handles, etc then not recommended
  2. Most types are impossible to access in a lock free way, e.g. any STL container. These have invariants that require non atomic access , for example assert(vector.size()==vector.end()-vector.begin()). So if you are updating/reading a vector that is shared, you have to lock it.

这篇关于锁定自由队列 - 单个生产者,多个消费者的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆