What distributed message queues support millions of queues?

Problem description

I'm looking for a distributed message queue that will support millions of queues, with each queue handling tens of messages per second.

The messages will be small (tens of bytes), and I don't expect the queues to get very long: on the order of tens of messages per queue at maximum. When the system is humming along, the queues should stay fairly empty.

I'm not sure how many nodes to expect in the cluster--probably depends on the specific solution, but if I had to guess, I would say ten nodes. I would prefer that queues were relatively resilient to individual node failures within the cluster, but a few lost messages here and there won't make me lose sleep.

Does such a message queue exist? Seems like most of the field is optimized toward handling hundreds of queues with high throughput. But what is SQS built on? Surely not magic.

Update:

By request, it may indeed help to shed light on my problem domain. (I'd left details out before so as not to muddy the waters.) I'm experimenting with distributed cellular automata, with an initial target of a million cells in simulation. In some CA models, it's useful to add an event model, so that a cell can send events to its neighbors. Hence, a million queues, each with one consumer and 8 or so producers.
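
To make the event model concrete, here is a minimal single-process sketch of "one queue per cell, written to by its neighbors". It is only an illustration: the grid size, the queue bound, and the names `inboxes`, `neighbors`, `send_to_neighbors` and `drain` are assumptions, and nothing in it is distributed yet.

```python
from collections import deque

SIZE = 4          # assumed small grid for illustration; the target is ~1000 x 1000
MAX_QUEUE = 32    # assumed bound: queues are only expected to hold tens of messages

# One bounded inbox per cell. A full deque silently drops its oldest entry,
# which matches the "a few lost messages are acceptable" requirement.
inboxes = {(r, c): deque(maxlen=MAX_QUEUE)
           for r in range(SIZE) for c in range(SIZE)}

def neighbors(r, c):
    """Return the (up to 8) surrounding cells that lie inside the grid."""
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < SIZE and 0 <= c + dc < SIZE]

def send_to_neighbors(r, c, event):
    """Producer side: a cell pushes one small event to every neighbor's inbox."""
    for n in neighbors(r, c):
        inboxes[n].append(((r, c), event))

def drain(r, c):
    """Consumer side: the owning cell takes everything currently in its inbox."""
    box = inboxes[(r, c)]
    events = list(box)
    box.clear()
    return events
```

Each inbox has exactly one consumer (the cell that owns it) and at most eight producers (its neighbors), which is the shape a distributed solution would have to reproduce across nodes.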

Costs are a concern for now, as I'm funding the experiments myself. (Thus Amazon's SQS is probably out of reach.)

Recommended answer

From your description, it looks like OMG's Data Distribution Service could be a good fit. It is related to message queueing technologies, but I would rather call it a distributed data management infrastructure. It is completely distributed and supports advanced features that give you a lot of control over how the data is distributed, by means of a rich set of Quality of Service settings.

Not knowing much about your problem, I can only guess at what an approach might be. DDS is about distributing the state of strongly typed data items, modeled as structures with typed attributes. You could create a data type describing the state of an automaton. One of its attributes could be an ID uniquely identifying the automaton in the system. If possible, IDs would be assigned according to a scheme such that every automaton knows the IDs of its neighbors (if they are present). Each automaton would publish its state as needed, resulting in a distributed data space containing the current state of all automata. DDS supports so-called partitioning of that data space. If you took advantage of that, each node in your system would be responsible for a well-defined subset of all automata. Communication over the wire would only happen for automata that neighbor a different partition. Since automata know the IDs of their neighbors, each one would be able to query the data space for the states of the automata it is interested in.
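
One way such an ID scheme and partitioning could look is sketched below. This is plain Python, not DDS API code: the row-major numbering, the 1000 x 1000 grid, the split into horizontal stripes, and all function names are assumptions made for illustration.

```python
from typing import List

WIDTH = 1000    # assumed grid width  (1000 x 1000 = one million automata)
HEIGHT = 1000   # assumed grid height
NODES = 10      # assumed number of nodes, one partition per node

def cell_id(row: int, col: int) -> int:
    """Row-major ID, so neighbor IDs can be computed without any lookup."""
    return row * WIDTH + col

def neighbor_ids(cid: int) -> List[int]:
    """IDs of the (up to 8) neighbors that exist inside the grid."""
    row, col = divmod(cid, WIDTH)
    ids = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            if 0 <= r < HEIGHT and 0 <= c < WIDTH:
                ids.append(cell_id(r, c))
    return ids

def partition_of(cid: int) -> int:
    """Assign each automaton to a horizontal stripe of rows (one per node)."""
    row = cid // WIDTH
    return row * NODES // HEIGHT

def remote_neighbors(cid: int) -> List[int]:
    """Neighbors owned by another partition; only these need network traffic."""
    own = partition_of(cid)
    return [n for n in neighbor_ids(cid) if partition_of(n) != own]
```

With this layout, `remote_neighbors` is empty for any automaton in the interior of a stripe, so its state never has to leave the node that owns it. Only automata in the first or last row of a stripe have neighbors in a different partition.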

It is a bit hard to explain without a whiteboard, but the end result would be a single instance (instances are a sort of very lightweight message queue) for most automata, and two or three instances for automata at the border of a partition. If you had ten nodes and one million automata, then each node would have to hold the administration for approximately one hundred thousand automata. I have seen systems built with DDS at that scale, and larger, with tens of updates per second for each instance. The nice thing is that this technology scales well with the number of nodes, so you could bring down the resource load per node by adding more nodes.
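
The per-node numbers can be checked with a quick back-of-the-envelope calculation. The sketch below assumes a square grid split into horizontal stripes and takes 10 updates per second as the low end of "tens"; the function name `estimate` and these parameters are illustrative assumptions, not measurements.

```python
def estimate(cells: int, nodes: int, updates_per_cell_per_sec: int, grid_side: int) -> dict:
    """Rough per-node load for a square grid (grid_side * grid_side == cells)
    split into `nodes` horizontal stripes."""
    cells_per_node = cells // nodes
    updates_per_node_per_sec = cells_per_node * updates_per_cell_per_sec
    # Each of the (nodes - 1) internal boundaries has one row of border
    # automata on each side; only these need to be visible on two nodes.
    border_cells = 2 * (nodes - 1) * grid_side
    return {
        "cells_per_node": cells_per_node,
        "updates_per_node_per_sec": updates_per_node_per_sec,
        "border_cells": border_cells,
        "border_fraction": border_cells / cells,
    }

result = estimate(cells=1_000_000, nodes=10,
                  updates_per_cell_per_sec=10, grid_side=1000)
```

Under these assumptions each node holds 100,000 automata and handles about 1,000,000 local updates per second, while only 18,000 automata (1.8% of the total) sit on a partition border and generate traffic over the wire.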

If this is a research project, then you might even be able to use a commercial product without charge. Just search for "dds research license".
