Worker pools and multi-tenant queues with RabbitMQ

Question

I work on a web application that is a multi-tenant, cloud-based application (lots of clients, each with their own separate "environment", but all on shared sets of hardware), and we're introducing the ability for a user to batch up work for later processing. The type of batched work isn't really important; it's just of sufficient quantity that doing it without a work queue isn't practical. We've selected RabbitMQ as our underlying queue framework.

Because we're a multi-tenant app, we don't want one client to be able to cause lengthy queue processing times for another client, so one idea we've floated is creating a queue per client and having a shared worker pool pointed across ALL our client queues. The problem is that, as best I can figure, workers are bound directly to a specific queue, not to an exchange. In our ideal world, the client queues would still be processed, without one client blocking another, from a shared worker pool that we can grow or shrink as necessary by launching more workers or shutting down idle ones. Having workers tied to a specific queue prevents us from doing this in a practical sense, as we'd frequently have lots of workers just idling on a queue with no activity.

Is there a relatively straightforward way to accomplish this? I'm fairly new to RabbitMQ and haven't really been able to accomplish what we're after. We also don't want to have to write a very complex multithreaded consumer application; that's a time sink in dev and test time that we likely can't afford. Our stack is Windows/.Net/C# based if that's germane, but I don't think that should have a major bearing on the question at hand.

Recommended answer

You could look at the priority queue implementation (which wasn't implemented when this question was originally asked): https://www.rabbitmq.com/priority.html

If that doesn't work for you, you could try some other hacks to achieve what you want (which should work with older versions of RabbitMQ):

You could have 100 queues bound to a topic exchange and set the routing key to a hash of the user ID modulo 100 (plus 1), i.e. each task will have a key between 1 and 100, and tasks for the same user will have the same key. Each queue is bound with a unique pattern between 1 and 100. Now you have a fleet of workers, each of which starts with a random queue number and then increments that queue number after each job, again modulo 100, so that it cycles back to queue 1 after queue 100.
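The key assignment and the queue cycling described above can be sketched in a few lines. This is only an illustration under stated assumptions: `routing_key` and `queue_cycle` are hypothetical helper names, CRC32 stands in for "a hash of the user ID" (the answer does not name a hash function), and no broker is involved.

```python
import zlib

NUM_QUEUES = 100  # assumption: the 100 queues suggested in the answer


def routing_key(user_id: str, num_queues: int = NUM_QUEUES) -> str:
    """Map a user ID to a stable routing key in "1".."num_queues"."""
    # CRC32 is stable across processes and machines, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    return str(zlib.crc32(user_id.encode("utf-8")) % num_queues + 1)


def queue_cycle(start: int, num_queues: int = NUM_QUEUES):
    """Yield queue numbers from `start`, wrapping from num_queues back to 1."""
    current = start
    while True:
        yield current
        current = current % num_queues + 1
```

A publisher would use `routing_key(user_id)` as the routing key when publishing, and each worker would draw its next queue number from `queue_cycle(random_start)` after finishing a job.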

Now your worker fleet can process up to 100 unique users in parallel, or all the workers can focus on a single user if there is no other work to do. If the workers need to cycle through all 100 queues between each job, then in the scenario where only a single user has a lot of jobs on a single queue, you're naturally going to have some overhead between each job. A smaller number of queues is one way to deal with this. You could also have each worker hold a connection to each of the queues and consume up to one unacknowledged message from each. The worker can then cycle through the pending messages in memory much faster, provided the unacknowledged message timeout is set sufficiently high.
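To see why cycling keeps one busy user from starving another, here is a small in-memory simulation of a single worker taking at most one job per queue per pass. It is a sketch, not RabbitMQ client code: the deques stand in for broker queues, and `drain_round_robin` is a hypothetical name.

```python
from collections import deque


def drain_round_robin(queues, start=0):
    """Take at most one job from each queue per pass until all are empty.

    Returns the jobs in the order a single worker would process them.
    """
    order = []
    n = len(queues)
    i = start % n
    idle = 0  # consecutive empty queues seen; n in a row means all are empty
    while idle < n:
        if queues[i]:
            order.append(queues[i].popleft())
            idle = 0
        else:
            idle += 1
        i = (i + 1) % n
    return order
```

With user A holding three jobs on one queue and user B holding one job on another, B's job is handled right after A's first job rather than after all of A's. The empty queues visited along the way are the per-job overhead mentioned above.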

Alternatively, you could create two exchanges, each with a bound queue. All work goes to the first exchange and queue, which a pool of workers consumes. If a unit of work takes too long, the worker can cancel it and push it to the second queue. Workers only process the second queue when there's nothing on the first queue. You might also want a couple of workers with the opposite queue prioritization to make sure long-running tasks are still processed when there's a never-ending stream of short tasks arriving, so that a user's batch will always be processed eventually. This won't truly distribute your worker fleet across all tasks, but it will stop long-running tasks from one user holding up your workers from executing short-running tasks for that same user or another. It also assumes you can cancel a job and re-run it later without any problems. It also means there will be wasted resources from tasks that time out and need to be re-run as low priority, unless you can identify fast and slow tasks in advance.
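The two-queue scheme can be simulated in memory as follows. This is a sketch under stated assumptions: `run_worker` is a hypothetical name, the `cost` mapping and `budget` stand in for real execution time and the timeout, and cancellation is modeled by simply moving the job to the slow queue.

```python
from collections import deque


def run_worker(fast, slow, cost, budget):
    """Drain both queues, preferring `fast`; return jobs in completion order.

    A job taken from `fast` whose cost exceeds `budget` is treated as timed
    out: it is cancelled and re-queued on `slow` instead of being completed.
    Jobs taken from `slow` always run to completion.
    """
    completed = []
    while fast or slow:
        if fast:
            job = fast.popleft()
            if cost[job] > budget:
                # Simulated timeout: demote the job to the low-priority queue.
                slow.append(job)
                continue
        else:
            # The slow queue is only consulted when the fast queue is empty.
            job = slow.popleft()
        completed.append(job)
    return completed
```

In this model a long job sitting between two short ones is completed last. A worker with the opposite prioritization, as the answer suggests, would check `slow` first, so that long jobs are not starved by a steady stream of short ones.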

The first suggestion, with the 100 queues, could also have a problem if a single user has 100 slow tasks in progress and another user then posts a batch of tasks. Those tasks won't get looked at until one of the slow tasks has finished. If this turns out to be a legitimate problem, you could potentially combine the two solutions.
