Python: Kombu + RabbitMQ Deadlock - queues are either blocked or blocking

The problem

I have a RabbitMQ Server that serves as a queue hub for one of my systems. In the last week or so, its producers come to a complete halt every few hours.

What have I tried

Brute force

  • Stopping the consumers releases the lock for a few minutes, but then blocking returns.
  • Restarting RabbitMQ solved the problem for a few hours.
  • I have an automatic script that does the ugly restarts (a sketch is shown below), but it's obviously far from a proper solution.
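
For completeness, the restart job is nothing fancier than a scheduled broker restart. A minimal sketch of what such a job could look like (the path and schedule are placeholders, not the actual script):

# /etc/cron.d/rabbitmq-restart (hypothetical) - bounce the broker every few hours
0 */4 * * * root service rabbitmq-server restart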

Allocating more memory

Following cantSleepNow's answer, I have increased the memory allocated to RabbitMQ to 90%. The server has a whopping 16GB of memory and the message count is not very high (millions per day), so that does not seem to be the problem.

From the command line:

sudo rabbitmqctl set_vm_memory_high_watermark 0.9

And with /etc/rabbitmq/rabbitmq.config:

[
   {rabbit,
   [
     {loopback_users, []},
     {vm_memory_high_watermark, 0.9}
   ]
   }
].
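
To confirm the new watermark actually took effect (assuming the 3.6.x rabbitmqctl status output, which lists both the watermark and the computed byte limit), the broker status can be inspected:

sudo rabbitmqctl status | grep vm_memory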

Code & Design

I use Python for all consumers and producers.

Producers

The producers are API servers that serve calls. Whenever a call arrives, a connection is opened, a message is sent, and the connection is closed.

from kombu import Connection

def send_message_to_queue(host, port, queue_name, message):
    """Sends a single message to the queue."""
    with Connection('amqp://guest:guest@%s:%s//' % (host, port)) as conn:
        simple_queue = conn.SimpleQueue(name=queue_name, no_ack=True)
        simple_queue.put(message)
        simple_queue.close()
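
For illustration, this is roughly how an API handler invokes it; the host, port, queue name and payload below are placeholders rather than the real deployment values:

# Hypothetical call site - one short-lived connection per API request
send_message_to_queue('rabbit.internal', 5672, 'events', {'event': 'api_call', 'id': 42})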

Consumers

The consumers differ slightly from each other, but generally use the following pattern: open a connection and wait on it until a message arrives. The connection can stay open for long periods of time (say, days).

from kombu import Connection

with Connection('amqp://whatever:whatever@whatever:whatever//') as conn:
    while True:
        # Block until a message arrives, then acknowledge it.
        queue = conn.SimpleQueue(queue_name)
        message = queue.get(block=True)
        message.ack()

Design reasoning

  • Consumers always need to keep an open connection with the queue server
  • The Producer session should only live during the lifespan of the API call

This design caused no problems until about a week ago.

Web view dashboard

The web console shows that the consumers at 127.0.0.1 and 172.31.38.50 are blocking the consumers from 172.31.38.50, 172.31.39.120, 172.31.41.38 and 172.31.41.38.

System metrics

Just to be on the safe side, I checked the server load. As expected, the load average and CPU utilization metrics are low.

Why does RabbitMQ reach such a deadlock?

Solution

This is most likely caused by a memory leak in the management module for RabbitMQ 3.6.2. This has now been fixed in RabbitMQ 3.6.3, and is available here.

The issue itself is described here, but is also discussed extensively on the RabbitMQ message boards; for example here and here. It has also been known to cause a lot of weird issues; a good example is the problem reported here.

As a temporary fix until the new version is released, you can either upgrade to the newest build, downgrade to 3.6.1, or completely disable the management module.
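
Disabling the management module, should you go that route, is a single command with the standard plugins tool:

sudo rabbitmq-plugins disable rabbitmq_management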
