blpop stops processing queue after a while

Question

At my organization, we have a number of Redis workers in place for our critical tasks. Usually, once or twice a day, our workers stop processing the queues.

The code basically looks like this:

while ($item = $redis->blpop(array('someQueue', 'anotherQueue'), 3600)) {
    someFunction();
}

As you can see, not much is happening in the code, but every once in a while the queue starts building up and the worker doesn't pop any items from it. Setting a timeout for blpop does not help, because we presume that the problem is with the Redis client connection.

At the moment, we have set up a few listeners that alert us when the queue builds up, and we then restart the workers, but the problem still persists. We could also set a timeout for our Redis client, but that is not an ideal solution either.

  • Has anyone else experienced this?
  • What could be the problem?
  • Are we doing something wrong?

Our question is similar to "Error in implementing message queue using redis, error in using BLPOP", but we do not get any errors. The worker just stops abruptly.

Information

Redis Server: 2.8.2

PHP Redis: phpredis

The workers that have been running for a long time are the ones that have stopped processing the queue. After running CLIENT LIST, we noticed that these workers have a high idle time compared to the rest and that their flag is set to N instead of b. What might be the reason behind this?

The problem was with someFunction(). A piece of code in it prevented the function from returning control, so the client sat idle for a long time and never issued the next blpop. That is why CLIENT LIST showed the 'N' flag.
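Since the cause turned out to be a job that never returned, one possible safeguard is to put a deadline around each job so that a single stuck call cannot stall the worker. The following is only a minimal sketch, assuming the pcntl extension is available (PHP CLI on a Unix-like system). runWithDeadline is a hypothetical helper, not part of phpredis or of the original code, and a job blocked inside a C extension call may not be interrupted by it.

```php
<?php
// Hypothetical helper (not from the original post): run $job and abort it
// with a RuntimeException if it takes longer than $seconds.
// Assumes the pcntl extension is loaded.
function runWithDeadline(callable $job, int $seconds)
{
    // Deliver signals asynchronously, without declare(ticks=1).
    pcntl_async_signals(true);

    // When the alarm fires, throw from the handler so the exception
    // propagates out of whatever PHP code $job is currently running.
    pcntl_signal(SIGALRM, function () use ($seconds) {
        throw new RuntimeException("job exceeded {$seconds}s deadline");
    });

    pcntl_alarm($seconds);
    try {
        return $job();
    } finally {
        pcntl_alarm(0);                 // cancel any pending alarm
        pcntl_signal(SIGALRM, SIG_DFL); // restore default handling
    }
}
```

In the worker loop, the call to someFunction() could then be wrapped as runWithDeadline('someFunction', 300) inside a try/catch, with the 300 second limit chosen purely for illustration. The worker logs the failure and goes back to blpop instead of hanging.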

Recommended answer

I suggest verifying whether there is an issue and, if you find something on the server side, reporting it to the Redis project as an issue. However, the following steps will help you fix the problem even if it lies in some other part of your stack (which is likely, since there are no known problems similar to the one above).

Steps to check what is happening:

  1. Wait for one client to stop.
  2. Verify with the LLEN command that there are actually elements in the list.
  3. Check with CLIENT LIST that your client is actually listed and is executing a blocking pop (you'll see the command name), and check the size of the reply to see whether it is your client that is not actually consuming the replies it gets.
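
The CLIENT LIST check in step 3 can be automated. The sketch below parses raw CLIENT LIST text (one client per line, space-separated key=value pairs) and reports clients whose last command was blpop but that are no longer blocked and have been idle for a long time, which is the pattern described in the question. The function names, the sample addresses, and the 600 second threshold are assumptions made for illustration; real output contains more fields than the sample shows.

```php
<?php
// Parse raw CLIENT LIST text into an array of associative arrays,
// one per client line.
function parseClientList(string $raw): array
{
    $clients = [];
    foreach (preg_split('/\r?\n/', trim($raw)) as $line) {
        if ($line === '') {
            continue;
        }
        $client = [];
        foreach (explode(' ', $line) as $pair) {
            $parts = explode('=', $pair, 2);
            if (count($parts) === 2) {
                $client[$parts[0]] = $parts[1];
            }
        }
        $clients[] = $client;
    }
    return $clients;
}

// Return clients whose last command was BLPOP, that do not carry the
// "b" (blocked) flag, and that have been idle for more than $maxIdle seconds.
function findStalledWorkers(array $clients, int $maxIdle): array
{
    return array_values(array_filter($clients, function (array $c) use ($maxIdle) {
        return ($c['cmd'] ?? '') === 'blpop'
            && strpos($c['flags'] ?? '', 'b') === false
            && (int) ($c['idle'] ?? 0) > $maxIdle;
    }));
}

// Hypothetical sample: one stalled worker (flags=N) and one healthy,
// blocked worker (flags=b).
$raw = "addr=10.0.0.5:40112 fd=7 name= age=90211 idle=5400 flags=N db=0 cmd=blpop\n"
     . "addr=10.0.0.6:40118 fd=8 name= age=1200 idle=35 flags=b db=0 cmd=blpop\n";

foreach (findStalledWorkers(parseClientList($raw), 600) as $c) {
    echo $c['addr'], ' idle=', $c['idle'], 's flags=', $c['flags'], "\n";
}
// prints: 10.0.0.5:40112 idle=5400s flags=N
```

A client that shows up in this report while LLEN is non-zero is most likely busy (or stuck) in application code rather than waiting on Redis.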

Random remarks:

  1. Redis 2.8.2 is too old; upgrading is recommended.
  2. phpredis may have a bug that causes this problem, if it is as old as the Redis server.
