Python Celery - lookup task by pid


Question

A pretty straightforward question, maybe - I often see a celery task process running on my system that I cannot find when I use celery.task.control.inspect()'s active() method. Often this process will be running for hours, and I worry that it's a zombie of some sort. Usually it's using up a lot of memory, too.

Is there a way to look up a task by Linux PID? Does Celery or the AMQP result backend save that?

If not, any other way to figure out which particular task is the one that's sitting around eating up memory?
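For reference, a minimal sketch of the kind of lookup being asked about, assuming a Celery version whose active() task entries include a worker_pid field (not every setup reports it):

```python
from celery.task.control import inspect

def find_task_by_pid(pid):
    """Scan active tasks on all responding workers for one running in the given child PID."""
    active = inspect().active() or {}
    for worker_name, tasks in active.items():
        for task in tasks:
            # 'worker_pid' is reported by newer Celery versions; older ones may omit it.
            if task.get('worker_pid') == pid:
                return worker_name, task
    return None

# e.g. find_task_by_pid(12345)
```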

---- Update:

What can I do when active() tells me that there are no tasks running on a particular box, but the box's memory is in full use, htop shows that these worker pool threads are the ones using it, and yet they sit at 0% CPU? If it turns out this is related to some quirk of my current Rackspace setup and nobody can answer, I'll still accept Loren's answer.
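Not part of the original question, but one rough way to see which processes are actually holding the memory is to dump per-process resident size; this sketch assumes the third-party psutil package is installed:

```python
import psutil  # third-party: pip install psutil

# Print PID, resident memory, and command line for anything that looks like Celery.
for proc in psutil.process_iter():
    try:
        cmdline = ' '.join(proc.cmdline())
        if 'celery' in cmdline.lower():
            rss_mb = proc.memory_info().rss / (1024.0 * 1024.0)
            print('%6d  %8.1f MB  %s' % (proc.pid, rss_mb, cmdline[:80]))
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        continue
```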

Thanks~

Answer

I'm going to make the assumption that by 'task' you mean 'worker'. The question would make little sense otherwise.

For some context it's important to understand the process hierarchy of Celery worker pools. A worker pool is a group of worker processes (or threads) that share the same configuration (process messages of the same set of queues, etc.). Each pool has a single parent process that manages the pool. This process controls how many child workers are forked and is responsible for forking replacement children when children die. The parent process is the only process bound to AMQP and the children ingest and process tasks from the parent via IPC. The parent process itself does not actually process (run) any tasks.

Additionally, and towards an answer to your question, the parent process is the process responsible for responding to your Celery inspect broadcasts, and the PIDs listed as workers in the pool are only the child workers. The parent PID is not included.
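As an illustration of that point (not from the original answer), a hedged sketch that compares a suspect PID against what the workers report via stats(), assuming a Celery version whose stats reply includes the parent 'pid' and the prefork pool's 'processes' list:

```python
from celery.task.control import inspect

def classify_pid(pid):
    """Say whether a PID is a pool parent or a pool child, according to stats()."""
    stats = inspect().stats() or {}
    for worker_name, info in stats.items():
        # 'pid' is the parent; 'pool'['processes'] are the forked children.
        if info.get('pid') == pid:
            return '%s: pool parent' % worker_name
        if pid in info.get('pool', {}).get('processes', []):
            return '%s: pool child' % worker_name
    return 'not reported by any responding worker'

print(classify_pid(12345))  # substitute the PID you see in htop
```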

If you're starting the Celery daemon using the --pidfile command-line parameter, that file will contain the PID of the parent process, and you should be able to cross-reference that PID with the process you're referring to in order to determine whether it is in fact a pool parent process. If you're using Celery multi to start multiple instances (multiple worker pools), then by default the PID files should be located in the directory from which you invoked Celery multi. If you're not using either of these means to start Celery, try using one of them to verify that the process isn't a zombie and is in fact simply a parent.
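A small sketch of that cross-reference, assuming a hypothetical pidfile at /var/run/celery/worker.pid (substitute your --pidfile value) and a Linux /proc filesystem for listing the parent's children:

```python
import os

PIDFILE = '/var/run/celery/worker.pid'  # hypothetical path; use your --pidfile value

def parent_pid():
    """Read the pool parent's PID from the file written via --pidfile."""
    with open(PIDFILE) as f:
        return int(f.read().strip())

def children_of(ppid):
    """List PIDs whose parent is ppid by scanning /proc (Linux only)."""
    kids = []
    for entry in os.listdir('/proc'):
        if not entry.isdigit():
            continue
        try:
            with open('/proc/%s/stat' % entry) as f:
                data = f.read()
        except IOError:
            continue
        # The PPID is the second field after the parenthesised command name.
        fields = data[data.rindex(')') + 2:].split()
        if int(fields[1]) == ppid:
            kids.append(int(entry))
    return kids

parent = parent_pid()
print('pool parent:', parent)
print('pool children:', children_of(parent))
```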
