Limiting the number of processes running at a time from a Python script

Question

I'm running a backup script that launches child processes to perform backups by rsync. However I have no way to limit the number of rsyncs it launches at a time.

Here's the code I'm working on at the moment:

print "active_children: ", multiprocessing.active_children()
print "active_children len: ", len(multiprocessing.active_children())
while len(multiprocessing.active_children()) > 49:
    sleep(2)
p = multiprocessing.Process(target=do_backup, args=(shash["NAME"],ip,shash["buTYPE"], ))
jobs.append(p)
p.start()

This is showing a maximum of one child when I have hundreds of rsyncs running. Here's the code that actually launches the rsync (from inside the do_backup function), with command being a variable containing the rsync line:

print command
subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
return 1

If I add a sleep(x) to the do_backup function it will show up as an active child while it's sleeping. Also the process table is showing the rsync processes as having a PPID of 1. I'm assuming from this that the rsync splits off and is no longer a child of python, which allows my child process to die so I can't count it anymore. Does anyone know how to keep the python child alive and counted until the rsync is complete?

Answer

Let's clear up some misconceptions first:

I'm assuming from this that the rsync splits off and is no longer a child of python which allows my child process to die so I can't count it anymore.

rsync does "split off". On UNIX systems, this is called a fork.

When a process forks, a child process is created - so rsync is a child of python. This child executes independently of the parent - and concurrently ("at the same time").

A process can manage its own children. There are specific syscalls for that, but they're a bit off-topic when talking about Python, which has its own high-level interfaces.

If you check subprocess.Popen's documentation, you'll notice that it's not a function call at all: it's a class. By calling it, you create an instance of that class - a Popen object. Such objects have multiple methods. In particular, wait will allow you to block your parent process (python) until the child process terminates.
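For illustration, a minimal sketch of that blocking behavior (using `sleep` as a stand-in for the real rsync command line):

```python
import subprocess

# "sleep 0.2" stands in for the rsync command here.
proc = subprocess.Popen(["sleep", "0.2"])
proc.wait()             # blocks the parent until the child terminates
print(proc.returncode)  # → 0
```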

With this in mind, let's take a look at your code and simplify it a bit:

p = multiprocessing.Process(target=do_backup, ...)

Here, you're actually forking and creating a child process. This process is another python interpreter (as with all multiprocessing processes), and will execute the do_backup function.
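To see why active_children only counts these Python children while their target function is still running, here is a small sketch (time.sleep stands in for the backup work):

```python
import multiprocessing
import time

def child_work():
    time.sleep(0.5)  # stand-in for the real do_backup work

p = multiprocessing.Process(target=child_work)
p.start()
# While child_work is still running, the child is counted:
running = len(multiprocessing.active_children())
p.join()
after = len(multiprocessing.active_children())
print(running, after)  # → 1 0
```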

def do_backup():
    subprocess.Popen("rsync ...", ...)

Here, you are forking again. You'll create yet another process (rsync), and let it run "in the background", because you're not waiting for it.
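Adding a wait() fixes exactly the symptom described in the question: the Python child stays alive (and counted by active_children) until rsync finishes. A minimal sketch, with "true" standing in for the real rsync command line:

```python
import subprocess

def do_backup(command):
    # wait() keeps this function - and hence the multiprocessing child
    # running it - alive until the launched process finishes.
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    proc.wait()
    return proc.returncode

# "true" stands in for the real rsync command line.
print(do_backup("true"))  # → 0
```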

With all this cleared up, I hope you can see a way forward with your existing code. If you want to reduce its complexity, I recommend you check and adapt JoErNanO's answer, which uses multiprocessing.Pool to automate keeping track of the processes.
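A minimal sketch of that Pool approach (the echo commands are hypothetical stand-ins for the real rsync lines built from shash and ip):

```python
import multiprocessing
import subprocess

def run_one(command):
    # Each worker launches one command and waits for it, so the
    # worker stays occupied until the command finishes.
    proc = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
    proc.wait()
    return proc.returncode

# Hypothetical stand-in commands; the real ones would be rsync lines.
commands = ["echo backup %d" % i for i in range(10)]
pool = multiprocessing.Pool(processes=4)  # never more than 4 at a time
results = pool.map(run_one, commands)
pool.close()
pool.join()
print(results)  # → [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```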

Whichever way you decide to pursue, you should avoid forking with Popen to create the rsync process - as that creates yet another process unnecessarily. Instead, check os.execv, which replaces the current process with another.
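A sketch of the exec approach, with /bin/echo and made-up arguments standing in for rsync:

```python
import multiprocessing
import os

def do_backup():
    # os.execv replaces THIS child process with the new program, so no
    # extra grandchild is created. /bin/echo stands in for rsync here,
    # and the arguments are hypothetical.
    os.execv("/bin/echo", ["echo", "-a", "/src/", "/dst/"])

p = multiprocessing.Process(target=do_backup)
p.start()
p.join()           # waits for the exec'ed program, not just the Python code
print(p.exitcode)  # → 0
```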
