How can I inherit parent logger when using Python's multiprocessing? Especially for paramiko

Problem description

I'm using Python's multiprocessing. I have set up the logger in the parent process, but I can't simply inherit the parent's logging settings in the child process.

I'm not worried about the logs getting mixed up, because I use multiprocessing not to run jobs concurrently but to control running time, so only one subprocess runs at a time.

My code without multiprocessing:

from multiprocessing import Process
import paramiko
import logging
import sys


def sftp_read():
    # log.debug("Child process started")  # This line will cause an exception if it is run in the sub process.
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    timeout = 60
    ssh.connect('my_server', username='my_user', password='my_password', timeout=timeout, auth_timeout=timeout,
                banner_timeout=timeout)
    sftp = ssh.open_sftp()
    fp = sftp.file('/home/my_user/my_file.txt')
    lines = fp.readlines()
    print(''.join(lines))
    fp.close()
    ssh.close()


def main():
    sftp_read()  # Call this function without multiprocessing

if __name__ == '__main__':
    logging.basicConfig(stream=sys.stdout,
                        format='[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s')
    log = logging.getLogger()
    log.setLevel(logging.DEBUG)
    main()

The above code works properly; paramiko prints its log normally, like below:

[2018-11-20 10:38:45,051] {transport.py:1746} DEBUG - starting thread (client mode): 0x3052208L
[2018-11-20 10:38:45,051] {transport.py:1746} DEBUG - Local version/idstring: SSH-2.0-paramiko_2.4.2
[2018-11-20 10:38:45,405] {transport.py:1746} DEBUG - Remote version/idstring: SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.6
[2018-11-20 10:38:45,405] {transport.py:1746} INFO - Connected (version 2.0, client OpenSSH_7.2p2)

But when I change the main function into the following code to control the running time (limiting the SFTP read to at most 15 seconds):

def main():
    # Use multiprocessing to limit the running time to at most 15 seconds.
    p = Process(target=sftp_read)
    try:
        log.debug("About to start SSH")
        p.start()
        log.debug('Process started')
        p.join(15)
    finally:
        if p.is_alive():
            p.terminate()
            log.debug('Terminated')
        else:
            log.debug("Finished normally")

Paramiko no longer prints its log. Now I want to set the child's logging config to be the same as the parent's. How can I do that?

I don't want an answer telling me to get a logger again, because my production server has a global logging setting that may change from time to time, so I can't configure my own logging setting outside the control of the global one.

So I wonder if there is a way to configure my subprocess's logging settings to be the same as the parent's.

Answer

On POSIX systems, Python's multiprocessing creates subprocesses with the fork system call. A child process created with fork is essentially a copy of everything in the parent process's memory, so in your case the child process will have access to the logger from the parent.

Warning: fork copies everything, but it does not copy threads. Any threads running in the parent process do not exist in the child process.
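
A tiny demonstration of that warning (an illustrative snippet of mine, not from the answer): a helper thread started in the parent does not show up inside the child.

# Illustrative sketch with assumed names: threads in the parent are not
# copied into a child created by a fork-based start method.
import threading
from multiprocessing import Process

def show_threads(label):
    print(label, [t.name for t in threading.enumerate()])

if __name__ == '__main__':
    threading.Thread(target=lambda: threading.Event().wait(),
                     name='helper', daemon=True).start()
    show_threads('parent:')   # parent: ['MainThread', 'helper']
    p = Process(target=show_threads, args=('child:',))
    p.start()                 # with fork: child: ['MainThread'] only
    p.join()

The answer's own example below shows the logging configuration itself being inherited: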

import logging
from multiprocessing import Pool
from os import getpid

def runs_in_subprocess():
    logging.info(
        "I am the child, with PID {}".format(getpid()))

if __name__ == '__main__':
    logging.basicConfig(
        format='GADZOOKS %(message)s', level=logging.DEBUG)

    logging.info(
        "I am the parent, with PID {}".format(getpid()))

    with Pool() as pool:
        pool.apply(runs_in_subprocess)

Output:

GADZOOKS I am the parent, with PID 3884
GADZOOKS I am the child, with PID 3885

Notice how child processes in your pool inherit the parent process's logging configuration.

You might run into deadlocks, so beware of the following:

  1. Whenever the thread in the parent process writes a log message, it adds it to a Queue. That involves acquiring a lock.
  2. If the fork() happens at the wrong time, the lock is copied in an acquired state.
  3. The child process copies the parent's logging configuration, including the queue.
  4. Whenever the child process writes a log message, it tries to write it to the queue.
  5. That means acquiring the lock, but the lock is already acquired.
  6. The child process now waits for the lock to be released.
  7. The lock will never be released, because the thread that would release it wasn't copied over by the fork().
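
Here is a minimal sketch of that failure mode (illustrative names and timings of my own, not code from the original post): if the fork happens while a parent thread holds a lock, the child's copy of the lock is stuck in the acquired state and the child hangs.

import threading
import time
from multiprocessing import Process

lock = threading.Lock()   # stands in for the handler lock inside logging

def hold_lock():
    # Simulates a logging thread holding the lock around slow I/O.
    with lock:
        time.sleep(5)

def child():
    # On a fork-based start method, this blocks forever when the parent's
    # helper thread held the lock at the moment of the fork.
    with lock:
        print('child acquired the lock')

if __name__ == '__main__':
    threading.Thread(target=hold_lock, daemon=True).start()
    time.sleep(0.1)                 # make it likely the lock is held at fork time
    p = Process(target=child)
    p.start()
    p.join(2)
    print('child still alive (deadlocked):', p.is_alive())
    p.terminate()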

In Python 3, you can avoid this by using get_context:

from multiprocessing import get_context

def your_func():
    with get_context("spawn").Pool() as pool:
        # ... everything else is unchanged

Suggestions:

  1. Use get_context to create a new Pool and let a process from that pool do the job for you (a sketch follows below).

  2. Every process from the pool will have access to the parent process's log config.
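
As a rough sketch (my own adaptation, not code from the question or the answer), the question's main() could combine a spawn context with the 15-second limit by using apply_async with a timeout instead of Process.join(15):

from multiprocessing import TimeoutError, get_context

def main():
    # Assumes sftp_read and log are defined as in the question.
    ctx = get_context('spawn')
    with ctx.Pool(processes=1) as pool:
        result = pool.apply_async(sftp_read)
        try:
            result.get(timeout=15)      # wait at most 15 seconds
            log.debug('Finished normally')
        except TimeoutError:
            pool.terminate()            # stop the worker that is still running
            log.debug('Terminated')

Note that with a spawn context the worker re-imports the module, so sftp_read must be defined at module level, as it is in the question.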
