Truncated output log files when logging stdout and stderr separately


Problem description

I have set up a subprocess command within a context manager that pipes stdout and stderr to separate files via my own logger. This is a variation on the answer given here: https://stackoverflow.com/a/4838875/4844311

My code looks like this:

import logging
import subprocess

# StreamLogger is defined below; my_out_logger and my_err_logger are
# ordinary logging.Logger objects, and cmd is the argument list for Popen.
with StreamLogger(logging.DEBUG, my_out_logger) as out:
    with StreamLogger(logging.ERROR, my_err_logger) as err:
        p = subprocess.Popen(cmd, shell=False, stdout=out, stderr=err)
        p.communicate()
        p.wait()

where my_out_logger and my_err_logger are logging objects with handlers that log to files, etc.

The StreamLogger code is similar to the code given at the link above:

import io
import os
import threading
import select
import time

class StreamLogger(io.IOBase):
    def __init__(self, level, logger):
        self.logger = logger
        self.level = level
        self.pipe = os.pipe()
        self._run = True  # set before the thread starts, so close() cannot
                          # see an undefined attribute if it runs early
        self.thread = threading.Thread(target=self._flusher)
        self.thread.start()

    def _flusher(self):
        buf = b''
        while self._run:
            # Poll the read end of the pipe and log each complete line.
            for fh in select.select([self.pipe[0]], [], [], 0)[0]:
                buf += os.read(fh, 1024)
                while b'\n' in buf:
                    data, buf = buf.split(b'\n', 1)
                    self.write(data.decode())
            time.sleep(0.01)
        self._run = None  # signal close() that the thread has finished

    def write(self, data):
        return self.logger.log(self.level, data)

    def fileno(self):
        # Popen uses this descriptor as the child's stdout/stderr.
        return self.pipe[1]

    def close(self):
        if self._run:
            self._run = False
            while self._run is not None:
                time.sleep(0.01)
            os.close(self.pipe[0])
            os.close(self.pipe[1])

The only significant difference between my code and the code in the answer linked above is that mine sends the logging message to a logger that redirects according to its handlers, rather than logging directly as the linked code does.

This code works fine most of the time.

But I have noticed that every so often the output log file is truncated. It looks like the output file being written by a FileHandler in my_out_logger is being closed before all the stdout content has been written.

I am not sure why this is happening or where to fix the code. For now I have just added a time.sleep(0.3) between p.communicate() and p.wait(), which reduces the frequency of truncated files, but it seems like an ugly solution.

I would rather understand what is going wrong and fix it properly. I welcome any suggestions or insight.

Recommended answer

I think I've finally figured it out. The problem is that the StreamLogger code has no way to explicitly check that stdout has been fully written. Once the main thread running the subprocess receives a returncode, it exits the context manager, calling the __exit__() method of the StreamLogger, which is inherited from IOBase. __exit__() in turn calls close(), which sets the self._run attribute to False. That stops the loop in the thread that is polling the pipe, regardless of what is still coming through the pipe.
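That inheritance can be checked with a minimal subclass (a standalone sketch; the Probe class is purely illustrative):

```python
import io

class Probe(io.IOBase):
    """Record whether close() runs when a with-block exits."""
    def __init__(self):
        self.closed_by_exit = False

    def close(self):
        self.closed_by_exit = True
        super().close()

p = Probe()
with p:
    pass
# Leaving the with-block triggered __exit__(), inherited from io.IOBase,
# which called close() -- p.closed_by_exit is now True.
```

The same thing happens to StreamLogger the instant the context manager exits, whether or not the flusher thread has finished draining the pipe.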

This works fine for most commands with small to moderate output to stdout, where there is no lag between the returncode arriving and the output being written. But in my case I'm running a program through subprocess that writes a substantial amount of text to stdout, so there is a race to drain the pipe before the main thread tells the child thread to stop polling it.

The two critical variables here are the size of the buffer being read from the pipe and the frequency at which the pipe is polled. I fixed my problem by increasing the buffer size in os.read() from 1024 to 4096 and removing the time.sleep() in the while loop of the _flusher() method. This maximizes the amount of data that can be read from the pipe and, in my case, ensures the output is fully logged before the logging loop stops.
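Put together, those two changes might look like this (a sketch only; FixedStreamLogger and its write callback are illustrative names, and apart from the two changes described the polling design is kept as-is):

```python
import os
import select
import threading


class FixedStreamLogger:
    """Sketch of the flusher with the two changes described above:
    os.read() uses a 4096-byte buffer, and there is no time.sleep()
    in the polling loop."""

    def __init__(self, write):
        self.write = write  # callable that receives one decoded line
        self.pipe = os.pipe()
        self._run = True
        self.thread = threading.Thread(target=self._flusher)
        self.thread.start()

    def _flusher(self):
        buf = b''
        while self._run:
            # No sleep after the select: the loop spins and drains the
            # pipe as fast as the writer fills it (at the cost of
            # busy-waiting on this thread).
            for fh in select.select([self.pipe[0]], [], [], 0)[0]:
                buf += os.read(fh, 4096)  # was 1024
                while b'\n' in buf:
                    data, buf = buf.split(b'\n', 1)
                    self.write(data.decode())

    def fileno(self):
        return self.pipe[1]

    def close(self):
        self._run = False
        self.thread.join()
        os.close(self.pipe[0])
        os.close(self.pipe[1])
```

Note that this shrinks the window for truncation but does not eliminate it: close() can still win the race while data sits in the pipe. A fully deterministic variant would close the write end first and have the flusher read until os.read() returns b'' (EOF) before exiting.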

