Ensuring Python logging in multiple threads is thread-safe


Question


I have a log.py module, that is used in at least two other modules (server.py and device.py).

It has these globals:

import logging

fileLogger = logging.getLogger()
fileLogger.setLevel(logging.DEBUG)
consoleLogger = logging.getLogger()
consoleLogger.setLevel(logging.DEBUG)

file_logging_level_switch = {
    'debug':    fileLogger.debug,
    'info':     fileLogger.info,
    'warning':  fileLogger.warning,
    'error':    fileLogger.error,
    'critical': fileLogger.critical
}

console_logging_level_switch = {
    'debug':    consoleLogger.debug,
    'info':     consoleLogger.info,
    'warning':  consoleLogger.warning,
    'error':    consoleLogger.error,
    'critical': consoleLogger.critical
}

It has two functions:

def LoggingInit( logPath, logFile, html=True ):
    global fileLogger
    global consoleLogger

    logFormatStr = "[%(asctime)s %(threadName)s, %(levelname)s] %(message)s"
    consoleFormatStr = "[%(threadName)s, %(levelname)s] %(message)s"

    if html:
        logFormatStr = "<p>" + logFormatStr + "</p>"

    # File Handler for log file
    logFormatter = logging.Formatter(logFormatStr)
    fileHandler = logging.FileHandler( 
        "{0}{1}.html".format( logPath, logFile ))
    fileHandler.setFormatter( logFormatter )
    fileLogger.addHandler( fileHandler )

    # Stream Handler for stdout, stderr
    consoleFormatter = logging.Formatter(consoleFormatStr)
    consoleHandler = logging.StreamHandler() 
    consoleHandler.setFormatter( consoleFormatter )
    consoleLogger.addHandler( consoleHandler )

And:

def WriteLog( string, print_screen=True, remove_newlines=True, 
        level='debug' ):

    if remove_newlines:
        string = string.replace('\r', '').replace('\n', ' ')

    if print_screen:
        console_logging_level_switch[level](string)

    file_logging_level_switch[level](string)

I call LoggingInit from server.py, which initializes the file and console loggers. I then call WriteLog from all over the place, so multiple threads are accessing fileLogger and consoleLogger.

Do I need any further protection for my log file? The documentation states that thread locks are handled by the handler.

Solution

The good news is that you don't need to do anything extra for thread safety, and you either need nothing extra or something almost trivial for clean shutdown. I'll get to the details later.

The bad news is that your code has a serious problem even before you get to that point: fileLogger and consoleLogger are the same object. From the documentation for getLogger():

Return a logger with the specified name or, if no name is specified, return a logger which is the root logger of the hierarchy.

So, you're getting the root logger and storing it as fileLogger, and then you're getting the root logger again and storing it as consoleLogger. In LoggingInit, you initialize fileLogger, then re-initialize the same object under a different name with different values.

You can add multiple handlers to the same logger, and since the only initialization you actually do for each is addHandler, your code will sort of work as intended, but only by accident, and only sort of. You will get two copies of each message in both logs if you pass print_screen=True, and you will get copies in the console even if you pass print_screen=False.

There's actually no reason for global variables at all; the whole point of getLogger() is that you can call it every time you need it and get the global root logger, so you don't need to store it anywhere.
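To see the aliasing for yourself, a quick check like the following should do. The variable names are only for illustration; the calls are plain logging.getLogger().

```python
import logging

file_logger = logging.getLogger()
console_logger = logging.getLogger()

# Both names are bound to the single root logger object.
same_object = file_logger is console_logger
print(same_object)

# Named loggers are cached as well: the same name yields the same object,
# which is why there is no need to keep loggers in module globals.
named_same = logging.getLogger("server") is logging.getLogger("server")
print(named_same)
```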


A more minor problem is that you're not escaping the text you insert into HTML. At some point you're going to try to log the string "a < b" and end up in trouble.
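One way to handle the escaping is to do it in the formatter. This is just a sketch: HtmlEscapingFormatter is a name I made up, and it escapes the whole formatted line (thread name included) rather than only the message.

```python
import html
import logging


class HtmlEscapingFormatter(logging.Formatter):
    """Hypothetical formatter: escape the formatted line, then wrap it in <p>."""

    def format(self, record):
        line = super().format(record)
        return "<p>" + html.escape(line) + "</p>"


formatter = HtmlEscapingFormatter("[%(threadName)s, %(levelname)s] %(message)s")

# Build a record by hand so the example does not touch any real handlers.
record = logging.LogRecord(
    name="root", level=logging.DEBUG, pathname="example.py", lineno=0,
    msg="a < b", args=None, exc_info=None,
)
formatted = formatter.format(record)
print(formatted)
```

With this in place you would pass the plain format string to the formatter instead of concatenating the <p> tags yourself.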

Less seriously, a sequence of <p> tags that isn't inside a <body> inside an <html> is not a valid HTML document. But plenty of viewers will take care of that automatically, or you can post-process your logs trivially before displaying them. But if you really want this to be correct, you need to subclass FileHandler and have your __init__ add a header if given an empty file and remove a footer if present, then have your close add a footer.


Getting back to your actual question:

You do not need any additional locking. If a handler correctly implements createLock, acquire, and release (and it's called on a platform with threads), the logging machinery will automatically make sure to acquire the lock when needed to make sure each message is logged atomically.
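If you want to convince yourself, here is a small experiment, assuming an in-memory stream in place of the real file and a made-up logger name. Several threads log through one handler with no locking of their own, and every line still comes out whole.

```python
import io
import logging
import threading

stream = io.StringIO()  # stands in for the log file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("[%(threadName)s] %(message)s"))

logger = logging.getLogger("thread_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output away from the root logger

THREADS = 8
MESSAGES_PER_THREAD = 200


def worker():
    # No explicit lock here; the handler takes its own lock around each emit.
    for i in range(MESSAGES_PER_THREAD):
        logger.debug("message %d", i)


threads = [threading.Thread(target=worker) for _ in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = stream.getvalue().splitlines()
print(len(lines))
```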

As far as I know, the documentation doesn't directly say that StreamHandler and FileHandler implement these methods, but it does strongly imply it (the text you mentioned in the question says "The logging module is intended to be thread-safe without any special work needing to be done by its clients", etc.). And you can look at the source for your implementation (e.g., CPython 3.3) and see that they both inherit correctly-implemented methods from logging.Handler.
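Rather than reading the source, you can also ask the interpreter. This sketch assumes a CPython where neither class overrides the locking methods, which is the case in the versions I've looked at.

```python
import logging

inherited = {}
for cls in (logging.StreamHandler, logging.FileHandler):
    for name in ("createLock", "acquire", "release"):
        # True when the subclass uses logging.Handler's method unchanged.
        inherited[(cls.__name__, name)] = (
            getattr(cls, name) is getattr(logging.Handler, name)
        )

for key, value in inherited.items():
    print(key, value)
```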


Likewise, if a handler correctly implements flush and close, the logging machinery will make sure it's finalized correctly during normal shutdown.

Here, the documentation does explain what StreamHandler.flush(), FileHandler.flush(), and FileHandler.close() do. They're mostly what you'd expect, except that StreamHandler.close() is a no-op, meaning it's possible that final log messages to the console may get lost. From the docs:

Note that the close() method is inherited from Handler and so does no output, so an explicit flush() call may be needed at times.

If this matters to you, and you want to fix it, you need to do something like this:

class ClosingStreamHandler(logging.StreamHandler):
    def close(self):
        self.flush()
        super().close()

And then use ClosingStreamHandler() instead of StreamHandler().
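To check that the subclass does what it claims, you can hand it a stream that counts flushes. CountingStream is a hypothetical helper for this demonstration only.

```python
import io
import logging


class ClosingStreamHandler(logging.StreamHandler):
    def close(self):
        self.flush()
        super().close()


class CountingStream(io.StringIO):
    """Hypothetical stream that counts flush() calls, for illustration only."""

    def __init__(self):
        super().__init__()
        self.flush_count = 0

    def flush(self):
        self.flush_count += 1
        super().flush()


stream = CountingStream()
handler = ClosingStreamHandler(stream)

flushes_before = stream.flush_count
handler.close()  # should flush the stream exactly once
flushes_after = stream.flush_count
print(flushes_before, flushes_after)
```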

FileHandler has no such problem.


The normal way to send logs to two places is to just use the root logger with two handlers, each with their own formatter.
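A minimal sketch of that setup, reusing the two format strings from the question. The temporary file path is an assumption for the example; you would use your own logPath and logFile.

```python
import logging
import os
import tempfile

# Hypothetical location, chosen only so the example is self-contained.
log_path = os.path.join(tempfile.mkdtemp(), "example.log")

file_handler = logging.FileHandler(log_path)
file_handler.setFormatter(logging.Formatter(
    "[%(asctime)s %(threadName)s, %(levelname)s] %(message)s"))

console_handler = logging.StreamHandler()  # writes to stderr by default
console_handler.setFormatter(logging.Formatter(
    "[%(threadName)s, %(levelname)s] %(message)s"))

root = logging.getLogger()
root.setLevel(logging.DEBUG)
root.addHandler(file_handler)
root.addHandler(console_handler)

# One call reaches both destinations, each with its own format.
logging.getLogger().debug("device connected")

file_handler.flush()
with open(log_path) as f:
    file_contents = f.read()
```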

Also, even if you do want two loggers, you don't need the separate console_logging_level_switch and file_logging_level_switch maps; calling Logger.debug(msg) is exactly the same thing as calling Logger.log(DEBUG, msg). You'll still need some way to map your custom level names debug, etc. to the standard names DEBUG, etc., but you can just do one lookup, instead of doing it once per logger (plus, if your names are just the standard names with different case, you can cheat).

This is all described pretty well in the "Multiple handlers and formatters" section, and the rest of the logging cookbook.
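Something like the following is what I mean by a single lookup. LEVELS, level_from_name and write_log are names I invented for the sketch; Logger.log and the level constants are the standard ones.

```python
import logging

# One table shared by every destination, instead of one table per logger.
LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
    "critical": logging.CRITICAL,
}


def level_from_name(name):
    # The "cheat": the custom names differ from the standard constants only
    # by case, so upper-casing the name finds the constant on the module.
    return getattr(logging, name.upper())


def write_log(message, level="debug"):
    logging.getLogger().log(LEVELS[level], message)
```

Either the table or the getattr lookup is enough on its own; both are shown only so you can compare them.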

The only problem with the standard way of doing this is that you can't easily turn off console logging on a message-by-message basis. That's because it's not a normal thing to do. Usually, you just log by levels, and set the log level higher on the file log.

But if you want more control, you can use filters. For example, give your FileHandler a filter that accepts everything and your console StreamHandler a filter that requires a logger name starting with console, then log through the logger named 'console' if print_screen else ''. That reduces WriteLog to almost a one-liner.

You still need the extra two lines to remove newlines, but you can even do that in the filter, or via an adapter, if you want. (Again, see the cookbook.) And then WriteLog really is a one-liner.
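Here is a rough sketch of the filter approach, under a few assumptions: in-memory streams stand in for the file and the console, StripNewlines and write_log are names I made up, and the newline filter is attached to the handlers because a filter on the root logger would not see records that propagate up from the 'console' logger.

```python
import io
import logging


class StripNewlines(logging.Filter):
    """Hypothetical filter that rewrites the message instead of rejecting it."""

    def filter(self, record):
        message = record.getMessage()
        record.msg = message.replace("\r", "").replace("\n", " ")
        record.args = None  # the message is already fully formatted
        return True


file_stream = io.StringIO()     # stands in for the log file
console_stream = io.StringIO()  # stands in for the console

file_handler = logging.StreamHandler(file_stream)
file_handler.addFilter(StripNewlines())

console_handler = logging.StreamHandler(console_stream)
# Only records from the logger named "console" (or its children) pass.
console_handler.addFilter(logging.Filter("console"))
console_handler.addFilter(StripNewlines())

root = logging.getLogger()
root.setLevel(logging.DEBUG)
root.addHandler(file_handler)
root.addHandler(console_handler)


def write_log(message, print_screen=True, level=logging.DEBUG):
    # An empty name returns the root logger, whose records the console
    # handler's name filter rejects.
    logging.getLogger("console" if print_screen else "").log(level, message)


write_log("shown\r\non screen")
write_log("file only", print_screen=False)
```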
