UTF-8 In Python logging, how?

Published: 2021/12/19 10:29:38. Tags: python, logging, unicode


Question

I'm trying to log a UTF-8 encoded string to a file using Python's logging package. As a toy example:

import logging

def logging_test():
    handler = logging.FileHandler("/home/ted/logfile.txt", "w",
                                  encoding = "UTF-8")
    formatter = logging.Formatter("%(message)s")
    handler.setFormatter(formatter)
    root_logger = logging.getLogger()
    root_logger.addHandler(handler)
    root_logger.setLevel(logging.INFO)

    # This is an o with a hat on it.
    byte_string = '\xc3\xb4'
    unicode_string = unicode("\xc3\xb4", "utf-8")

    print "printed unicode object: %s" % unicode_string

    # Explode
    root_logger.info(unicode_string)

if __name__ == "__main__":
    logging_test()

This explodes with UnicodeDecodeError on the logging.info() call.

At a lower level, Python's logging package is using the codecs package to open the log file, passing in the "UTF-8" argument as the encoding. That's all well and good, but it's trying to write byte strings to the file instead of unicode objects, which explodes. Essentially, Python is doing this:

file_handler.write(unicode_string.encode("UTF-8"))

When it should be doing this:

file_handler.write(unicode_string)

Is this a bug in Python, or am I taking crazy pills? FWIW, this is a stock Python 2.6 installation.
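To make the failure mode described above concrete, here is a minimal sketch written for Python 3, not the Python 2.6 code from the question. It assumes that the error comes from Python 2 implicitly decoding an already encoded byte string as ASCII before encoding it again, which is how Python 2 generally treated byte strings handed to an encoding-aware stream. The helper name reencode_like_python2 is made up for this illustration.

```python
# Python 3 sketch of the step Python 2 performed implicitly.
text = "\u00f4"                 # o with a circumflex, as text
encoded = text.encode("utf-8")  # the same character as UTF-8 bytes


def reencode_like_python2(data):
    """Hypothetical helper: encode a byte string that is already encoded.

    Python 2 first converted the bytes back to text with the ASCII codec,
    then encoded the result. Here that decode step is spelled out.
    """
    return data.decode("ascii").encode("utf-8")


try:
    reencode_like_python2(encoded)
    outcome = "ok"
except UnicodeDecodeError:
    # The byte 0xc3 is outside the ASCII range, so the decode step fails.
    outcome = "UnicodeDecodeError"

print(outcome)
```

Pure ASCII bytes pass through this path unchanged, which is why this kind of problem tends to surface only once a non-ASCII character is logged.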

Solution

Check that you have the latest Python 2.6 - some Unicode bugs were found and fixed since 2.6 came out. For example, on my Ubuntu Jaunty system, I ran your script copied and pasted, removing only the '/home/ted/' prefix from the log file name. Result (copied and pasted from a terminal window):

vinay@eta-jaunty:~/projects/scratch$ python --version
Python 2.6.2
vinay@eta-jaunty:~/projects/scratch$ python utest.py 
printed unicode object: ô
vinay@eta-jaunty:~/projects/scratch$ cat logfile.txt 
ô
vinay@eta-jaunty:~/projects/scratch$

On a Windows box:

C:\temp>python --version
Python 2.6.2

C:\temp>python utest.py
printed unicode object: ô

And the contents of the file:

This might also explain why Lennart Regebro couldn't reproduce it either.
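For readers on a current interpreter, here is a rough Python 3 equivalent of the toy script. It is a sketch rather than the original poster's code: the temporary directory, the logger name "utf8_demo", and the byte-level read at the end are assumptions added so the result can be checked without relying on how a terminal renders the character.

```python
import logging
import os
import tempfile


def logging_test(log_path):
    """Log one non-ASCII character to log_path and return the raw file bytes."""
    handler = logging.FileHandler(log_path, "w", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(message)s"))

    # A named logger (hypothetical name) keeps the root logger untouched.
    logger = logging.getLogger("utf8_demo")
    logger.propagate = False
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    # In Python 3 every str is Unicode, so no unicode() call is needed.
    logger.info("\u00f4")

    # Detach and close the handler so the file is flushed before reading.
    logger.removeHandler(handler)
    handler.close()

    with open(log_path, "rb") as log_file:
        return log_file.read()


log_path = os.path.join(tempfile.mkdtemp(), "logfile.txt")
raw = logging_test(log_path)
print(repr(raw))
```

The returned bytes should begin with the two-byte UTF-8 sequence for the character, followed by the platform's line ending. Reading the file in binary mode sidesteps any doubt about the terminal's own encoding.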
