Unicode error handling with Python 3's readlines()

Views: 968 | Published: 2020/10/29 6:11:18 | Tags: python, python-3.x, text, encoding


Question

I keep getting this error while reading a text file. Is it possible to handle or ignore it and proceed?

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 7827: character maps to <undefined>

Recommended answer

In Python 3, pass an appropriate errors= value (such as errors='ignore' or errors='replace') when creating your file object (presuming it is an instance of io.TextIOWrapper; if it isn't, consider wrapping it in one). Also consider passing a more likely encoding than charmap; when you aren't sure, utf-8 is a good place to start.

For example:

f = open('misc-notes.txt', encoding='utf-8', errors='ignore')
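To make the difference between the two error handlers concrete, here is a minimal sketch. It assumes a throwaway temporary file and made-up sample bytes (neither comes from the original question) and reads the same data with readlines() under errors='replace' and errors='ignore'.

```python
import os
import tempfile

# Hypothetical sample data: plain ASCII plus one stray 0x81 byte,
# which is not valid UTF-8 on its own.
raw = b"first line\nbad \x81 byte\nlast line\n"

# Write the bytes to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # errors='replace' substitutes U+FFFD for each undecodable byte.
    with open(path, encoding="utf-8", errors="replace") as f:
        replaced = f.readlines()

    # errors='ignore' silently drops each undecodable byte.
    with open(path, encoding="utf-8", errors="ignore") as f:
        ignored = f.readlines()
finally:
    os.remove(path)

print(replaced[1])  # the bad byte shows up as a replacement character
print(ignored[1])   # the bad byte is gone
```

Using 'replace' leaves a visible marker where data was lost, which tends to make later debugging easier than 'ignore'.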

In Python 2, the read() operation simply returns bytes; the trick, then, is decoding them to get them into a string (if you do, in fact, want characters as opposed to bytes). If you don't have a better guess for their real encoding:

your_string.decode('utf-8', 'replace')

...to replace unhandled characters, or

your_string.decode('utf-8', 'ignore')

...to simply ignore them.
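The same decode() call works on Python 3's bytes type, which plays the role that str played in Python 2. The following sketch uses a made-up byte string (an assumption for illustration only) to show what each handler returns.

```python
# Hypothetical input: valid UTF-8 for "café" followed by a stray 0x81 byte.
data = b"caf\xc3\xa9 \x81 ok"

# 'replace' swaps each undecodable byte for U+FFFD (the replacement character).
replaced = data.decode("utf-8", "replace")

# 'ignore' drops each undecodable byte entirely.
ignored = data.decode("utf-8", "ignore")

# The default handler, 'strict', raises instead of guessing.
try:
    data.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(repr(replaced))
print(repr(ignored))
print(strict_failed)
```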

That said, finding and using their real encoding (rather than guessing utf-8) would be preferred.
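One rough way to look for a workable encoding is to try a short list of candidates strictly and fall back to a lossy decode only if all of them fail. The helper name decode_with_candidates and the candidate list below are hypothetical choices for illustration, not part of the original answer. A strict decode that succeeds does not prove the encoding is the right one; it only shows the bytes are valid in it.

```python
def decode_with_candidates(data, candidates=("utf-8", "cp1252")):
    """Return (text, encoding) for the first candidate that decodes strictly.

    Hypothetical helper: falls back to UTF-8 with errors='replace' and
    reports the encoding as None when no candidate succeeds.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes; try the next
    return data.decode("utf-8", errors="replace"), None


# Bytes produced with cp1252: not valid UTF-8, so the second candidate wins.
text, used = decode_with_candidates("naïve".encode("cp1252"))
print(text, used)

# 0x81 is undefined in both UTF-8 (as a lone byte) and cp1252,
# so the lossy fallback is used.
fallback_text, fallback_used = decode_with_candidates(b"\x81")
print(repr(fallback_text), fallback_used)
```

Single-byte encodings such as latin-1 accept every byte value, so if one is included it belongs at the end of the candidate list; otherwise it would mask every later candidate.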
