Difference between parsing a text file in r and rb mode
Question
What makes parsing a text file in 'r' mode more convenient than parsing it in 'rb' mode, especially when the text file in question may contain non-ASCII characters?
Recommended answer
This depends somewhat on which version of Python you're using. In Python 2, Chris Drappier's answer applies.
In Python 3, it's a different (and more consistent) story. In text mode ('r'), Python decodes the file according to the text encoding you give it (or, if you don't give one, a platform-dependent default), and read() gives you a str. In binary mode ('rb'), Python does not assume that the file contains anything that can reasonably be parsed as characters, and read() gives you a bytes object.
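To make the str versus bytes distinction concrete, here is a minimal sketch for Python 3. The file name, the sample text and the choice of UTF-8 are assumptions made for illustration; they are not part of the original answer.

```python
import os
import tempfile

# Hypothetical sample file: a short UTF-8 text with one non-ASCII character.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("café")

# Text mode: the bytes on disk are decoded with the given encoding,
# so read() returns a str made of characters.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Binary mode: nothing is decoded, so read() returns the raw bytes.
with open(path, "rb") as f:
    raw = f.read()

print(type(text).__name__, len(text))  # str 4   (four characters)
print(type(raw).__name__, len(raw))    # bytes 5 (the "é" takes two bytes in UTF-8)
print(raw.decode("utf-8") == text)     # True: decoding by hand recovers the same str
```

Passing encoding explicitly avoids relying on the platform-dependent default mentioned above. With 'rb' you would have to call decode() yourself before doing any character-level parsing.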
Also, in Python 3, universal newlines (translating between '\n' and platform-specific newline conventions, so you don't have to care about them) are available for text-mode files on any platform, not just Windows.
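The newline translation can be observed with a small sketch like the one below, assuming Python 3 and the default newline handling of open(). The file name and contents are made up for the example.

```python
import os
import tempfile

# Hypothetical sample file written as raw bytes with mixed line endings:
# Windows (\r\n), old Mac (\r) and Unix (\n).
path = os.path.join(tempfile.mkdtemp(), "newlines.txt")
with open(path, "wb") as f:
    f.write(b"one\r\ntwo\rthree\n")

# Text mode with the default newline setting: every line ending
# is translated to '\n' when reading.
with open(path, "r", encoding="ascii") as f:
    translated = f.read()

# Binary mode: the bytes come back exactly as stored.
with open(path, "rb") as f:
    untouched = f.read()

print(repr(translated))  # 'one\ntwo\nthree\n'
print(repr(untouched))   # b'one\r\ntwo\rthree\n'
```

If you need the original line endings in text mode, open() accepts newline='' to turn the translation off on reading.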