Difference between open and codecs.open in Python
There are two ways to open a text file in Python:
f = open(filename)
And
import codecs
f = codecs.open(filename, encoding="utf-8")
When is codecs.open preferable to open?
Since Python 2.6, a good practice is to use io.open(), which also takes an encoding argument, like the now obsolete codecs.open(). In Python 3, io.open is an alias for the open() built-in, so io.open() works in Python 2.6 and all later versions, including Python 3.4. See the docs: http://docs.python.org/3.4/library/io.html
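A quick way to check the alias claim on a Python 3 interpreter is an identity comparison. This is only a small sketch; the variable name same_function is made up for illustration.

```python
import io

# On Python 3, the built-in open and io.open refer to the same function object.
# (On Python 2 they are different objects, so this would be False there.)
same_function = io.open is open
print(same_function)
```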
Now, for the original question: when reading text (including "plain text", HTML, XML and JSON) in Python 2 you should always use io.open() with an explicit encoding, or open() with an explicit encoding in Python 3. Doing so means you either get correctly decoded Unicode or get an error right off the bat, which makes debugging much easier.
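As a rough illustration of "correct Unicode or an immediate error", the sketch below writes a short non-ASCII string to a temporary file, reads it back as UTF-8, and then tries again with a deliberately wrong encoding. The file name sample.txt and the sample text are assumptions chosen for the example; io.open() is used so the same code should run on Python 2.6+ and Python 3.

```python
import io
import os
import tempfile

# Sample text with a curly apostrophe, a diaeresis, an accent and a euro sign.
text = u"Don\u2019t be na\u00efve: a caf\u00e9 costs 3 \u20ac"

# Hypothetical file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write with an explicit encoding.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read with the matching explicit encoding: we get the original Unicode back.
with io.open(path, "r", encoding="utf-8") as f:
    decoded = f.read()

# Read with the wrong encoding: the failure shows up immediately at read time.
try:
    with io.open(path, "r", encoding="ascii") as f:
        f.read()
    failed_fast = False
except UnicodeDecodeError:
    failed_fast = True

print(decoded == text, failed_fast)
```

The point of the second read is that a mismatch surfaces as a UnicodeDecodeError at the file boundary, rather than as mangled characters somewhere later in the program.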
Pure ASCII "plain text" is a myth from the distant past. Proper English text uses curly quotes, em-dashes, bullets, € (euro signs) and even diaeresis (¨). Don't be naïve! (And let's not forget the Façade design pattern!)
Because pure ASCII is not a real option, open() without an explicit encoding is only appropriate for reading binary files, opened in a binary mode such as "rb".
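For the binary case, a minimal sketch might look like the following. The file name blob.bin and the byte string are made up for illustration; the mode "rb" is what tells open() to skip decoding and return raw bytes.

```python
import os
import tempfile

# Hypothetical binary payload (the first eight bytes of a PNG file).
payload = b"\x89PNG\r\n\x1a\n"

path = os.path.join(tempfile.mkdtemp(), "blob.bin")

# Binary write: no encoding argument is involved.
with open(path, "wb") as f:
    f.write(payload)

# Binary read: the result is a bytes object, untouched by any decoder.
with open(path, "rb") as f:
    header = f.read()

print(header == payload)
```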