Python 3.0 open() default encoding


Problem description

I am trying to count the lines in a JSON file (the original post linked to the file).

I tried the code below to count the lines.

input = open("json/world_bank.json")
i=0
for l in input:
    i+=1
print(i)

But the above code throws a UnicodeDecodeError, as shown below.

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-17-edc88ade7225> in <module>()
      2 
      3 i=0
----> 4 for l in input:
      5     i+=1
      6 

C:\Users\Subbi Reddy\AppData\Local\Continuum\Anaconda3\lib\encodings\cp1252.py in decode(self, input, final)
     21 class IncrementalDecoder(codecs.IncrementalDecoder):
     22     def decode(self, input, final=False):
---> 23         return codecs.charmap_decode(input,self.errors,decoding_table)[0]
     24 
     25 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3979: character maps to <undefined>
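
The error can be reproduced without the file. The following is a minimal sketch using only the standard library: byte 0x81 is unassigned in Windows-1252, so the cp1252 codec rejects it, while the same byte is valid as part of a multi-byte UTF-8 sequence. The specific character used here (U+00C1) is an illustrative assumption, not necessarily the one at position 3979 in the file.

```python
# Byte 0x81 has no mapping in Windows-1252, so the cp1252 codec raises.
try:
    b"\x81".decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError as exc:
    cp1252_failed = True
    print(exc)

# The same byte is legal inside a multi-byte UTF-8 sequence:
# U+00C1 is encoded in UTF-8 as the two bytes 0xC3 0x81.
utf8_text = b"\xc3\x81".decode("utf-8")
print(ascii(utf8_text))
```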

Then I passed the encoding parameter to open(), as shown below.

input = open("json/world_bank.json",encoding="utf8")

Then it started working and printed 500.
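
A somewhat more idiomatic version of the working code might look like the sketch below. It uses a with block so the file is closed, avoids shadowing the built-in input, and passes the encoding explicitly. The helper name count_lines and the throwaway demo file are hypothetical, added only so the example runs on its own.

```python
import os
import tempfile


def count_lines(path, encoding="utf-8"):
    """Count the lines of a text file, decoding with an explicit encoding."""
    with open(path, encoding=encoding) as f:
        return sum(1 for _ in f)


# Demo on a temporary two-line file that contains a non-ASCII character.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write('{"name": "\u00c1frica"}\n{"name": "Asia"}\n')
    demo_count = count_lines(demo_path)

print(demo_count)  # 2
```

For the file in the question, the equivalent call would be count_lines("json/world_bank.json").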

As far as I know, Python's open() should use "utf8" as the default encoding.

Where am I going wrong?

Recommended answer

Python 3's default UTF-8 encoding applies only to bytes <-> str conversions (str.encode() and bytes.decode()). open() instead uses your environment to choose an encoding:

From the Python 3 docs for open():

encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is platform dependent (whatever locale.getpreferredencoding() returns), but any text encoding supported by Python can be used. See the codecs module for the list of supported encodings.
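
To see what this means on a given machine, the sketch below prints the locale's preferred encoding and inspects the encoding attribute of a file object opened with and without an explicit encoding. The temporary file is only scaffolding for the example, and the value reported for the default case depends on the platform and Python configuration.

```python
import locale
import os
import tempfile

# Platform dependent: often 'cp1252' on a Western-locale Windows install
# and 'UTF-8' on most Linux and macOS systems.
preferred = locale.getpreferredencoding(False)
print(preferred)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")

    # No encoding argument: open() picks one from the environment.
    with open(path, "w") as f:
        default_used = f.encoding

    # Explicit encoding: the environment is ignored.
    with open(path, "w", encoding="utf-8") as f:
        explicit_used = f.encoding

print(default_used, explicit_used)
```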

In your case, since you're on Windows with a Western European/North American locale, you get the 8-bit Windows-1252 character set (cp1252). Setting encoding to utf-8 overrides this.
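
The contrast with byte/str conversions can be checked directly. As a small sketch, str.encode() and bytes.decode() called without arguments use UTF-8 regardless of the locale, which is the default the question was most likely thinking of:

```python
import sys

# With no argument, both conversions use UTF-8 on every platform.
encoded = "caf\u00e9".encode()
decoded = b"caf\xc3\xa9".decode()

print(encoded)                    # b'caf\xc3\xa9'
print(sys.getdefaultencoding())   # utf-8
```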
