Python 3.0 open() default encoding

Question

I am trying to count the lines in a JSON file. Click HERE to access my JSON file.

I tried to count the lines with the code below.

input = open("json/world_bank.json")
i=0
for l in input:
    i+=1
print(i)

But the above code throws a UnicodeDecodeError, as shown below.

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-17-edc88ade7225> in <module>()
      2 
      3 i=0
----> 4 for l in input:
      5     i+=1
      6 

C:\Users\Subbi Reddy\AppData\Local\Continuum\Anaconda3\lib\encodings\cp1252.py in decode(self, input, final)
     21 class IncrementalDecoder(codecs.IncrementalDecoder):
     22     def decode(self, input, final=False):
---> 23         return codecs.charmap_decode(input,self.errors,decoding_table)[0]
     24 
     25 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3979: character maps to <undefined>

Then I added the encoding parameter to the open() call, as shown below.

input = open("json/world_bank.json",encoding="utf8")

Then it started working and printed 500.

As far as I know, Python's open() should use "utf8" as the default encoding.

Where am I going wrong here?

Recommended answer

Python 3's default UTF-8 encoding only extends to bytes -> str conversions (bytes.decode() and str.encode()). open() instead uses your environment to choose an appropriate encoding:

From the Python 3 docs for open():

encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is platform dependent (whatever locale.getpreferredencoding() returns), but any text encoding supported by Python can be used. See the codecs module for the list of supported encodings.
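To see which encoding your own environment picks, a small probe along the following lines may help. It is only a sketch: the variable names and the throwaway file "probe.txt" are illustrative and not part of the original question, and the values printed depend on your platform, locale, and Python version.

```python
import codecs
import locale
import os
import tempfile

# bytes <-> str conversions default to UTF-8 on every platform.
text = "naïve café"
round_trip = text.encode().decode()  # same as encode("utf-8") / decode("utf-8")

# What the locale reports as the preferred text encoding.
preferred = locale.getpreferredencoding(False)
print("locale preferred encoding:", preferred)

# Open a throwaway file in text mode WITHOUT encoding= and inspect
# which encoding open() actually chose.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "probe.txt")  # hypothetical file name
    with open(path, "w") as f:
        default_encoding = f.encoding

# Normalize the codec name so aliases such as "UTF-8" and "utf8" compare equal.
normalized = codecs.lookup(default_encoding).name
print("open() default encoding:", default_encoding, "->", normalized)
```

On a Windows machine like the one in the traceback, both values would be expected to report cp1252, unless Python is running in UTF-8 mode (for example via `python -X utf8` or `PYTHONUTF8=1`).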

In your case, as you're on Windows with a Western Europe/North America locale, you will be given the 8-bit Windows-1252 character set. Setting encoding to utf-8 overrides this.
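As a minimal sketch of the fix, the line count can be wrapped in a helper that always passes the encoding explicitly. The `count_lines` function and the two-line sample file are hypothetical stand-ins for the asker's code and world_bank.json. The sample contains "Á", whose UTF-8 form includes the byte 0x81 that cp1252 cannot decode.

```python
import os
import tempfile


def count_lines(path, encoding="utf-8"):
    """Count lines in a text file, decoding with an explicit encoding."""
    # The with-statement closes the file even if decoding fails part-way.
    with open(path, encoding=encoding) as f:
        return sum(1 for _ in f)


# Hypothetical two-line sample; "Á" is encoded in UTF-8 as bytes 0xC3 0x81.
sample = '{"name": "Á"}\n{"name": "B"}\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # Decoding as UTF-8 works regardless of the platform default.
    utf8_count = count_lines(path)

    # Decoding the same bytes as cp1252 hits the undefined byte 0x81.
    try:
        count_lines(path, encoding="cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

print(utf8_count, cp1252_failed)
```

Passing encoding= on every text-mode open() keeps the behaviour the same across machines, instead of depending on whichever locale the script happens to run under.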
