Switching to Python 3 causing UnicodeDecodeError

Problem description

I've just added a Python 3 interpreter to Sublime, and the following code stopped working:

for directory in directoryList:
    fileList = os.listdir(directory)
    for filename in fileList:
        filename = os.path.join(directory, filename)
        currentFile = open(filename, 'rt')
        for line in currentFile:               ##Here comes the exception.
            currentLine = line.split(' ')
            for word in currentLine:
                if word.lower() not in bigBagOfWords:
                    bigBagOfWords.append(word.lower())
        currentFile.close()

I get the following exception:

  File "/Users/Kuba/Desktop/DictionaryCreator.py", line 11, in <module>
    for line in currentFile:
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 305: ordinal not in range(128)

I found this rather strange, because as far as I know Python 3 is supposed to support UTF-8 everywhere. What's more, the exact same code works with no problems on Python 2.7. I've read about adding the environment variable PYTHONIOENCODING, and I tried it, but to no avail. (However, it does not appear to be that easy to add an environment variable in OS X Mavericks, so maybe I did something wrong when adding the variable? I modified /etc/launchd.conf.)

Solution

Python 3 decodes text files when reading. The default encoding is taken from locale.getpreferredencoding(False), which evidently returns 'ASCII' for your setup. See the open() function documentation:

In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding.
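To see which encoding applies on a given machine, you can ask the locale module directly. This is a minimal sketch; the helper name default_text_encoding is made up for illustration, and it assumes a Python version where open() behaves as the quoted documentation describes:

```python
import locale


def default_text_encoding():
    """Return the locale encoding that open() falls back to in text mode.

    Hypothetical helper name; it only wraps the call named in the
    documentation quoted above.
    """
    return locale.getpreferredencoding(False)


# On the setup from the question this would presumably report an ASCII
# variant; elsewhere the value depends on the platform and locale settings.
print(default_text_encoding())
```

If the printed value is not the encoding your files were written in, reading them in text mode without an encoding argument can fail as shown in the traceback.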

Instead of relying on a system setting, you should open your text files using an explicit codec:

currentFile = open(filename, 'rt', encoding='latin1')

where you set the encoding parameter to match the file you are reading.
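As a rough illustration, the loop from the question could be rewritten along these lines. The function name collect_words, the sample file, and the use of a set instead of a list are assumptions made for this sketch, not part of the original code; 'latin1' is only a placeholder for whatever encoding your files really use. Note also that line.split() splits on any whitespace, unlike the split(' ') in the question.

```python
import os
import tempfile


def collect_words(directories, encoding):
    """Collect the lowercased words from every file in the given directories."""
    words = set()  # a set replaces the list plus "not in" check from the question
    for directory in directories:
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            # An explicit encoding makes decoding independent of the locale.
            with open(path, 'rt', encoding=encoding) as current_file:
                for line in current_file:
                    words.update(word.lower() for word in line.split())
    return words


# Hypothetical sample data: one file containing the byte 0xcc from the traceback.
sample_dir = tempfile.mkdtemp()
with open(os.path.join(sample_dir, 'sample.txt'), 'wb') as sample:
    sample.write(b'Hello w\xccrld\n')

# Decoding as ASCII fails on 0xcc, just like the locale default in the question.
try:
    collect_words([sample_dir], encoding='ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Latin-1 maps every byte to a character, so the same file now reads cleanly.
words = collect_words([sample_dir], encoding='latin1')
```

Latin-1 never raises a decode error because every byte value is valid in it, but it produces wrong characters if the file was actually written in another encoding such as UTF-8, so pick the codec that matches the data.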

Python 3 uses UTF-8 as the default encoding for source code; that default is separate from the encoding open() uses for file contents.

You may want to read up on Python 3 and Unicode in the Unicode HOWTO, which explains both about source code encoding and reading and writing Unicode data.
