Python: UnicodeDecodeError: 'utf8' codec can't decode byte 0x91


Problem description


I'm parsing a CSV as follows:

with open(args.csv, 'rU') as csvfile:
    try:
        # QUOTE_NONE is a quoting constant, so pass it as quoting=, not dialect=
        reader = csv.DictReader(csvfile, quoting=csv.QUOTE_NONE)
        for row in reader:
            ...

where args.csv is the name of my file. One of the rows in my file contains an e with two dots on top (ë), and my script breaks when it encounters it.

I get the following stack trace:

  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 244, in dumps
    return _default_encoder.encode(obj)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)

and the following error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0x91 in position 5: invalid start byte

FWIW, I'm running Python 2.7 and upgrading isn't an option (for a few reasons).

I'm pretty lost about how to fix this so any help is much appreciated.

Thanks!

Solution

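Byte 0x91 is a "smart" opening single quote in Windows-1252, so it sounds like your file uses that encoding rather than UTF-8. (If the character really is an ë, note that 0x91 is ë in Mac OS Roman, so check which of the two your file actually is.) On Python 2.7 the built-in open() does not accept an encoding argument, so open(args.csv, 'rU', encoding='windows-1252') only works on Python 3. The 2.7 equivalent is io.open(args.csv, 'rU', encoding='windows-1252'), but the Python 2 csv module does not support Unicode input, so it is usually simpler to keep reading bytes and either decode each field after parsing or tell json which encoding the byte strings are in: json.dumps(row, encoding='windows-1252').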
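
One way to apply this on Python 2.7, where the csv module yields byte strings, is to decode each field after parsing and only then hand the row to json. The following is a minimal sketch, not a definitive fix: decode_row is a hypothetical helper (it is not part of csv or json), the sample row is made up, and both encodings shown are assumptions about what the file might contain.

```python
import json


def decode_row(row, encoding='windows-1252'):
    """Return a copy of a DictReader-style row with byte strings decoded to text.

    `encoding` is an assumption about the file; pass 'mac_roman' (or whatever
    the file really uses) if Windows-1252 gives the wrong character.
    """
    decoded = {}
    for key, value in row.items():
        # On Python 2, `bytes` is `str`, which is what csv hands back.
        if isinstance(key, bytes):
            key = key.decode(encoding)
        if isinstance(value, bytes):
            value = value.decode(encoding)
        decoded[key] = value
    return decoded


# A made-up row shaped like what Python 2's csv.DictReader would yield.
raw_row = {b'name': b'Zo\x91'}

# The same byte decodes to different characters depending on the encoding.
as_cp1252 = decode_row(raw_row, 'windows-1252')  # 0x91 -> left single quote
as_mac_roman = decode_row(raw_row, 'mac_roman')  # 0x91 -> e with diaeresis

# Once the values are text, json no longer has to guess at UTF-8.
payload = json.dumps(as_mac_roman)
```

In the original loop this would amount to calling json.dumps(decode_row(row)) instead of json.dumps(row). Comparing the two decoded results against what the file is supposed to contain is a quick way to tell which encoding is the right one.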
