Python3 UnicodeDecodeError with readlines() method


Problem description

I'm trying to create a Twitter bot that reads lines from a file and posts them. I'm using Python 3 and tweepy in a virtualenv on my shared server space. This is the part of the code that seems to cause trouble:

#!/foo/env/bin/python3

import re
import tweepy, time, sys

argfile = str(sys.argv[1])

filename=open(argfile, 'r')
f=filename.readlines()
filename.close()

This is the error I get:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xfe in position 0: ordinal not in range(128)

The error specifically points to f=filename.readlines() as the source of the error. Any idea what might be wrong? Thanks.

Solution

I think the best answer (in Python 3) is to use the errors= parameter:

with open('evil_unicode.txt', 'r', errors='replace') as f:
    lines = f.readlines()

Proof:

>>> s = b'\xe5abc\nline2\nline3'
>>> with open('evil_unicode.txt','wb') as f:
...     f.write(s)
...
16
>>> with open('evil_unicode.txt', 'r') as f:
...     lines = f.readlines()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 0: invalid continuation byte
>>> with open('evil_unicode.txt', 'r', errors='replace') as f:
...     lines = f.readlines()
...
>>> lines
['�abc\n', 'line2\n', 'line3']
>>>
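
The odd character in the first line of that output is U+FFFD, the Unicode replacement character. As a small sketch (not part of the original answer), the same bytes can be decoded directly with bytes.decode to find out which lines were affected, which might be useful before posting them anywhere. The names raw, lines, and damaged are illustrative only:

```python
# Same bytes as in the session above; 0xe5 does not start a valid UTF-8 sequence here.
raw = b'\xe5abc\nline2\nline3'

# errors='replace' substitutes U+FFFD for every byte that cannot be decoded.
text = raw.decode('utf-8', errors='replace')
lines = text.splitlines(keepends=True)

# Collect the indexes of lines that contain at least one replacement character.
damaged = [i for i, line in enumerate(lines) if '\ufffd' in line]

print(lines)
print(damaged)
```

A caller could then skip or log the damaged lines instead of publishing text with a stray replacement character in it.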

Note that errors= can be 'replace' or 'ignore' (other handlers exist as well). Here's what 'ignore' looks like:

>>> with open('evil_unicode.txt', 'r', errors='ignore') as f:
...     lines = f.readlines()
...
>>> lines
['abc\n', 'line2\n', 'line3']
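
The traceback in the question names the 'ascii' codec, which suggests that open() fell back to an ASCII locale default on that server. Passing encoding= explicitly alongside errors= may therefore also be worth trying. The following is a minimal sketch under the assumption that the file is meant to be UTF-8, which the question does not state; read_lines is a hypothetical helper name, not something from tweepy or the original answer:

```python
import os
import tempfile


def read_lines(path, encoding='utf-8', errors='replace'):
    """Hypothetical helper: read all lines using an explicit encoding.

    The utf-8 default is an assumption; use whatever encoding the file
    was actually written in.
    """
    with open(path, 'r', encoding=encoding, errors=errors) as f:
        return f.readlines()


# Recreate the test file from the answer in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'evil_unicode.txt')
    with open(path, 'wb') as f:
        f.write(b'\xe5abc\nline2\nline3')

    replaced = read_lines(path)                  # bad byte becomes U+FFFD
    ignored = read_lines(path, errors='ignore')  # bad byte is dropped

print(replaced)
print(ignored)
```

Both handlers hide the undecodable bytes rather than interpret them, so they work best as a fallback. If the file is really in another encoding, naming that encoding is likely the cleaner fix. A leading 0xfe byte, as in the question, is consistent with a UTF-16 byte order mark, for example, in which case encoding='utf-16' might be the right choice, though that is only a guess from the error message.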
