Python3 UnicodeDecodeError with readlines() method
Problem description
I'm trying to create a Twitter bot that reads lines from a file and posts them. I'm using Python 3 and tweepy in a virtualenv on my shared server space. This is the part of the code that seems to have trouble:
#!/foo/env/bin/python3
import re
import tweepy, time, sys
argfile = str(sys.argv[1])
filename=open(argfile, 'r')
f=filename.readlines()
filename.close()
This is the error I get:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfe in position 0: ordinal not in range(128)
The error specifically points to f=filename.readlines() as the source of the error. Any idea what might be wrong? Thanks.
I think the best answer (in Python 3) is to use the errors= parameter:
with open('evil_unicode.txt', 'r', errors='replace') as f:
    lines = f.readlines()
Proof:
>>> s = b'\xe5abc\nline2\nline3'
>>> with open('evil_unicode.txt', 'wb') as f:
...     f.write(s)
...
16
>>> with open('evil_unicode.txt', 'r') as f:
...     lines = f.readlines()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 0: invalid continuation byte
>>> with open('evil_unicode.txt', 'r', errors='replace') as f:
...     lines = f.readlines()
...
>>> lines
['�abc\n', 'line2\n', 'line3']
>>>
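The traceback above names the 'utf-8' codec, while the one in the question names 'ascii'. As far as I know, when open() is called in text mode without encoding=, Python falls back to the locale's preferred encoding, so the codec in the message depends on the environment the script runs in. Here is a minimal sketch that reproduces both messages by passing encoding= explicitly. The helper name try_read and the temporary file are made up for illustration, and latin-1 is only an example of a codec that accepts every byte, not a claim about what the asker's file really contains.

```python
import locale
import os
import tempfile

# Same bytes as in the proof above: 0xe5 is outside ASCII and,
# followed by "a", is not a valid UTF-8 sequence either.
data = b'\xe5abc\nline2\nline3'


def try_read(path, encoding):
    """Hypothetical helper: return the lines, or the decode error message."""
    try:
        with open(path, 'r', encoding=encoding) as f:
            return f.readlines()
    except UnicodeDecodeError as exc:
        return str(exc)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'evil_unicode.txt')
    with open(path, 'wb') as f:
        f.write(data)

    # The encoding used when encoding= is omitted (environment dependent).
    print('default:', locale.getpreferredencoding(False))

    ascii_result = try_read(path, 'ascii')      # error message mentioning 'ascii'
    utf8_result = try_read(path, 'utf-8')       # error message mentioning 'utf-8'
    latin1_result = try_read(path, 'latin-1')   # decodes, 0xe5 becomes U+00E5

print(ascii_result)
print(utf8_result)
print(latin1_result)
```

If the file's real encoding is known, passing it with encoding= avoids the error without losing any characters, which errors= cannot do.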
Note that errors= can be replace or ignore. Here's what ignore looks like:
>>> with open('evil_unicode.txt', 'r', errors='ignore') as f:
...     lines = f.readlines()
...
>>> lines
['abc\n', 'line2\n', 'line3']
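Both handlers throw information away: replace swaps the bad byte for U+FFFD, and ignore drops it silently. A small sketch comparing them side by side on the same bytes, using bytes.decode instead of a file so it is self-contained. I've also included backslashreplace, which I believe works for decoding on Python 3.5 and later and keeps the offending byte value visible. The results dictionary is just an illustrative name.

```python
# Same bytes as in the proof above.
data = b'\xe5abc\nline2\nline3'

results = {}
for handler in ('replace', 'ignore', 'backslashreplace'):
    text = data.decode('utf-8', errors=handler)
    # For this data, splitlines(keepends=True) gives the same list readlines() would.
    results[handler] = text.splitlines(keepends=True)
    print(handler, results[handler])
```

Which handler is appropriate depends on whether the lost byte matters for what you do with the lines afterwards.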