UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data
Question
How does Unicode work in Python 2? I just don't get it.
Here I download data from a server and parse it as JSON.
Traceback (most recent call last):
File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.12-py2.6.egg/eventlet/hubs/poll.py", line 92, in wait
readers.get(fileno, noop).cb(fileno)
File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.12-py2.6.egg/eventlet/greenthread.py", line 202, in main
result = function(*args, **kwargs)
File "android_suggest.py", line 60, in fetch
suggestions = suggest(chars)
File "android_suggest.py", line 28, in suggest
return [i['s'] for i in json.loads(opener.open('https://market.android.com/suggest/SuggRequest?json=1&query='+s+'&hl=de&gl=DE').read())]
File "/usr/lib/python2.6/json/__init__.py", line 307, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.6/json/decoder.py", line 319, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.6/json/decoder.py", line 336, in raw_decode
obj, end = self._scanner.iterscan(s, **kw).next()
File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
rval, next_pos = action(m, context)
File "/usr/lib/python2.6/json/decoder.py", line 217, in JSONArray
value, end = iterscan(s, idx=end, context=context).next()
File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
rval, next_pos = action(m, context)
File "/usr/lib/python2.6/json/decoder.py", line 183, in JSONObject
value, end = iterscan(s, idx=end, context=context).next()
File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
rval, next_pos = action(m, context)
File "/usr/lib/python2.6/json/decoder.py", line 155, in JSONString
return scanstring(match.string, match.end(), encoding, strict)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data
Thanks!!
The following string causes the error: '[{"t":"q","s":"abh\xf6ren"}]'. The \xf6 should be decoded to ö (abhören).
Recommended answer
The string you're trying to parse as JSON is not encoded in UTF-8. Most likely it is encoded in ISO-8859-1, where the single byte \xf6 is ö; in UTF-8 that byte cannot stand on its own, which is why the decoder rejects it. Try the following:
json.loads(unicode(opener.open(...).read(), "ISO-8859-1"))
That will handle any umlauts that might appear in the JSON message.
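To see the difference without hitting the server, here is a minimal sketch that uses the sample string from the question in place of the real response. The variable names are illustrative, and ISO-8859-1 is an assumption carried over from the answer above; the server may actually declare a different charset in its Content-Type header. The sketch uses `bytes.decode`, which should behave the same as `unicode(raw, "iso-8859-1")` on Python 2 and also runs on Python 3.

```python
# -*- coding: utf-8 -*-
import json

# Sample payload from the question, standing in for opener.open(...).read().
# The byte 0xF6 is "ö" in ISO-8859-1 but is not valid UTF-8 here.
raw = b'[{"t":"q","s":"abh\xf6ren"}]'

# Decoding as UTF-8 fails, which is what json.loads hit internally.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decode the bytes to text first (assumed charset: ISO-8859-1), then parse.
text = raw.decode("iso-8859-1")
suggestions = [item["s"] for item in json.loads(text)]

# For comparison: the same character as bytes in each encoding.
as_latin1 = u"\xf6".encode("iso-8859-1")  # one byte
as_utf8 = u"\xf6".encode("utf-8")         # two bytes
```

The point of the design is to turn bytes into text at the boundary, as soon as the response is read, so that everything after that (including `json.loads`) works on Unicode strings and never has to guess an encoding.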
You should read Joel Spolsky's "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)". I hope it will clarify some of the issues you're having with Unicode.