UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data

Question

How does Unicode work in Python 2? I just don't get it.

Here I download data from a server and parse it as JSON, and I get this traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.12-py2.6.egg/eventlet/hubs/poll.py", line 92, in wait
    readers.get(fileno, noop).cb(fileno)
  File "/usr/local/lib/python2.6/dist-packages/eventlet-0.9.12-py2.6.egg/eventlet/greenthread.py", line 202, in main
    result = function(*args, **kwargs)
  File "android_suggest.py", line 60, in fetch
    suggestions = suggest(chars)
  File "android_suggest.py", line 28, in suggest
    return [i['s'] for i in json.loads(opener.open('https://market.android.com/suggest/SuggRequest?json=1&query='+s+'&hl=de&gl=DE').read())]
  File "/usr/lib/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.6/json/decoder.py", line 336, in raw_decode
    obj, end = self._scanner.iterscan(s, **kw).next()
  File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/usr/lib/python2.6/json/decoder.py", line 217, in JSONArray
    value, end = iterscan(s, idx=end, context=context).next()
  File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/usr/lib/python2.6/json/decoder.py", line 183, in JSONObject
    value, end = iterscan(s, idx=end, context=context).next()
  File "/usr/lib/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/usr/lib/python2.6/json/decoder.py", line 155, in JSONString
    return scanstring(match.string, match.end(), encoding, strict)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data

Thanks!!

The following string causes the error: '[{"t":"q","s":"abh\xf6ren"}]'. The byte \xf6 should be decoded to ö (abhören).
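
The failure can be reproduced in isolation, which makes the "position 3" in the error message easier to read. The snippet below is a minimal sketch written for Python 3 (an assumption; the question runs Python 2.6, where the message wording and the reported byte range differ slightly). The variable names are illustrative only.

```python
# The string value from the failing JSON payload, as raw bytes.
# Byte 0xf6 is "ö" in ISO-8859-1 but is not valid UTF-8 here.
value_bytes = b"abh\xf6ren"

failed_at = None
try:
    value_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    # exc.start is the offset of the first byte the codec rejected.
    failed_at = exc.start
    print("UTF-8 decoding failed at byte offset", failed_at)

# Decoding with the encoding the server most likely used succeeds.
latin1_text = value_bytes.decode("iso-8859-1")
print(latin1_text)
```

The offset points at the \xf6 byte, which matches the start of the range reported in the traceback.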

Recommended answer

The string you're trying to parse as JSON is not encoded in UTF-8. Most likely it is encoded in ISO-8859-1, where the byte \xf6 is ö. Try decoding it explicitly before parsing:

json.loads(unicode(opener.open(...).read(), "ISO-8859-1"))

That will handle any umlauts that appear in the JSON message.
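
The same idea, decoding the bytes first and parsing the text second, can be sketched as a small helper. This is an illustration under assumptions: `parse_latin1_json` is a hypothetical name, the sample bytes are copied from the question rather than fetched from the server, and ISO-8859-1 is the answer's best guess at the server's encoding, not something the response confirms.

```python
import json


def parse_latin1_json(raw_bytes, encoding="iso-8859-1"):
    """Decode raw response bytes to text, then parse the text as JSON.

    The default encoding is an assumption taken from the answer above.
    """
    text = raw_bytes.decode(encoding)
    return json.loads(text)


# Sample payload copied from the question. In real code this would be
# the bytes returned by opener.open(...).read().
sample = b'[{"t":"q","s":"abh\xf6ren"}]'

suggestions = [item["s"] for item in parse_latin1_json(sample)]
print(suggestions)
```

Keeping the encoding as a parameter means the helper does not need to change if the server turns out to use a different charset.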

You should read Joel Spolsky's "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)". I hope it clarifies some of the issues you're having with Unicode.
