Python JSON and Unicode


Problem description

I found the answer here: Python UnicodeDecodeError - Am I misunderstanding encode?

I needed to explicitly decode my incoming file into Unicode when I read it, because it had characters that were neither acceptable ASCII nor Unicode, so the encode was failing when it hit these characters.

So, I know there's something I'm just not getting here.

I have an array of unicode strings, some of which contain non-ASCII characters.

I am trying:

json.dumps(myList)

and it throws an error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 in position 13: ordinal not in range(128)

How am I supposed to do this? I've tried setting the ensure_ascii parameter to both True and False, but neither fixes this problem.
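The message itself points at the real problem: a byte string containing the byte 0xb4 is being run through the ASCII codec. The failure can be reproduced directly (Python 3 syntax here; in 2.7 the same decode happens implicitly inside json):

```python
raw = b"\xb4"  # the offending byte from the traceback

try:
    raw.decode("ascii")  # what Python 2 attempts implicitly during the join
except UnicodeDecodeError as e:
    print(e)  # 'ascii' codec can't decode byte 0xb4 in position 0: ordinal not in range(128)
```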

I know I'm passing unicode strings to json.dumps. I understand that a JSON string is meant to be unicode. Why isn't it just sorting this out for me?

What am I doing wrong?

Update: Don Question sensibly suggests I provide a stack-trace. Here it is:

Traceback (most recent call last):
  File "importFiles.py", line 69, in <module>
    x = u"%s" % conv
  File "importFiles.py", line 62, in __str__
    return self.page.__str__()
  File "importFiles.py", line 37, in __str__
    return json.dumps(self.page(),ensure_ascii=False)
  File "/usr/lib/python2.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 204, in encode
    return ''.join(chunks)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 in position 17: ordinal not in range(128)

Note it's Python 2.7, and the error is still occurring with ensure_ascii=False.
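As the linked answer suggests, the way out is to decode the byte strings into real unicode objects before handing them to json.dumps, naming the file's actual encoding explicitly. A minimal sketch (Python 3 syntax; assuming, purely for illustration, that the input is Latin-1, in which 0xb4 is the acute accent ´):

```python
import json

raw = b"caf\xb4"              # byte string containing the offending 0xb4
text = raw.decode("latin-1")  # explicit decode with the file's real codec
print(json.dumps([text], ensure_ascii=False))  # → ["caf´"]
```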

Update 2: Andrew Walker's useful link (in the comments) leads me to think I can coerce my data into a convenient byte format before trying to json.encode it, by doing something like:

data.encode("ascii","ignore")

Unfortunately that is throwing the same error.
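That is expected in Python 2: calling .encode() on a byte string first decodes it with the default ASCII codec (strictly, regardless of the "ignore" argument), which is exactly the step that blows up. Going in the other direction, with an explicit decode, works. A sketch that runs unchanged in Python 3:

```python
raw = b"caf\xb4"

# In Python 2, raw.encode("ascii", "ignore") implicitly does
# raw.decode("ascii").encode("ascii", "ignore") - the strict decode raises first.
# Decoding explicitly, and ignoring undecodable bytes, succeeds:
cleaned = raw.decode("ascii", "ignore")
print(cleaned)  # → caf
```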

Recommended answer

Try adding the argument ensure_ascii=False. Also, especially when asking unicode-related questions, it's very helpful to add a longer (complete) traceback and to state which Python version you are using.

Citing the Python documentation for version 2.6.7:

"If ensure_ascii is False (default: True), then some chunks written to fp may be unicode instances, subject to normal Python str to unicode coercion rules. Unless fp.write() explicitly understands unicode (as in codecs.getwriter()) this is likely to cause an error."

So this proposal may cause new problems, but it fixed a similar problem I had. I fed the resulting unicode string into a StringIO object and wrote that to a file.

Because this is Python 2.7 and sys.getdefaultencoding() is set to ascii, the implicit conversion in the ''.join(chunks) statement inside the json standard library will blow up if chunks contains anything that is not ASCII-encodable! You must ensure that any contained strings are converted to an ASCII-compatible representation beforehand. You may try UTF-8-encoded byte strings, but unicode strings won't work, if I'm not mistaken.
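Once every element really is a proper text string, both modes of json.dumps work; ensure_ascii then only controls whether non-ASCII characters are escaped in the output. A quick illustration:

```python
import json

s = "caf\u00b4"                           # proper (unicode) text
print(json.dumps(s))                      # → "caf\u00b4"  (escaped, ASCII-safe)
print(json.dumps(s, ensure_ascii=False))  # → "caf´"       (literal character)
```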

