How to get string objects instead of Unicode from JSON?

Views: 180 | Published: 2019/11/23 16:22:14 | Tags: python json serialization unicode python-2.x

Problem description

I'm using Python 2 to parse JSON from ASCII encoded text files.

When loading these files with either json or simplejson, all my string values are cast to Unicode objects instead of string objects. The problem is, I have to use the data with some libraries that only accept string objects. I can't change the libraries nor update them.

Is it possible to get string objects instead of Unicode ones?

>>> import json
>>> original_list = ['a', 'b']
>>> json_list = json.dumps(original_list)
>>> json_list
'["a", "b"]'
>>> new_list = json.loads(json_list)
>>> new_list
[u'a', u'b']  # I want these to be of type `str`, not `unicode`

Update

This question was asked a long time ago, when I was stuck with Python 2. One easy and clean solution today is to use a recent version of Python, i.e. Python 3 or later.
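As a quick check of the update above, here is the round trip from the question again, assuming a Python 3 interpreter. It is only a sketch; the variable names simply mirror the question's session.

```python
import json

# Same round trip as in the question, run under Python 3.
original_list = ['a', 'b']
json_list = json.dumps(original_list)
new_list = json.loads(json_list)

# On Python 3 every decoded string is already a plain `str`,
# so no post-processing step is needed.
all_plain_str = all(type(item) is str for item in new_list)
print(new_list, all_plain_str)
```

On Python 3 there is no separate `unicode` type, which is why the problem in the question does not arise there.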

A solution with `object_hook`

import json

def json_load_byteified(file_handle):
    return _byteify(
        json.load(file_handle, object_hook=_byteify),
        ignore_dicts=True
    )

def json_loads_byteified(json_text):
    return _byteify(
        json.loads(json_text, object_hook=_byteify),
        ignore_dicts=True
    )

def _byteify(data, ignore_dicts=False):
    # if this is a unicode string, return its string representation
    if isinstance(data, unicode):
        return data.encode('utf-8')
    # if this is a list of values, return list of byteified values
    if isinstance(data, list):
        return [_byteify(item, ignore_dicts=True) for item in data]
    # if this is a dictionary, return dictionary of byteified keys and values
    # but only if we haven't already byteified it
    if isinstance(data, dict) and not ignore_dicts:
        return {
            _byteify(key, ignore_dicts=True): _byteify(value, ignore_dicts=True)
            for key, value in data.iteritems()
        }
    # if it's anything else, return it in its original form
    return data

Example usage:

>>> json_loads_byteified('{"Hello": "World"}')
{'Hello': 'World'}
>>> json_loads_byteified('"I am a top-level string"')
'I am a top-level string'
>>> json_loads_byteified('7')
7
>>> json_loads_byteified('["I am inside a list"]')
['I am inside a list']
>>> json_loads_byteified('[[[[[[[["I am inside a big nest of lists"]]]]]]]]')
[[[[[[[['I am inside a big nest of lists']]]]]]]]
>>> json_loads_byteified('{"foo": "bar", "things": [7, {"qux": "baz", "moo": {"cow": ["milk"]}}]}')
{'things': [7, {'qux': 'baz', 'moo': {'cow': ['milk']}}], 'foo': 'bar'}
>>> json_load_byteified(open('somefile.json'))
{'more json': 'from a file'}

How does this work and why would I use it?

Mark Amery's function is shorter and clearer than these ones, so what's the point of them? Why would you want to use them?

Purely for performance. Mark's answer decodes the JSON text fully first with unicode strings, then recurses through the entire decoded value to convert all strings to byte strings. This has a couple of undesirable effects:

A copy of the entire decoded structure gets created in memory
If your JSON object is really deeply nested (500 levels or more) then you'll hit Python's maximum recursion depth
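To make the first of these effects concrete, here is a minimal sketch of a "decode first, convert afterwards" pass. It is written in the spirit of the approach described above, not copied from Mark's answer; `byteify_after_decoding` and `text_type` are hypothetical names, and the `text_type` shim is an assumption added so the sketch runs on both Python 2 and Python 3 (where the result contains `bytes`).

```python
import json
import sys

# Assumption: pick the text type for the running interpreter
# (`unicode` on Python 2, `str` on Python 3).
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def byteify_after_decoding(data):
    """Walk an already-decoded value and rebuild it with byte strings."""
    if isinstance(data, text_type):
        return data.encode('utf-8')
    if isinstance(data, list):
        return [byteify_after_decoding(item) for item in data]
    if isinstance(data, dict):
        return {
            byteify_after_decoding(key): byteify_after_decoding(value)
            for key, value in data.items()
        }
    return data


decoded = json.loads('{"things": [{"cow": ["milk"]}]}')
converted = byteify_after_decoding(decoded)

# Both structures are alive at this point: the converted value is a
# second, separately built copy of the decoded one.
is_separate_copy = converted is not decoded
```

Each level of nesting in the input also adds at least one Python stack frame to this walk, which is where the recursion-depth concern comes from.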

This answer mitigates both of those performance issues by using the object_hook parameter of json.load and json.loads. From the docs:

object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict. This feature can be used to implement custom decoders
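The documented behaviour can be observed with a small sketch. `record_dict` and `seen_keys` are hypothetical names used only for illustration; the hook returns each dict unchanged and just records the order in which the decoder hands dicts over.

```python
import json

seen_keys = []


def record_dict(decoded_dict):
    # Called once for every JSON object literal the decoder finishes.
    seen_keys.append(sorted(decoded_dict.keys()))
    # Whatever is returned here replaces the dict in the final result.
    return decoded_dict


result = json.loads('{"outer": {"inner": {"leaf": 1}}}',
                    object_hook=record_dict)

# The innermost object is completed (and passed to the hook) first.
print(seen_keys)
```

Because inner objects are finished before the objects that contain them, the hook sees the innermost dict first and the top-level dict last.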

Since dictionaries nested many levels deep in other dictionaries get passed to object_hook as they're decoded, we can byteify any strings or lists inside them at that point and avoid the need for deep recursion later.

Mark's answer isn't suitable for use as an object_hook as it stands, because it recurses into nested dictionaries. This answer prevents that recursion with the ignore_dicts parameter to _byteify, which is passed as True at all times except when object_hook passes it a new dict to byteify. The ignore_dicts flag tells _byteify to ignore dicts since they have already been byteified.
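The effect of the flag can be checked by counting how often a dict is actually converted. The sketch below is an instrumented variant of `_byteify`, not the answer's code itself: the `dict_conversions` counter and the `text_type` shim are assumptions added so that it runs on Python 3 as well (where the converted strings are `bytes`).

```python
import json
import sys

# Assumption: pick the text type for the running interpreter.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821

# Hypothetical counter, used only to observe the behaviour.
dict_conversions = [0]


def _byteify(data, ignore_dicts=False):
    if isinstance(data, text_type):
        return data.encode('utf-8')
    if isinstance(data, list):
        return [_byteify(item, ignore_dicts=True) for item in data]
    if isinstance(data, dict) and not ignore_dicts:
        dict_conversions[0] += 1
        return {
            _byteify(key, ignore_dicts=True): _byteify(value, ignore_dicts=True)
            for key, value in data.items()
        }
    # Dicts reached with ignore_dicts=True were already converted by an
    # earlier object_hook call, so they are returned untouched.
    return data


result = json.loads('{"a": {"b": {"c": ["d"]}}}', object_hook=_byteify)
print(dict_conversions[0])
```

The input contains three nested objects and the counter ends at three: each dict is converted exactly once, at the moment the decoder passes it to the hook, and never revisited from its parent.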

Finally, our implementations of json_load_byteified and json_loads_byteified call _byteify (with ignore_dicts=True) on the result returned from json.load or json.loads to handle the case where the JSON text being decoded doesn't have a dict at the top level.
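The reason for that final call is that object_hook only ever fires for JSON objects. A minimal sketch, using a hypothetical `counting_hook` that returns its argument unchanged:

```python
import json

hook_calls = []


def counting_hook(decoded_dict):
    # Record every dict the decoder passes in, then return it as-is.
    hook_calls.append(decoded_dict)
    return decoded_dict


# Neither of these documents contains a JSON object, so the hook is
# never invoked and the strings come back unconverted.
top_level_string = json.loads('"I am a top-level string"',
                              object_hook=counting_hook)
top_level_list = json.loads('["I am inside a list"]',
                            object_hook=counting_hook)
calls_without_dicts = len(hook_calls)

# A list that contains one object triggers the hook once, for that
# object only; the enclosing list itself is never passed to the hook.
json.loads('[{"qux": "baz"}]', object_hook=counting_hook)
calls_with_one_dict = len(hook_calls) - calls_without_dicts
```

Strings and lists that sit outside any dict therefore need the extra `_byteify(..., ignore_dicts=True)` pass on the value returned by the decoder.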
