Making object JSON serializable with regular encoder

Question

The regular way of JSON-serializing custom non-serializable objects is to subclass json.JSONEncoder and then pass a custom encoder to dumps.

It usually looks like this:

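import json

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        # foo stands for the custom class that provides to_json()
        if isinstance(obj, foo):
            return obj.to_json()

        # Let the base class raise TypeError for unsupported types
        return json.JSONEncoder.default(self, obj)

print(json.dumps(obj, cls=CustomEncoder))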

What I'm trying to do is make something serializable with the default encoder. I looked around but couldn't find anything. My thought is that there would be some field the encoder looks at to determine the JSON encoding, something similar to __str__, perhaps a __json__ field. Is there something like this in Python?

I want one class of a module I'm writing to be JSON serializable for everyone who uses the package, without them having to implement their own [trivial] custom encoders.

Recommended answer

As I said in a comment to your question, after looking at the json module's source code, it does not appear to lend itself to doing what you want. However the goal could be achieved by what is known as monkey-patching (see question What is a monkey patch?). This could be done in your package's __init__.py initialization script and would affect all subsequent json module serialization since modules are generally only loaded once and the result is cached in sys.modules.

The patch changes the default json encoder's default method—the default default().

Here's an example implemented as a standalone module for simplicity's sake:

Module: make_json_serializable.py

""" Module that monkey-patches json module when it's imported so
JSONEncoder.default() automatically checks for a special "to_json()"
method and uses it to encode the object if found.
"""
from json import JSONEncoder

def _default(self, obj):
    return getattr(obj.__class__, "to_json", _default.default)(obj)

_default.default = JSONEncoder.default  # Save unmodified default.
JSONEncoder.default = _default # Replace it.

Using it is trivial since the patch is applied by simply importing the module.

Sample client script:

import json
import make_json_serializable  # apply monkey-patch

class Foo(object):
    def __init__(self, name):
        self.name = name
    def to_json(self):  # New special method.
        """ Convert to JSON format string representation. """
        return '{"name": "%s"}' % self.name

foo = Foo('sazpaz')
print(json.dumps(foo))  # -> "{\"name\": \"sazpaz\"}"

To retain the object type information, the special method can also include it in the string returned:

        return ('{"type": "%s", "name": "%s"}' %
                 (self.__class__.__name__, self.name))

Which produces the following JSON that now includes the class name:

"{\"type\": \"Foo\", \"name\": \"sazpaz\"}"

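Because to_json() above returns a string, json.dumps() encodes that string as a single JSON string value, which is why the output is wrapped in quotes with the inner quotes escaped. As a variation that is not part of the original answer, the special method could instead return a plain dict, which the encoder then serializes as a nested JSON object. The sketch below assumes that variation; keeping the saved method in a module-level name (`_original_default`) is also just an illustrative choice.

```python
import json
from json import JSONEncoder

_original_default = JSONEncoder.default  # keep the unpatched method

def _default(self, obj):
    # Use the class's to_json() if it defines one (assumed convention).
    to_json = getattr(type(obj), "to_json", None)
    if to_json is None:
        return _original_default(self, obj)  # raises TypeError as usual
    return to_json(obj)

JSONEncoder.default = _default  # apply the monkey-patch

class Foo:
    def __init__(self, name):
        self.name = name

    def to_json(self):
        # Return JSON-compatible data rather than a pre-formatted string.
        return {"type": type(self).__name__, "name": self.name}

encoded = json.dumps(Foo("sazpaz"), sort_keys=True)
print(encoded)
```

With this variation, json.loads(encoded) gives back a dictionary directly instead of a string that needs a second parsing pass.
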
Magick Lies Here

Even better than having the replacement default() look for a specially named method would be for it to serialize most Python objects automatically, including user-defined class instances, without needing to add a special method. After researching a number of alternatives, the following, which uses the pickle module, seemed closest to that ideal to me:

Module: make_json_serializable2.py

""" Module that imports the json module and monkey-patches it so
JSONEncoder.default() automatically pickles any Python objects
encountered that aren't standard JSON data types.
"""
from json import JSONEncoder
import pickle

def _default(self, obj):
    return {'_python_object': pickle.dumps(obj)}

JSONEncoder.default = _default  # Replace with the above.

Of course, not everything can be pickled (extension types, for example). However, the pickle protocol defines ways to handle them by writing special methods, similar to what you suggested and I described earlier, but doing that would likely be necessary in far fewer cases.
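
As a rough illustration of those special methods, here is a minimal sketch using __getstate__() and __setstate__(). The Counter class is hypothetical and exists only for this example: it holds a threading.Lock, which pickle cannot serialize, so the lock is dropped when pickling and recreated when unpickling.

```python
import pickle
import threading

class Counter:
    """Hypothetical class holding a member that pickle cannot handle."""

    def __init__(self, count=0):
        self.count = count
        self._lock = threading.Lock()  # locks cannot be pickled

    def __getstate__(self):
        # Copy the instance state and leave out the unpicklable lock.
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the saved attributes and create a fresh lock.
        self.__dict__.update(state)
        self._lock = threading.Lock()

restored = pickle.loads(pickle.dumps(Counter(3)))
print(restored.count)
```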

Regardless, using the pickle protocol also means it would be fairly easy to reconstruct the original Python object by passing a custom object_hook function to any json.loads() call. The hook unpickles the value whenever the dictionary passed in has a '_python_object' key. Something like:

def as_python_object(dct):
    try:
        return pickle.loads(str(dct['_python_object']))
    except KeyError:
        return dct

pyobj = json.loads(json_str, object_hook=as_python_object)

If this has to be done in many places, it might be worthwhile to define a wrapper function that automatically supplies the extra keyword argument:

import functools

json_pkloads = functools.partial(json.loads, object_hook=as_python_object)

pyobj = json_pkloads(json_str)

Naturally, this could be monkey-patched into the json module as well, making the function the default object_hook (instead of None).
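
One way such a patch might look is sketched below. It is not from the original answer: it wraps json.loads() so the hook is filled in only when the caller passes no object_hook of their own. The names `_unpickle_hook`, `_original_loads`, and `_loads` are hypothetical, and the sketch assumes Python 3 with the pickled bytes stored as latin1-decoded text.

```python
import json
import pickle

def _unpickle_hook(dct):
    # Dictionaries tagged with '_python_object' hold latin1-decoded pickle data.
    try:
        return pickle.loads(dct["_python_object"].encode("latin1"))
    except KeyError:
        return dct

_original_loads = json.loads  # keep the unpatched function

def _loads(s, **kwargs):
    # Only supply the hook when the caller did not pass object_hook.
    kwargs.setdefault("object_hook", _unpickle_hook)
    return _original_loads(s, **kwargs)

json.loads = _loads  # later json.loads() calls now use the hook by default

json_str = json.dumps(
    {"_python_object": pickle.dumps({1, 2, 3}).decode("latin1")})
pyobj = json.loads(json_str)
print(pyobj)
```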

I got the idea for using pickle from an answer by Raymond Hettinger to another JSON serialization question, whom I consider exceptionally credible as well as an official source (as in Python core developer).

The make_json_serializable2.py module above does not work as shown in Python 3 because pickle.dumps() returns a bytes object, which the JSONEncoder can't handle. However, the approach is still valid. A simple way to work around the issue is to latin1 "decode" the value returned from pickle.dumps() and then "encode" it back to latin1 before passing it on to pickle.loads() in the as_python_object() function. This works because arbitrary binary strings are valid latin1, which can always be decoded to Unicode and then encoded back to the original string again (as pointed out in this answer by Sven Marnach).

(Although the following works fine in Python 2, the latin1 decoding and encoding it does is superfluous.)

import json
import pickle
from decimal import Decimal

class PythonObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return {'_python_object': pickle.dumps(obj).decode('latin1')}

def as_python_object(dct):
    try:
        return pickle.loads(dct['_python_object'].encode('latin1'))
    except KeyError:
        return dct

data = [1,2,3, set(['knights', 'who', 'say', 'ni']), {'key':'value'},
        Decimal('3.14')]
j = json.dumps(data, cls=PythonObjectEncoder, indent=4)
data2 = json.loads(j, object_hook=as_python_object)
assert data == data2  # both should be same
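
The latin1 round trip that this version relies on can also be checked in isolation. The short sketch below (variable names are illustrative) pickles a set, decodes the bytes as latin1 text, passes that text through JSON, and confirms that encoding it back yields the same bytes.

```python
import json
import pickle

original = {"knights", "who", "say", "ni"}
pickled = pickle.dumps(original)        # bytes in Python 3

as_text = pickled.decode("latin1")      # each byte maps to one code point
json_str = json.dumps(as_text)          # now an ordinary JSON string
restored_bytes = json.loads(json_str).encode("latin1")

assert restored_bytes == pickled        # byte-for-byte round trip
assert pickle.loads(restored_bytes) == original
```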
