Making object JSON serializable with regular encoder


Problem description

The regular way of JSON-serializing custom non-serializable objects is to subclass json.JSONEncoder and then pass a custom encoder to json.dumps().

It usually looks like this:

import json

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Foo):
            return obj.to_json()

        return json.JSONEncoder.default(self, obj)

print(json.dumps(obj, cls=CustomEncoder))

What I'm trying to do is make something serializable with the default encoder. I looked around but couldn't find anything. My thought is that there would be some field the encoder looks at to determine the JSON encoding. Something similar to __str__. Perhaps a __json__ field. Is there something like this in Python?

I want to make one class in a module I'm writing JSON serializable for everyone who uses the package, without them having to worry about implementing their own [trivial] custom encoders.

Solution

As I said in a comment to your question, after looking at the json module's source code, it does not appear to lend itself to doing what you want. However, the goal could be achieved by what is known as monkey-patching (see the question What is a monkey patch?). This could be done in your package's __init__.py initialization script and would affect all subsequent json module serialization, since modules are generally only loaded once and the result is cached in sys.modules.

The patch changes the default json encoder's default method—the default default().

Here's an example implemented as a standalone module for simplicity's sake:

Module: make_json_serializable.py

""" Module that monkey-patches json module when it's imported so
JSONEncoder.default() automatically checks for a special "to_json()"
method and uses it to encode the object if found.
"""
from json import JSONEncoder

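def _default(self, obj):
    # Use the class's to_json() when it has one; otherwise fall back to the
    # saved original, passing self so it raises the usual TypeError.
    to_json = getattr(obj.__class__, "to_json", None)
    if to_json is not None:
        return to_json(obj)
    return _default.default(self, obj)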

_default.default = JSONEncoder.default  # Save unmodified default.
JSONEncoder.default = _default # Replace it.
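
To see what the patch does for objects with and without the special method, here is a minimal standalone sketch. The class names WithHook and WithoutHook are hypothetical, and the saved original is kept in a module-level name (_original_default) rather than as a function attribute. It passes self explicitly to the saved original so that the stock TypeError surfaces for unsupported objects.

```python
import json
from json import JSONEncoder

_original_default = JSONEncoder.default  # keep a reference to the stock method


def _default(self, obj):
    # Prefer the object's own to_json(); otherwise defer to the stock default().
    to_json = getattr(obj.__class__, "to_json", None)
    if to_json is not None:
        return to_json(obj)
    return _original_default(self, obj)  # raises TypeError for unknown types


JSONEncoder.default = _default  # apply the monkey-patch


class WithHook(object):  # hypothetical class that opts in
    def to_json(self):
        return "encoded by to_json"


class WithoutHook(object):  # hypothetical class that does not
    pass


encoded = json.dumps(WithHook())

try:
    json.dumps(WithoutHook())
    fallback_error = None
except TypeError as exc:
    fallback_error = str(exc)

print(encoded)
print(fallback_error)
```

Objects that define to_json() are encoded through it, while everything else still fails with the same kind of error the unpatched encoder would raise.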

Using it is trivial since the patch is applied by simply importing the module.

Sample client script:

import json
import make_json_serializable  # apply monkey-patch

class Foo(object):
    def __init__(self, name):
        self.name = name
    def to_json(self):  # New special method.
        """ Convert to JSON format string representation. """
        return '{"name": "%s"}' % self.name

foo = Foo('sazpaz')
print(json.dumps(foo))  # -> "{\"name\": \"sazpaz\"}"

To retain the object type information, the special method can also include it in the string returned:

        return ('{"type": "%s", "name": "%s"}' %
                 (self.__class__.__name__, self.name))

Which produces the following JSON that now includes the class name:

"{\"type\": \"Foo\", \"name\": \"sazpaz\"}"
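
Because to_json() returns a string here, the encoder emits it as a JSON string (with the inner quotes escaped) rather than as a JSON object. If a nested object is wanted instead, one option is to have the method return a dict. The sketch below contrasts the two. It is only an illustration: FooStr, FooDict and call_to_json are hypothetical names, and it uses the standard default= keyword of json.dumps() instead of the monkey-patch so that it runs on its own.

```python
import json


class FooStr(object):  # hypothetical: to_json() returns a string
    def __init__(self, name):
        self.name = name

    def to_json(self):
        return '{"name": "%s"}' % self.name


class FooDict(object):  # hypothetical: to_json() returns a dict
    def __init__(self, name):
        self.name = name

    def to_json(self):
        return {"name": self.name}


def call_to_json(obj):
    # Stand-in for the patched default(): delegate to the object's method.
    return obj.to_json()


as_string = json.dumps(FooStr('sazpaz'), default=call_to_json)
as_object = json.dumps(FooDict('sazpaz'), default=call_to_json)

print(as_string)  # a JSON string whose content happens to look like JSON
print(as_object)  # a JSON object
```

Decoding the first result with json.loads() gives back a Python str, while decoding the second gives back a dict.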

Magick Lies Here

Even better than having the replacement default() look for a specially named method would be for it to serialize most Python objects automatically, including user-defined class instances, without needing to add a special method. After researching a number of alternatives, the following, which uses the pickle module and is based on an answer by @Raymond Hettinger to another question, seemed closest to that ideal to me:

Module: make_json_serializable2.py

""" Module that imports the json module and monkey-patches it so
JSONEncoder.default() automatically pickles any Python objects
encountered that aren't standard JSON data types.
"""
from json import JSONEncoder
import pickle

def _default(self, obj):
    return {'_python_object': pickle.dumps(obj)}

JSONEncoder.default = _default  # Replace with the above.

Of course, not everything can be pickled (extension types, for example). However, there are ways to handle such objects via the pickle protocol by writing special methods, similar to what you suggested and I described earlier, and doing that would likely be necessary in far fewer cases.
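
As an illustration of those special methods, here is a minimal sketch of a class that holds something pickle cannot handle (a threading.Lock) and uses __getstate__() and __setstate__() to leave it out and recreate it. The Counter class is hypothetical and not part of the answer's modules.

```python
import pickle
import threading


class Counter(object):  # hypothetical class with an unpicklable attribute
    def __init__(self, count=0):
        self.count = count
        self._lock = threading.Lock()  # lock objects cannot be pickled

    def __getstate__(self):
        # Copy the instance dict and drop the lock before pickling.
        state = self.__dict__.copy()
        del state['_lock']
        return state

    def __setstate__(self, state):
        # Restore the saved attributes and create a fresh lock.
        self.__dict__.update(state)
        self._lock = threading.Lock()


restored = pickle.loads(pickle.dumps(Counter(3)))
print(restored.count)
```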

Deserializing

Regardless, using the pickle protocol also means it would be fairly easy to reconstruct the original Python object by passing a custom object_hook function to any json.loads() call. The hook checks the dictionary passed in for a '_python_object' key and unpickles its value whenever it has one. Something like:

def as_python_object(dct):
    try:
        return pickle.loads(str(dct['_python_object']))
    except KeyError:
        return dct

pyobj = json.loads(json_str, object_hook=as_python_object)

If this has to be done in many places, it might be worthwhile to define a wrapper function that automatically supplies the extra keyword argument:

import functools

json_pkloads = functools.partial(json.loads, object_hook=as_python_object)

pyobj = json_pkloads(json_str)

Naturally, this could be monkey-patched into the json module as well, making the function the default object_hook (instead of None).
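
One way that patch might look is sketched below. This is an assumption about how to do it, not something taken from the json module's documentation: it replaces json.loads with a functools.partial wrapper, so callers can still override object_hook by keyword. It is written for Python 3, so it includes the latin1 step discussed in the portability section, and it pickles a built-in set to keep the example standalone.

```python
import functools
import json
import pickle


def as_python_object(dct):
    # Unpickle dictionaries produced by the pickling encoder; pass others through.
    try:
        return pickle.loads(dct['_python_object'].encode('latin1'))
    except KeyError:
        return dct


_original_loads = json.loads  # keep the stock function around
json.loads = functools.partial(_original_loads, object_hook=as_python_object)

# Build a payload in the same shape the pickling encoder would produce.
payload = json.dumps(
    {'_python_object': pickle.dumps({1, 2, 3}).decode('latin1')})

result = json.loads(payload)                 # hook applied by default
raw = json.loads(payload, object_hook=None)  # explicit override still works

print(result)
print(type(raw))
```

Since pickle.loads() can execute arbitrary code, a default hook like this only makes sense for JSON that comes from a trusted source.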

I got the idea for using pickle from an answer by Raymond Hettinger to another JSON serialization question, whom I consider exceptionally credible as well as an official source (as in Python core developer).

Portability to Python 3

The code above does not work as shown in Python 3 because pickle.dumps() returns a bytes object, which the JSONEncoder can't handle. However, the approach is still valid. A simple way to work around the issue is to latin1 "decode" the value returned from pickle.dumps() and then "encode" it back to latin1 before passing it on to pickle.loads() in the as_python_object() function. This works because arbitrary binary strings are valid latin1, which can always be decoded to Unicode and then encoded back to the original string again (as pointed out in this answer by Sven Marnach).

(Although the following works fine in Python 2, the latin1 decoding and encoding it does is superfluous.)

import json
import pickle
from decimal import Decimal

class PythonObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return {'_python_object': pickle.dumps(obj).decode('latin1')}


def as_python_object(dct):
    try:
        return pickle.loads(dct['_python_object'].encode('latin1'))
    except KeyError:
        return dct


class Foo(object):  # Some user-defined class.
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if type(other) is type(self):  # Instances of same class?
            return self.name == other.name
        return NotImplemented

    __hash__ = None


data = [1,2,3, set(['knights', 'who', 'say', 'ni']), {'key':'value'},
        Foo('Bar'), Decimal('3.141592653589793238462643383279502884197169')]
j = json.dumps(data, cls=PythonObjectEncoder, indent=4)
data2 = json.loads(j, object_hook=as_python_object)
assert data == data2  # both should be same
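
The latin1 round trip that the code above relies on can also be checked in isolation. This small sketch (variable names are illustrative) runs every possible byte value through decode and encode, both directly and by way of a JSON string:

```python
import json

raw = bytes(bytearray(range(256)))  # every possible byte value, 0 through 255

text = raw.decode('latin1')          # each byte maps to one code point
round_tripped = text.encode('latin1')

# The same text also survives being written to and read back from JSON.
via_json = json.loads(json.dumps(text)).encode('latin1')

print(round_tripped == raw, via_json == raw)
```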
