How to make json.dumps in Python ignore a non-serializable field
Question
I am parsing some binary data with the Construct 2.9 library and want to serialize the result to JSON.
packet is an instance of the Construct class Container.
Apparently it contains a hidden _io entry of type BytesIO; see the output of dict(packet) below:
{
'packet_length': 76, 'uart_sent_time': 1, 'frame_number': 42958,
'subframe_number': 0, 'checksum': 33157, '_io': <_io.BytesIO object at 0x7f81c3153728>,
'platform': 661058, 'sync': 506660481457717506, 'frame_margin': 20642,
'num_tlvs': 1, 'track_process_time': 593, 'chirp_margin': 78,
'timestamp': 2586231182, 'version': 16908293
}
Now, calling json.dumps(packet) obviously leads to a TypeError:
...
  File "/usr/lib/python3.5/json/__init__.py", line 237, in dumps
    **kw).encode(obj)
  File "/usr/lib/python3.5/json/encoder.py", line 198, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.5/json/encoder.py", line 256, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.5/json/encoder.py", line 179, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <_io.BytesIO object at 0x7f81c3153728> is not JSON serializable
What confuses me, however, is that running json.dumps(packet, skipkeys=True) results in the exact same error, while I would expect it to skip the _io field. What is the problem here? Why is skipkeys not allowing me to skip the _io field?
I got the code to work by overriding JSONEncoder and returning None for fields of BytesIO type, but that means my serialized string contains loads of "_io": null elements, which I would prefer not to have at all...
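The workaround described above might look roughly like the following. This is only a sketch of the idea, not the asker's actual code: the class name NullingEncoder is hypothetical, and a plain dict stands in for the Construct Container.

```python
import io
import json


class NullingEncoder(json.JSONEncoder):
    """Hypothetical encoder: replaces BytesIO values with None (JSON null)."""

    def default(self, o):
        if isinstance(o, io.BytesIO):
            return None  # the key stays in the output, with a null value
        return super().default(o)  # raises TypeError for anything else


# Plain dict standing in for a Construct Container (an assumption).
packet = {"packet_length": 76, "_io": io.BytesIO(b"")}
encoded = json.dumps(packet, cls=NullingEncoder, sort_keys=True)
print(encoded)  # {"_io": null, "packet_length": 76}
```

The "_io" key is still emitted because the default hook only gets to choose a replacement value; it never sees the key.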
Recommended answer
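Keys with a leading _ underscore are not really 'hidden'; they are just more strings to JSON. The Construct Container class is just a dictionary with ordering, and the _io key is not anything special to that class. The skipkeys option does not help here because it only skips dictionary keys that are not of a basic type (str, int, float, bool or None); the _io key is a perfectly valid string, and it is its BytesIO value that cannot be encoded.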
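To make the key-versus-value distinction concrete, here is a small sketch using only the standard library. The two dictionaries are made-up examples, not Construct output.

```python
import io
import json

# skipkeys drops entries whose KEY is not a str, int, float, bool or None.
bad_key = {("a", "tuple"): 1, "ok": 2}
skipped = json.dumps(bad_key, skipkeys=True)
print(skipped)  # only the "ok" entry survives

# It does nothing for a VALUE the encoder cannot handle.
bad_value = {"_io": io.BytesIO(b""), "ok": 2}
try:
    json.dumps(bad_value, skipkeys=True)
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```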
You have two options:
- Implement a default hook that just returns a replacement value.
- Filter out the key-value pairs that you know can't work before serialising.
There is perhaps a third option, but a casual scan of the Construct project pages doesn't tell me if it is available: have Construct output JSON, or at least a JSON-compatible dictionary, perhaps by using adapters.
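For a flat container, the filtering option can be as small as one dict comprehension. The sketch below assumes a plain dict in place of the Construct Container and only looks at the top level; nested containers need a recursive approach.

```python
import io
import json

# Plain dict standing in for a Construct Container (an assumption).
packet = {"packet_length": 76, "frame_number": 42958, "_io": io.BytesIO(b"")}

# Drop every key that starts with an underscore, top level only.
visible = {k: v for k, v in packet.items() if not str(k).startswith("_")}
encoded = json.dumps(visible)
print(encoded)
```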
The default hook can't prevent the _io key from being added to the output, but it would at least let you avoid the error:
json.dumps(packet, default=lambda o: '<not serializable>')
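As a quick illustration of what that call produces, here is the same hook applied to a small stand-in dict (an assumption; a real Container would carry more fields):

```python
import io
import json

# Plain dict standing in for a Construct Container (an assumption).
packet = {"num_tlvs": 1, "_io": io.BytesIO(b"")}
encoded = json.dumps(packet, default=lambda o: "<not serializable>")
print(encoded)  # {"num_tlvs": 1, "_io": "<not serializable>"}
```

The placeholder string takes the place of the BytesIO value, so the key is kept but the call no longer raises.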
Filtering can be done recursively; the @functools.singledispatch decorator can help keep such code clean:
from functools import singledispatch

# sentinel marking values that must be dropped from the output
_cant_serialize = object()


@singledispatch
def json_serializable(obj, skip_underscore=False):
    """Filter a Python object to only include serializable object types

    In dictionaries, keys are converted to strings; if skip_underscore is true
    then keys starting with an underscore ("_") are skipped.

    """
    # default handler, called for anything without a specific
    # type registration.
    return _cant_serialize


@json_serializable.register(dict)
def _handle_dict(d, skip_underscore=False):
    converted = ((str(k), json_serializable(v, skip_underscore))
                 for k, v in d.items())
    if skip_underscore:
        converted = ((k, v) for k, v in converted if k[:1] != '_')
    return {k: v for k, v in converted if v is not _cant_serialize}


@json_serializable.register(list)
@json_serializable.register(tuple)
def _handle_sequence(seq, skip_underscore=False):
    converted = (json_serializable(v, skip_underscore) for v in seq)
    return [v for v in converted if v is not _cant_serialize]


@json_serializable.register(int)
@json_serializable.register(float)
@json_serializable.register(str)
@json_serializable.register(bool)  # redundant, supported as int subclass
@json_serializable.register(type(None))
def _handle_default_scalar_types(value, skip_underscore=False):
    return value
I gave the above implementation an additional skip_underscore argument too, to explicitly skip keys that have a _ character at the start. This would help skip all additional 'hidden' attributes the Construct library is using.
Since Container is a dict subclass, the above code will automatically handle instances such as packet. Pass the filtered result on to json.dumps, for example json.dumps(json_serializable(packet, skip_underscore=True)).
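The reason dict subclasses are picked up is that singledispatch walks the method resolution order of the argument's type until it finds a registered class. A minimal sketch of that behaviour follows; FakeContainer and describe are hypothetical names used only for illustration, with FakeContainer standing in for construct.Container.

```python
from collections import OrderedDict
from functools import singledispatch


class FakeContainer(OrderedDict):
    """Hypothetical stand-in for construct.Container (a dict subclass)."""


@singledispatch
def describe(obj):
    # fallback for any type without a registration
    return "unregistered"


@describe.register(dict)
def _describe_dict(d):
    return "handled as dict"


result = describe(FakeContainer(a=1))
print(result)  # handled as dict
fallback = describe(object())
print(fallback)  # unregistered
```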