Nested Python dataclasses with list annotations


Question

Python ^3.7. I am trying to create nested dataclasses to work with a complex JSON response. I managed to do that by creating a dataclass for every level of the JSON and using __post_init__ to set fields as objects of other dataclasses. However, that creates a lot of boilerplate code, and there is no annotation for the nested objects.

This answer helped me get closer to a solution using a wrapper:

https://stackoverflow.com/a/51565863/8325015

However, it does not cover cases where an attribute is a list of objects, e.g. some_attribute: List[SomeClass]

Here is an example that resembles my data:

from dataclasses import dataclass, is_dataclass
from typing import List
from copy import deepcopy

# decorator from the linked thread:
def nested_deco(*args, **kwargs):
    def wrapper(check_class):

        # passing class to investigate
        check_class = dataclass(check_class, **kwargs)
        o_init = check_class.__init__

        def __init__(self, *args, **kwargs):

            for name, value in kwargs.items():

                # getting field type
                ft = check_class.__annotations__.get(name, None)

                if is_dataclass(ft) and isinstance(value, dict):
                    obj = ft(**value)
                    kwargs[name] = obj

            # call the original __init__ once, after all kwargs have been processed
            o_init(self, *args, **kwargs)

        check_class.__init__ = __init__

        return check_class

    return wrapper(args[0]) if args else wrapper


#some dummy dataclasses to resemble my data structure

@dataclass
class IterationData:
    question1: str
    question2: str


@nested_deco
@dataclass
class IterationResult:
    name: str
    data: IterationData


@nested_deco
@dataclass
class IterationResults:
    iterations: List[IterationResult]


@dataclass
class InstanceData:
    date: str
    owner: str


@nested_deco
@dataclass
class Instance:
    data: InstanceData
    name: str


@nested_deco
@dataclass
class Result:
    status: str
    iteration_results: IterationResults


@nested_deco
@dataclass
class MergedInstance:
    instance: Instance
    result: Result


#example data

single_instance = {
    "instance": {
        "name": "example1",
        "data": {
            "date": "2021-01-01",
            "owner": "Maciek"
        }
    },
    "result": {
        "status": "complete",
        "iteration_results": [
            {
                "name": "first",
                "data": {
                    "question1": "yes",
                    "question2": "no"
                }
            }
        ]
    }
}

instances = [deepcopy(single_instance) for i in range(3)]  # created a list just to resemble my data
objres = [MergedInstance(**inst) for inst in instances]

As you will notice, nested_deco works perfectly for the attributes of MergedInstance and for the data attribute of Instance, but it does not load the IterationResults class on iteration_results of Result.

Is there a way to achieve this?

I am also attaching an example of my __post_init__ solution, which creates objects of the right classes but leaves the attributes without proper annotations:

@dataclass
class IterationData:
    question1: str
    question2: str


@dataclass
class IterationResult:
    name: str
    data: dict

    def __post_init__(self):
        self.data = IterationData(**self.data)


@dataclass
class InstanceData:
    date: str
    owner: str


@dataclass
class Instance:
    data: dict
    name: str

    def __post_init__(self):
        self.data = InstanceData(**self.data)


@dataclass
class Result:
    status: str
    iteration_results: list

    def __post_init__(self):
        self.iteration_results = [IterationResult(**res) for res in self.iteration_results]


@dataclass
class MergedInstance:
    instance: dict
    result: dict

    def __post_init__(self):
        self.instance = Instance(**self.instance)
        self.result = Result(**self.result)
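
One possible way to keep the annotations with the __post_init__ approach is to annotate each field with its real type and convert only when a raw dict (or a list of dicts) was passed in. This is a minimal sketch on a reduced version of the classes above; the isinstance checks are an assumption about how the input arrives (straight from parsed JSON), not something taken from the original code:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IterationData:
    question1: str
    question2: str


@dataclass
class IterationResult:
    name: str
    data: IterationData  # annotated with the real type instead of dict

    def __post_init__(self):
        # Convert only when a raw dict was passed in (e.g. parsed JSON).
        if isinstance(self.data, dict):
            self.data = IterationData(**self.data)


@dataclass
class Result:
    status: str
    iteration_results: List[IterationResult]

    def __post_init__(self):
        # Accept raw dicts as well as ready-made IterationResult objects.
        self.iteration_results = [
            IterationResult(**item) if isinstance(item, dict) else item
            for item in self.iteration_results
        ]


result = Result(
    status="complete",
    iteration_results=[
        {"name": "first", "data": {"question1": "yes", "question2": "no"}}
    ],
)
print(result.iteration_results[0].data.question1)  # prints: yes
```

This keeps editor hints for the nested attributes, but the boilerplate per class remains.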

Recommended answer

This doesn't really answer your question about the nested decorators, but my initial suggestion would be to save yourself a lot of hard work by making use of libraries that have tackled this same problem before.

There are a lot of well-known ones, like pydantic, which also provides data validation and is something I might recommend. If you are interested in keeping your existing dataclass structure and do not want to inherit from anything, you can use libraries such as dataclass-wizard and dataclasses-json. The latter offers a decorator approach which might interest you. Ideally, though, the goal is to find an (efficient) JSON serialization library that already offers exactly what you need.

Here is an example using the dataclass-wizard library with minimal changes needed (no need to inherit from a mixin class). Note that I had to modify your input JSON object slightly, as it otherwise didn't exactly match the dataclass schema. Apart from that, it looks like it should work as expected. I've also removed copy.deepcopy, as it's a bit slower and we don't need it (the helper functions won't directly modify the dict objects anyway, which is simple enough to test):

from dataclasses import dataclass
from typing import List

from dataclass_wizard import fromlist


@dataclass
class IterationData:
    question1: str
    question2: str


@dataclass
class IterationResult:
    name: str
    data: IterationData


@dataclass
class IterationResults:
    iterations: List[IterationResult]


@dataclass
class InstanceData:
    date: str
    owner: str


@dataclass
class Instance:
    data: InstanceData
    name: str


@dataclass
class Result:
    status: str
    iteration_results: IterationResults


@dataclass
class MergedInstance:
    instance: Instance
    result: Result


single_instance = {
    "instance": {
        "name": "example1",
        "data": {
            "date": "2021-01-01",
            "owner": "Maciek"
        }
    },
    "result": {
        "status": "complete",
        "iteration_results": {
            # Note: I've changed this part. Previously this was a list, which
            # did not match the IterationResults dataclass.
            "iterations": [
                {
                    "name": "first",
                    "data": {
                        "question1": "yes",
                        "question2": "no"
                    }
                }
            ]
        }
    }
}

instances = [single_instance for i in range(3)]  # created a list just to resemble my data

objres = fromlist(MergedInstance, instances)

for obj in objres:
    print(obj)

Using the dataclasses-json library:

from dataclasses import dataclass
from typing import List

from dataclasses_json import dataclass_json


# Same as above
...

@dataclass_json
@dataclass
class MergedInstance:
    instance: Instance
    result: Result


single_instance = {...}

instances = [single_instance for i in range(3)]  # created a list just to resemble my data

objres = [MergedInstance.from_dict(inst) for inst in instances]

for obj in objres:
    print(obj)


Bonus: Let's say you are calling an API that returns a complex JSON response, such as the one above. If you want to convert this JSON response to a dataclass schema, normally you have to write it out by hand, which can be a bit tiresome if the structure of the JSON is especially complex.

Wouldn't it be cool if there was a way to simplify the generation of a nested dataclass structure? The dataclass-wizard library comes with a CLI tool that accepts arbitrary JSON input, so it should certainly be doable to auto-generate a dataclass schema from such an input.

Assume you have these contents in a testing.json file:

{
    "instance": {
        "name": "example1",
        "data": {
            "date": "2021-01-01",
            "owner": "Maciek"
        }
    },
    "result": {
        "status": "complete",
        "iteration_results": {
            "iterations": [
                {
                    "name": "first",
                    "data": {
                        "question1": "yes",
                        "question2": "no"
                    }
                }
            ]
        }
    }
}

Then we run the following command:

wiz gs testing testing

And here are the contents of our new testing.py file:

from dataclasses import dataclass
from datetime import date
from typing import List, Union

from dataclass_wizard import JSONWizard


@dataclass
class Data(JSONWizard):
    """
    Data dataclass

    """
    instance: 'Instance'
    result: 'Result'


@dataclass
class Instance:
    """
    Instance dataclass

    """
    name: str
    data: 'Data'


@dataclass
class Data:
    """
    Data dataclass

    """
    date: date
    owner: str


@dataclass
class Result:
    """
    Result dataclass

    """
    status: str
    iteration_results: 'IterationResults'


@dataclass
class IterationResults:
    """
    IterationResults dataclass

    """
    iterations: List['Iteration']


@dataclass
class Iteration:
    """
    Iteration dataclass

    """
    name: str
    data: 'Data'


@dataclass
class Data:
    """
    Data dataclass

    """
    question1: Union[bool, str]
    question2: Union[bool, str]

That appears to more or less match the nested dataclass structure from the original question, and best of all, we didn't need to write any of the code ourselves!

However, there's a minor problem: because of some duplicate JSON keys, we end up with three dataclasses named Data. So I went ahead and renamed them to Data1, Data2, and Data3 for uniqueness. We can then do a quick test to confirm that we're able to load the same JSON data into our new dataclass schema:

import json
from dataclasses import dataclass
from datetime import date
from typing import List, Union

from dataclass_wizard import JSONWizard


@dataclass
class Data1(JSONWizard):
    """
    Data dataclass

    """
    instance: 'Instance'
    result: 'Result'


@dataclass
class Instance:
    """
    Instance dataclass

    """
    name: str
    data: 'Data2'


@dataclass
class Data2:
    """
    Data dataclass

    """
    date: date
    owner: str


@dataclass
class Result:
    """
    Result dataclass

    """
    status: str
    iteration_results: 'IterationResults'


@dataclass
class IterationResults:
    """
    IterationResults dataclass

    """
    iterations: List['Iteration']


@dataclass
class Iteration:
    """
    Iteration dataclass

    """
    name: str
    data: 'Data3'


@dataclass
class Data3:
    """
    Data dataclass

    """
    question1: Union[bool, str]
    question2: Union[bool, str]


# ---- Start of our test

with open('testing.json') as in_file:
    d = json.load(in_file)

c = Data1.from_dict(d)

print(repr(c))
# Data1(instance=Instance(name='example1', data=Data2(date=datetime.date(2021, 1, 1), owner='Maciek')), result=Result(status='complete', iteration_results=IterationResults(iterations=[Iteration(name='first', data=Data3(question1='yes', question2='no'))])))
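
As for why the rename matters: a module can bind only one object to a given name, so each later class Data statement replaces the earlier one, and a name-based lookup of 'Data' would find only the last definition. A small stdlib-only sketch of that effect (first_data is a helper name introduced here purely for illustration):

```python
from dataclasses import dataclass, fields


@dataclass
class Data:
    date: str
    owner: str


first_data = Data  # keep a handle on the first class for comparison


@dataclass
class Data:  # same name again: rebinds the module-level name `Data`
    question1: str
    question2: str


print(Data is first_data)              # False
print([f.name for f in fields(Data)])  # ['question1', 'question2']
```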
