Django MPTT efficiently serializing relational data with DRF


Question

I have a Category model that is an MPTT model. It has a many-to-many relation to Group, and I need to serialize the tree with related counts. Imagine my Category tree is this:

Root (related to 1 group)
 - Branch (related to 2 groups) 
    - Leaf (related to 3 groups)
...

So the serialized output would look like this:

{ 
    id: 1, 
    name: 'root1', 
    full_name: 'root1',
    group_count: 6,
    children: [
    {
        id: 2,
        name: 'branch1',
        full_name: 'root1 - branch1',
        group_count: 5,
        children: [
        {
            id: 3,
            name: 'leaf1',
            full_name: 'root1 - branch1 - leaf1',
            group_count: 3,
            children: []
        }]
    }]
}

This is my current super inefficient implementation:

Model

class Category(MPTTModel):
    name = ...
    parent = ... (related_name='children')

    def get_full_name(self):
        names = self.get_ancestors(include_self=True).values('name')
        full_name = ' - '.join(map(lambda x: x['name'], names))
        return full_name

    def get_group_count(self):
        cats = self.get_descendants(include_self=True)
        return Group.objects.filter(categories__in=cats).count()

View

class CategoryViewSet(ModelViewSet):
    def list(self, request):
        tree = cache_tree_children(Category.objects.filter(level=0))
        serializer = CategorySerializer(tree, many=True)
        return Response(serializer.data)

Serializer

class RecursiveField(serializers.Serializer):
    def to_native(self, value):
        return self.parent.to_native(value)


class CategorySerializer(serializers.ModelSerializer):
    children = RecursiveField(many=True, required=False)
    full_name = serializers.Field(source='get_full_name')
    group_count = serializers.Field(source='get_group_count')

    class Meta:
        model = Category
        fields = ('id', 'name', 'children', 'full_name', 'group_count')

This works, but it hits the DB with an insane number of queries. There are also additional relations, not just Group. Is there a way to make this efficient? How can I write my own serializer?

Recommended answer

You are definitely running into an N+1 query issue, which I have covered in detail in another Stack Overflow answer. I would recommend reading up on optimizing queries in Django, as this is a very common issue.

Django MPTT also has a few N+1 problems of its own that you are going to need to work around. Both the self.get_ancestors and self.get_descendants methods create a new queryset, which in your case happens for every object that you are serializing. You may want to look for a way to avoid these calls; I've described a possible improvement below.

In your get_full_name method, you are calling self.get_ancestors in order to generate the chain that is being used. Considering you always have the parent when you are generating the output, you may benefit from moving this to a SerializerMethodField that reuses the parent object to generate the name. Something like the following may work:

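from rest_framework import serializers


class RecursiveField(serializers.Serializer):

    # DRF 2.x hook (to_native); DRF 3 renamed it to to_representation.
    def to_native(self, value):
        # Serialize the child with CategorySerializer, passing the parent
        # object and its serializer along so the child can build its name.
        context = {"parent": self.parent.object, "parent_serializer": self.parent}
        return CategorySerializer(value, context=context).data


class CategorySerializer(serializers.ModelSerializer):
    children = RecursiveField(many=True, required=False)
    full_name = serializers.SerializerMethodField("get_full_name")
    group_count = serializers.Field(source='get_group_count')

    class Meta:
        model = Category
        fields = ('id', 'name', 'children', 'full_name', 'group_count')

    def get_full_name(self, obj):
        name = obj.name

        # Root nodes have no parent in the context and keep their own name.
        if "parent" in self.context:
            parent = self.context["parent"]

            # Reuse the parent's serializer instead of calling get_ancestors().
            parent_name = self.context["parent_serializer"].get_full_name(parent)

            name = "%s - %s" % (parent_name, name)

        return name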

You may need to edit this code slightly, but the general idea is that you don't always need to get the ancestors because you will have the ancestor chain already.
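
The same idea, reusing the parent's already-computed name instead of querying for ancestors, can be sketched outside DRF in plain Python. This is only an illustration under stated assumptions, not the method from the answer: it assumes the categories were fetched in a single query in depth-first order (for MPTT, ordered by tree_id and lft) and reduced to dicts with id, name and level keys. The function name build_tree and the row layout are hypothetical.

```python
def build_tree(rows):
    """Nest depth-first-ordered rows and derive full_name from the parent chain.

    Each row is a dict with 'id', 'name' and 'level' (0 for roots). The rows
    are assumed to be in depth-first order; no database access happens here.
    """
    roots = []
    stack = []  # nodes on the path from the current root to the last node seen
    for row in rows:
        # Drop finished branches so the top of the stack is this row's parent.
        del stack[row["level"]:]
        node = {
            "id": row["id"],
            "name": row["name"],
            "full_name": row["name"],
            "children": [],
        }
        if stack:
            parent = stack[-1]
            # The parent's full name is already known, so no ancestor lookup.
            node["full_name"] = "%s - %s" % (parent["full_name"], row["name"])
            parent["children"].append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots


rows = [
    {"id": 1, "name": "root1", "level": 0},
    {"id": 2, "name": "branch1", "level": 1},
    {"id": 3, "name": "leaf1", "level": 2},
]
tree = build_tree(rows)
```

Because every node's name is derived from the node directly above it on the stack, the whole tree is named in one pass over the rows.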

This doesn't solve the Group queries, which you may not be able to optimize, but it should at least reduce your queries. Recursive queries are incredibly difficult to optimize, and they usually take a lot of planning to figure out how you can best get the required data without falling back to N+1 situations.
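
One possible way to attack the Group counts, offered as an untested assumption rather than as part of the answer, is to load all (category_id, group_id) pairs once and aggregate them in Python. In Django the pairs might come from the many-to-many through table, for example Group.categories.through.objects.values_list("category_id", "group_id"), assuming the relation is reachable as categories as the filter in the question suggests. The sketch below is plain Python over nested dicts; index_groups and attach_group_counts are hypothetical helper names.

```python
from collections import defaultdict


def index_groups(pairs):
    """Turn (category_id, group_id) pairs into {category_id: {group_id, ...}}."""
    groups_by_category = defaultdict(set)
    for category_id, group_id in pairs:
        groups_by_category[category_id].add(group_id)
    return groups_by_category


def attach_group_counts(nodes, groups_by_category):
    """Set node['group_count'] on every node in the nested tree.

    The count covers distinct groups linked to the node or to any of its
    descendants. Returns the union of group ids found under `nodes` so the
    caller one level up can merge it into its own set.
    """
    seen = set()
    for node in nodes:
        groups = set(groups_by_category.get(node["id"], ()))
        groups |= attach_group_counts(node["children"], groups_by_category)
        node["group_count"] = len(groups)
        seen |= groups
    return seen


# Root linked to 1 group, branch to 2, leaf to 3, as in the question.
tree = [{"id": 1, "children": [{"id": 2, "children": [{"id": 3, "children": []}]}]}]
pairs = [(1, 10), (2, 20), (2, 21), (3, 30), (3, 31), (3, 32)]
attach_group_counts(tree, index_groups(pairs))
```

Sets are used instead of summing counts so that a group attached to both a category and one of its descendants is counted once, which matches what the original filter(categories__in=cats) query was meant to express. The trade-off is that all pairs are held in memory, which may not suit very large tables.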