Django multi-table inheritance alternatives for basic data model pattern


Problem description




tl;dr

Is there a simple alternative to multi-table inheritance for implementing the basic data-model pattern depicted below, in Django?

Premise

Please consider the very basic data-model pattern in the image below, based on e.g. Hay, 1996.

Simply put: Organizations and Persons are Parties, and all Parties have Addresses. A similar pattern may apply to many other situations.

The important point here is that the Address has an explicit relation with Party, rather than explicit relations with the individual sub-models Organization and Person.

Note that each sub-model introduces additional fields (not depicted here, but see code example below).

This specific example has several obvious shortcomings, but that is beside the point. For the sake of this discussion, suppose the pattern perfectly describes what we wish to achieve, so the only question that remains is how to implement the pattern in Django.

Implementation

The most obvious implementation, I believe, would use multi-table-inheritance:

from django.db import models


class Party(models.Model):
    """ Note this is a concrete model, not an abstract one. """
    name = models.CharField(max_length=20)


class Organization(Party):
    """ 
    Note that a one-to-one relation 'party_ptr' is automatically added, 
    and this is used as the primary key (the actual table has no 'id' 
    column). The same holds for Person.
    """
    type = models.CharField(max_length=20)


class Person(Party):
    favorite_color = models.CharField(max_length=20)


class Address(models.Model):
    """ 
    Note that, because Party is a concrete model, rather than an abstract
    one, we can reference it directly in a foreign key.

    Since the Person and Organization models have one-to-one relations 
    with Party which act as primary key, we can conveniently create 
    Address objects setting either party=party_instance,
    party=organization_instance, or party=person_instance.

    """
    party = models.ForeignKey(to=Party, on_delete=models.CASCADE)

This seems to match the pattern perfectly. It almost makes me believe this is what multi-table-inheritance was intended for in the first place.

However, multi-table inheritance appears to be frowned upon, especially from a performance point of view, although how much that matters depends on the application. In particular, this scary but ancient post from one of Django's creators is quite discouraging:

In nearly every case, abstract inheritance is a better approach for the long term. I’ve seen more than few sites crushed under the load introduced by concrete inheritance, so I’d strongly suggest that Django users approach any use of concrete inheritance with a large dose of skepticism.

Despite this scary warning, I guess the main point in that post is the following observation regarding multi-table inheritance:

These joins tend to be "hidden" — they’re created automatically — and mean that what look like simple queries often aren’t.

Disambiguation: The above post refers to Django's "multi-table inheritance" as "concrete inheritance", which should not be confused with Concrete Table Inheritance on the database level. The latter actually corresponds better with Django's notion of inheritance using abstract base classes.

I guess this SO question nicely illustrates the "hidden joins" issue.
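
To make the "hidden joins" point concrete, here is a minimal database-level sketch that uses Python's built-in sqlite3 module rather than Django itself. The table and column names (party, person, party_ptr_id) are assumptions chosen to mirror the layout that multi-table inheritance produces; the SQL that Django actually generates will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE party (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Child table: its primary key doubles as the link to the parent row.
    CREATE TABLE person (
        party_ptr_id   INTEGER PRIMARY KEY REFERENCES party(id),
        favorite_color TEXT NOT NULL
    );
    INSERT INTO party  VALUES (1, 'Alice');
    INSERT INTO person VALUES (1, 'green');
""")

# Something as innocent-looking as "give me all persons" needs both tables,
# because name lives on party while favorite_color lives on person.
rows = conn.execute("""
    SELECT party.name, person.favorite_color
    FROM person
    INNER JOIN party ON person.party_ptr_id = party.id
""").fetchall()
print(rows)
```

In Django the same request would be written as Person.objects.all(), with no join visible anywhere in the Python code, which is exactly why such joins are easy to overlook.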

Alternatives

Abstract inheritance does not seem like a viable alternative to me, because we cannot set a foreign key to an abstract model, which makes sense, because it has no table. I guess this implies that Address would need a separate foreign key for every "child" model, plus some extra logic to make those behave like a single relation to Party.
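
As a rough illustration of what "a foreign key for every child model plus some extra logic" could mean at the database level, here is a sketch using Python's sqlite3 module. The schema is an assumption, not something from the original post: there is no party table, each child table repeats the parent's name field, and the "extra logic" is a CHECK constraint requiring exactly one owner per address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- No party table: each child repeats the parent's fields.
    CREATE TABLE organization (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        "type" TEXT NOT NULL
    );
    CREATE TABLE person (
        id             INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        favorite_color TEXT NOT NULL
    );
    -- One nullable foreign key per child, plus the "extra logic":
    -- a CHECK that exactly one of them is filled in.
    CREATE TABLE address (
        id              INTEGER PRIMARY KEY,
        organization_id INTEGER REFERENCES organization(id),
        person_id       INTEGER REFERENCES person(id),
        CHECK ((organization_id IS NULL) <> (person_id IS NULL))
    );
    INSERT INTO person VALUES (1, 'Alice', 'green');
""")

# An address owned by exactly one person is accepted.
conn.execute("INSERT INTO address (person_id) VALUES (1)")

try:
    # Neither owner set: rejected by the CHECK constraint.
    conn.execute("INSERT INTO address (id) VALUES (99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Every additional child model would add another nullable column and another term to the constraint, which is the bookkeeping that a single foreign key to Party avoids.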

Proxy inheritance does not seem like an option either, as the sub-models each introduce extra fields. EDIT: On second thought, proxy models could be an option if we use Single Table Inheritance on the database level, i.e. use a single table that includes all the fields from Party, Organization and Person.

GenericForeignKey relations may be an option in some specific cases, but to me they are the stuff of nightmares.

As another alternative, it is often suggested to use explicit one-to-one relations (eoto for short, here) instead of multi-table-inheritance (so Party, Person and Organization would all just be subclasses of models.Model).

Both approaches, multi-table-inheritance (mti) and explicit one-to-one relations (eoto), result in three database tables. So, depending on the type of query, of course, some form of JOIN is often inevitable when retrieving data.

By inspecting the resulting tables in the database, it becomes clear that the only difference between the mti and eoto approaches, on the database level, is that an eoto Person table has an id column as primary-key, and a separate foreign-key column to Party.id, whereas an mti Person table has no separate id column, but instead uses the foreign-key to Party.id as its primary-key.
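
The difference described above can be reproduced with a small sqlite3 sketch. The table names mti_person and eoto_person are hypothetical labels for the two variants of the Person table, and the column names only approximate what Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE party (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- mti: the link to party is itself the primary key.
    CREATE TABLE mti_person (
        party_ptr_id   INTEGER PRIMARY KEY REFERENCES party(id),
        favorite_color TEXT NOT NULL
    );

    -- eoto: a separate id, plus a unique foreign key to party.
    CREATE TABLE eoto_person (
        id             INTEGER PRIMARY KEY,
        party_id       INTEGER NOT NULL UNIQUE REFERENCES party(id),
        favorite_color TEXT NOT NULL
    );
""")


def columns(table):
    """Return (column name, is part of primary key) pairs for a table."""
    info = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], bool(row[5])) for row in info]


mti_cols = columns("mti_person")
eoto_cols = columns("eoto_person")
print(mti_cols)
print(eoto_cols)
```

Either way, reading a person's name still means touching the party table; the two layouts differ only in which column carries the primary key.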

Question(s)

I don't think the behavior from the example (especially the single direct relation to the parent) can be achieved with abstract inheritance, can it? If it can, then how would you achieve that?

Is an explicit one-to-one relation really that much better than multi-table-inheritance, except for the fact that it forces us to make our queries more explicit? To me, the convenience and clarity of the multi-table approach outweigh the explicitness argument.

Note that this SO question is very similar, but does not quite answer my questions. Moreover, the latest answer there is almost nine years old now, and Django has changed a lot since.

[1]: Hay 1996, Data Model Patterns

Solution

While awaiting a better one, here's my attempt at an answer.

As suggested by Kevin Christopher Henry in the comments above, it makes sense to approach the problem from the database side. As my experience with database design is limited, I have to rely on others for this part.

Please correct me if I'm wrong at any point.

Data-model vs (Object-Oriented) Application vs (Relational) Database

A lot can be said about the object/relational mismatch, or, more accurately, the data-model/object/relational mismatch.

In the present context I guess it is important to note that a direct translation between data-model, object-oriented implementation (Django), and relational database implementation, is not always possible or even desirable. A nice three-way Venn-diagram could probably illustrate this.

Data-model level

To me, a data-model as illustrated in the original post represents an attempt to capture the essence of a real world information system. It should be sufficiently detailed and flexible to enable us to reach our goal. It does not prescribe implementation details, but may limit our options nonetheless.

In this case, the inheritance poses a challenge mostly on the database implementation level.

Relational database level

Several SO answers deal with database implementations of (single) inheritance.

These all more or less follow the patterns described in Martin Fowler's book Patterns of Enterprise Application Architecture. Until a better answer comes along, I am inclined to trust these views. The inheritance section in chapter 3 (2011 edition) sums it up nicely:

For any inheritance structure there are basically three options. You can have one table for all the classes in the hierarchy: Single Table Inheritance (278) ...; one table for each concrete class: Concrete Table Inheritance (293) ...; or one table per class in the hierarchy: Class Table Inheritance (285) ...

and

The trade-offs are all between duplication of data structure and speed of access. ... There's no clearcut winner here. ... My first choice tends to be Single Table Inheritance ...

A summary of patterns from the book is found on martinfowler.com.
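
To see what the three options mean for the Party example, the sketch below writes each one out as a table layout and checks which layouts contain a party table at all, since that is what an Address foreign key would need to point at. The schemas and the table_names helper are illustrative assumptions, not taken from the book or from Django.

```python
import sqlite3

LAYOUTS = {
    # One table for the whole hierarchy.
    "single_table": """
        CREATE TABLE party (
            id INTEGER PRIMARY KEY, name TEXT,
            "type" TEXT, favorite_color TEXT
        );
    """,
    # One table per concrete class; parent fields are repeated.
    "concrete_table": """
        CREATE TABLE organization (id INTEGER PRIMARY KEY, name TEXT, "type" TEXT);
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, favorite_color TEXT);
    """,
    # One table per class in the hierarchy, linked by key.
    "class_table": """
        CREATE TABLE party (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE organization (
            party_id INTEGER PRIMARY KEY REFERENCES party(id), "type" TEXT
        );
        CREATE TABLE person (
            party_id INTEGER PRIMARY KEY REFERENCES party(id), favorite_color TEXT
        );
    """,
}


def table_names(schema):
    """Create the schema in a scratch database and list its tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


tables = {label: table_names(schema) for label, schema in LAYOUTS.items()}
# Only layouts with a party table give Address a single thing to point at.
has_party_table = {label: "party" in names for label, names in tables.items()}
print(tables)
print(has_party_table)
```

For this particular data model, Concrete Table Inheritance is the only option without a party table, so it is the one that cannot support a single foreign key from Address.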

Application level

Django's object-relational mapping (ORM) API allows us to implement these three approaches, although the mapping is not strictly one-to-one.

The Django Model inheritance docs distinguish three "styles of inheritance", based on the type of model class used (concrete, abstract, proxy):

  1. abstract parent with concrete children (abstract base classes): The parent class has no database table. Instead each child class has its own database table with its own fields and duplicates of the parent fields. This sounds a lot like Concrete Table Inheritance in the database.

  2. concrete parent with concrete children (multi-table inheritance): The parent class has a database table with its own fields, and each child class has its own table with its own fields and a foreign-key (as primary-key) to the parent table. This looks like Class Table Inheritance in the database.

  3. concrete parent with proxy children (proxy models): The parent class has a database table, but the children do not. Instead, the child classes interact directly with the parent table. Now, if we add all the fields from the children (as defined in our data-model) to the parent class, this could be interpreted as an implementation of Single Table Inheritance. The proxy models provide a convenient way of dealing with the application side of the single large database table.

Conclusion

It seems to me that, for the present example, the combination of Single Table Inheritance with Django's proxy models may be a good solution that does not have the disadvantages of "hidden" joins.

Applied to the example from the original post, it would look something like this:

from django.db import models


class Party(models.Model):
    """ All the fields from the hierarchy are on this class """
    name = models.CharField(max_length=20)
    type = models.CharField(max_length=20)
    favorite_color = models.CharField(max_length=20)


class Organization(Party):
    class Meta:
        """ A proxy has no database table (it uses the parent's table) """
        proxy = True

    def __str__(self):
        """ We can do subclass-specific stuff on the proxies """
        return '{} is a {}'.format(self.name, self.type)


class Person(Party):
    class Meta:
        proxy = True

    def __str__(self):
        return '{} likes {}'.format(self.name, self.favorite_color)


class Address(models.Model):
    """ 
    As required, we can link to Party, but we can set the field using
    either party=person_instance, party=organization_instance, 
    or party=party_instance
    """
    party = models.ForeignKey(to=Party, on_delete=models.CASCADE)

One caveat, from the Django proxy-model documentation:

There is no way to have Django return, say, a MyPerson object whenever you query for Person objects. A queryset for Person objects will return those types of objects.

A potential workaround is presented here: https://stackoverflow.com/a/60894618
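
The caveat comes down to deciding, per row, which subclass should represent it. Purely as an illustration outside of Django, that decision can be sketched with a discriminator column on the single table. Everything specific here is hypothetical: the kind column, the plain Python classes, and the load_parties helper are not part of the original example or of Django's API, and this is not necessarily how the linked workaround does it.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Party:
    name: str
    type: str
    favorite_color: str


class Organization(Party):
    def __str__(self):
        return "{} is a {}".format(self.name, self.type)


class Person(Party):
    def __str__(self):
        return "{} likes {}".format(self.name, self.favorite_color)


# Hypothetical discriminator values mapped to the class that handles them.
KIND_TO_CLASS = {"organization": Organization, "person": Person}


def load_parties(conn):
    """Build one object per row, picking the class from the kind column."""
    rows = conn.execute(
        'SELECT kind, name, "type", favorite_color FROM party ORDER BY id'
    )
    return [
        KIND_TO_CLASS.get(kind, Party)(name, type_, color)
        for kind, name, type_, color in rows
    ]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE party (
        id             INTEGER PRIMARY KEY,
        kind           TEXT NOT NULL,
        name           TEXT NOT NULL,
        "type"         TEXT NOT NULL DEFAULT '',
        favorite_color TEXT NOT NULL DEFAULT ''
    );
    INSERT INTO party (kind, name, "type") VALUES ('organization', 'Acme', 'company');
    INSERT INTO party (kind, name, favorite_color) VALUES ('person', 'Alice', 'green');
""")

parties = load_parties(conn)
print([str(p) for p in parties])
```

The sketch also shows a cost of the single-table layout: columns that belong to only one subclass, such as favorite_color, must tolerate being empty for rows of the other kind.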
