Should I divide a table by OneToOneField if the number of columns is too many?


Problem description

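I have a Student model that already has too many fields, including the student's name, nationality, address, language, travel history, etc. It looks like this:

from django.db.models import CASCADE, BooleanField, Model, OneToOneField

class Student(Model):
    # CustomUser is the project's own user model, defined elsewhere
    user = OneToOneField(CustomUser, on_delete=CASCADE)
    # Too many other fields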

A student has much more information, which I store in other tables that have a OneToOne relationship with the Student model, such as:

class StudentIelts(Model):
    student = OneToOneField(Student, on_delete=CASCADE)
    has_ielts = BooleanField(default=False)
    # 8 other fields for IELTS, including the scores and the date,
    # and a file field for uploading the IELTS result

# I have other models for TOEFL, GMAT, GRE, etc. that
# are related to the Student model in the same manner, through
# a OneToOne relationship, such as:

class StudentIBT(Model):
    student = OneToOneField(Student, on_delete=CASCADE)
    has_ibt = BooleanField(default=False)
    # other fields

Should I merge these tables into one table, or is the current database schema fine?

I chose this schema because I was not comfortable working with a table that has too many columns. Note that every student has exactly one row in the IELTS table and in each of the other test tables, so the Student table and the IELTS table, for example, have the same number of rows.

Solution

This is a hard question to answer, with a lot of different opinions, but I would say you are correct in splitting up your relationship into two separate models.

However, there are several considerations to take into account.

From a pure database design perspective, there is hardly any reason to split up your tables. Whenever a one-to-one relationship is always present (every row on one side has a row on the other), you should merge it into one table. The number of columns hardly matters, unless you are optimising your database.

An answer to a related question sums up the actual physical reasons to split up a 1-to-1 relationship quite nicely:
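To make the "merge it into one table" point concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Django. The table and column names (student_ielts, overall_score, student_merged and so on) are assumptions made up for illustration and are not taken from the question. It shows that when every student has exactly one IELTS row, the split layout plus a JOIN yields the same rows as a single merged table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Split design: one row per student in each table (hypothetical columns).
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE student_ielts (
        student_id INTEGER PRIMARY KEY REFERENCES student(id),
        has_ielts INTEGER NOT NULL DEFAULT 0,
        overall_score REAL
    );
    -- Merged design: the same columns in a single table.
    CREATE TABLE student_merged (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        has_ielts INTEGER NOT NULL DEFAULT 0,
        ielts_overall_score REAL
    );
""")

# Illustrative data only: every student gets exactly one IELTS row.
rows = [(1, "Ada", 1, 7.5), (2, "Lin", 0, None)]
for sid, name, has_ielts, score in rows:
    conn.execute("INSERT INTO student VALUES (?, ?)", (sid, name))
    conn.execute("INSERT INTO student_ielts VALUES (?, ?, ?)", (sid, has_ielts, score))
    conn.execute("INSERT INTO student_merged VALUES (?, ?, ?, ?)", (sid, name, has_ielts, score))

# Reading the split design requires a JOIN on the shared key.
split_result = conn.execute("""
    SELECT s.id, s.name, i.has_ielts, i.overall_score
    FROM student AS s
    JOIN student_ielts AS i ON i.student_id = s.id
    ORDER BY s.id
""").fetchall()

# Reading the merged design is a plain single-table SELECT.
merged_result = conn.execute(
    "SELECT id, name, has_ielts, ielts_overall_score FROM student_merged ORDER BY id"
).fetchall()

print(split_result == merged_result)
```

Both queries carry the same information; the split layout only adds the JOIN, which is why the split needs a reason beyond column count.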

  • You might want to cluster or partition the two "endpoint" tables of a 1:1 relationship differently.
  • If your DBMS allows it, you might want to put them on different physical disks (e.g. more performance-critical on an SSD and the other on a cheap HDD).
  • You have measured the effect on caching and you want to make sure the "hot" columns are kept in cache, without "cold" columns "polluting" it.
  • You need a concurrency behavior (such as locking) that is "narrower" than the whole row. This is highly DBMS-specific.
  • You need different security on different columns, but your DBMS does not support column-level permissions.
  • Triggers are typically table-specific. While you can theoretically have just one table and have the trigger ignore the "wrong half" of the row, some databases may impose additional limits on what a trigger can and cannot do. For example, Oracle doesn't let you modify the so-called "mutating" table from a row-level trigger. With separate tables, only one of them may be mutating, so you can still modify the other from your trigger (but there are other ways to work around that).

Databases are very good at manipulating the data, so I wouldn't split the table just for the update performance, unless you have performed the actual benchmarks on representative amounts of data and concluded the performance difference is actually there and significant enough (e.g. to offset the increased need for JOINing).
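As a rough illustration of what "performing the actual benchmarks" could look like, the sketch below times a JOIN over a split layout against a plain SELECT over a merged layout. It uses Python's sqlite3 and timeit modules; the schema, the helper names build_database and best_time, and the row count are all assumptions chosen for illustration. A real benchmark should run against your production DBMS with representative data volumes and queries.

```python
import sqlite3
import timeit


def build_database(n_students):
    """Create an in-memory database holding both layouts with identical data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE student_ielts (
            student_id INTEGER PRIMARY KEY REFERENCES student(id),
            overall_score REAL
        );
        CREATE TABLE student_merged (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            ielts_overall_score REAL
        );
    """)
    for sid in range(1, n_students + 1):
        # Synthetic values; real benchmarks need representative data.
        name, score = f"student-{sid}", 5.0 + (sid % 8) * 0.5
        conn.execute("INSERT INTO student VALUES (?, ?)", (sid, name))
        conn.execute("INSERT INTO student_ielts VALUES (?, ?)", (sid, score))
        conn.execute("INSERT INTO student_merged VALUES (?, ?, ?)", (sid, name, score))
    conn.commit()
    return conn


def best_time(conn, sql, repeats=5):
    """Return the fastest of `repeats` runs of the query, in seconds."""
    return min(timeit.repeat(lambda: conn.execute(sql).fetchall(), number=1, repeat=repeats))


SPLIT_SQL = """
    SELECT s.id, s.name, i.overall_score
    FROM student AS s JOIN student_ielts AS i ON i.student_id = s.id
"""
MERGED_SQL = "SELECT id, name, ielts_overall_score FROM student_merged"

conn = build_database(n_students=5000)
split_seconds = best_time(conn, SPLIT_SQL)
merged_seconds = best_time(conn, MERGED_SQL)
print(f"split (JOIN): {split_seconds:.6f}s, merged: {merged_seconds:.6f}s")
```

The numbers this prints depend entirely on the machine and the engine, so treat them as a template for your own measurement, not as evidence for either layout.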

Django's standpoint

If you look at the way Django is designed, there is some merit to splitting up your table into one-to-one relationships. One of Django's design philosophies is "loose coupling", which in Django's ecosystem means that separate applications shouldn't have to know about each other to function properly. In your case, it could be argued that a Student model shouldn't have to know anything about its IELTS tests, because if you separate the two, the Student model could be reused in some other application. Likewise, functionality that does some kind of analysis over IELTS tests shouldn't have to "know" anything about the student who took the test.
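The loose-coupling argument can be sketched in plain Python, outside Django. The IeltsResult class, its fields, and the average_score function below are hypothetical names invented for this illustration. The point is only that the analysis code receives test results carrying an opaque student_id and never touches any other student data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class IeltsResult:
    """A test result that refers to its student only through an opaque id."""
    student_id: int
    listening: float
    reading: float
    writing: float
    speaking: float

    @property
    def mean_score(self) -> float:
        # Plain average of the four sections (a simplification for this sketch).
        return mean([self.listening, self.reading, self.writing, self.speaking])


def average_score(results):
    """Analyse IELTS results without knowing anything about the students."""
    return mean(result.mean_score for result in results)


results = [
    IeltsResult(student_id=1, listening=6.0, reading=7.0, writing=6.5, speaking=6.5),
    IeltsResult(student_id=2, listening=8.0, reading=7.0, writing=7.0, speaking=8.0),
]
print(average_score(results))
```

Because average_score depends only on IeltsResult, it could live in a separate application and keep working even if the student model changes or is replaced.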

Do use this design pattern with some caution, though. The question to ask yourself is not necessarily "How many columns do I have in my model?", because sometimes there is a good reason to keep a lot of data in one model, so a large column count alone does not merit splitting up your tables. A better question is "Do I want to separate the responsibilities/functionality of these two types of data?", which could be for any reason, such as reusability or security.
