SQL:在保留约束的情况下规范化数据库 [英] SQL: Normalization of database while retaining constraints

查看:253
本文介绍了SQL:在保留约束的情况下规范化数据库的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我有以下表格:

  ____________________ ____________________ 
|生物| |物种|
| -------------------- | | -------------------- |
| OrganismId(int,PK)| | SpeciesId(int,PK)|
| SpeciesId(int,FK)|∞--------- 1 | Name(varchar)|
|名称(varchar)| | ____________________ |
| ____________________ | 1
1 |
| |
| |
∞∞
______________________ ____________________ _______________
| OrganismPropsValues | | SpeciesProps | |道具|
| ---------------------- | | -------------------- | | --------------- |
| OrganismId(int,FK)| | PropId(int,PK,FK)| ∞----- 1 | PropId(int,PK)|
| PropId(int,FK)| | SpeciesId(int,PK,FK)| |名称(varchar)|
| Value(varchar)| | ____________________ | | _______________ |
| ______________________ | 1
∞|
| |
---------------------------------------------- -------------

快速解释我正在尝试的在这里表示:假设我们有一个种类的列表,如猫,狗,人类等。我们还有一套属性(缩写为Props,所以我可以更容易地适应于图),适用于一些但不一定所有物种 - 例如,这可能是尾部长度(对于具有尾巴的物种),眼睛颜色(对于眼睛的人)等。



SpeciesProps是链接器表这定义了哪些属性适用于哪个物种 - 所以在这里我们将有{人,眼睛颜色},{狗,眼睛颜色},{猫,眼睛颜色},{狗,尾巴长度},{猫,尾巴长度}。我们没有{人类,尾巴长度},因为尾部长度显然不是适用于人的有效属性。



生物体表包含实际的实现物种 - 所以在这里我们可能有{人类,鲍勃},{狗,Rufus}和{猫,菲利克斯}。



现在是我的问题:在OrganismPropsValues表,我想存储每个生物体的属性的值 - 例如,对于Bob,我想存储{Bob,Eye Color,Blue}。对于Rufus,我想要存储{Rufus,Eye Color,Brown}和{Rufus,Tail Length,20}(Felix类似的)。然而,我的问题是,在我详细的模式中,尽管{人,尾长度}元组在SpeciesProps中不存在,但是完全可以存储{Bob,Tail Length,10}。如何修改这个模式,以便我可以在OrganismPropsValues中强制定义在SpeciesProps中的约束,同时保持适当的规范化。

解决方案

重新实施实体属性值反模式。这不是一个规范的数据库设计,因为它不是关系。



我建议的是类表继承设计模式:




  • 为生物体创建一个表,包含所有物种共有的属性。

  • 每个物种创建一个表,其中包含该物种特有的属性。这些表中的每一个与生物体具有1对1的关系,但每个属性都属于其自己的列。

      ____________________ ____________________ 
    |生物| |物种|
    | -------------------- | | -------------------- |
    | OrganismId(int,PK)| | SpeciesId(int,PK)|
    | SpeciesId(int,FK)|∞--------- 1 | Name(varchar)|
    |名称(varchar)| | ____________________ |
    | ____________________ |
    1
    |
    |
    1
    ______________________
    |人类组织|
    | ---------------------- |
    | OrganismId(int,FK)|
    |性(枚举)|
    | Race(int,FK)|
    | EyeColor(int,FK)|
    | .... |
    | ______________________ |




这的确意味着你会创建许多表,但是将其视为一种折中方法,具有以关系正确的方式存储属性的许多实际优点:




  • 可以适当地使用SQL数据类型,

  • 您可以使用约束或查找表来按照一组预定义的值限制某些属性。

  • 您可以使属性强制(即NOT NULL)或使用其他约束。

  • 数据和索引的存储效率更高。

  • 查询更容易为了编写RDBMS以便更容易地执行。



有关此设计的更多信息,请参阅Martin Fowler的书企业应用程序架构模式或我的演示文稿 SQL中的面向对象的实用模型,或者我的书, SQL反模式:避免数据库编程的陷阱


Suppose I have the following tables:

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |OrganismId (int, PK)|           |SpeciesId (int, PK) |
    |SpeciesId (int, FK) |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |____________________|                      1
              1                                 |
              |                                 |
              |                                 |
              ∞                                 ∞
    ______________________        ____________________          _______________
   | OrganismPropsValues  |      |   SpeciesProps     |        |     Props     |
   |----------------------|      |--------------------|        |---------------|
   |OrganismId (int, FK)  |      |PropId (int,PK,FK)  | ∞-----1|PropId (int,PK)|
   |PropId (int, FK)      |      |SpeciesId(int,PK,FK)|        |Name (varchar) |
   |Value (varchar)       |      |____________________|        |_______________|
   |______________________|                                             1
              ∞                                                         |
              |                                                         |
              -----------------------------------------------------------

A quick explanation of what I am trying to represent here: suppose we have a list of species, such as cat, dog, human, etc. We also have a set of properties (abbreviated Props so I could fit it more easily in the diagram) which apply to some but not necessarily all species--for example, this may be tail length (for species with tails), eye color (for those with eyes), etc.

SpeciesProps is a linker table that defines which properties apply to which species-- so here we would have {Human, Eye Color}, {Dog, Eye Color}, {Cat, Eye Color}, {Dog, Tail Length}, {Cat, Tail Length}. We do not have {Human, Tail Length} because Tail Length is obviously not a valid property to apply to a human.

The Organisms table holds actual "implementations" of the species-- So here we might have {Human, Bob}, {Dog, Rufus}, and {Cat, Felix}.

Here is now my issue: in the OrganismPropsValues table, I want to store the 'values' of the properties for each organism--so for example, for Bob I want to store {Bob, Eye Color, Blue}. For Rufus, I would want to store {Rufus, Eye Color, Brown} and {Rufus, Tail Length, 20} (similar for Felix). My problem however, is that in the schema that I have detailed, it is perfectly possible to store {Bob, Tail Length, 10}, even though the {Human, Tail Length} tuple does not exist in SpeciesProps. How can I modify this schema so I can enforce the constraints defined in SpeciesProps in OrganismPropsValues, while maintaining adequate normalization?

解决方案

You're implementing the Entity-Attribute-Value antipattern. This can't be a normalized database design, because it's not relational.

What I would suggest instead is the Class Table Inheritance design pattern:

  • Create one table for Organisms, containing properties common to all species.
  • Create one table per species, containing properties specific to that species. Each of these tables has a 1-to-1 relationship with Organisms, but each property belongs in its own column.

     ____________________             ____________________
    |     Organisms      |           |       Species      |
    |--------------------|           |--------------------|
    |OrganismId (int, PK)|           |SpeciesId (int, PK) |
    |SpeciesId (int, FK) |∞---------1|Name (varchar)      |
    |Name (varchar)      |           |____________________|
    |____________________|
              1
              |
              |
              1
     ______________________ 
    |    HumanOrganism     |
    |----------------------|
    |OrganismId (int, FK)  |
    |Sex      (enum)       |
    |Race     (int, FK)    |
    |EyeColor (int, FK)    |
    |....                  |
    |______________________|
    

This does mean you will create many tables, but consider this as a tradeoff with the many practical benefits to storing properties in a relationally correct way:

  • You can use SQL data types appropriately, instead of treating everything a free-form varchar.
  • You can use constraints or lookup tables to restrict certain properties by a predefined set of values.
  • You can make properties mandatory (i.e. NOT NULL) or use other constraints.
  • Data and indexes are stored more efficiently.
  • Queries are easier for you to write and easier for the RDBMS to execute.

For more on this design, see Martin Fowler's book Patterns of Enterprise Application Architecture, or my presentation Practical Object-Oriented Models in SQL, or my book, SQL Antipatterns: Avoiding the Pitfalls of Database Programming.

这篇关于SQL:在保留约束的情况下规范化数据库的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆