数据库模式相关的问题 [英] Database schema-related problem

查看:89
本文介绍了数据库模式相关的问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个关于数据库的理论问题。为了让它更具体,我已经考虑了一个例子。



假设我有一个商店与产品。我有很多不同的产品。并非所有产品都具有相同的适用性能。例如,我可以以千兆字节定义硬盘的大小,但不能在CPU上使用相同的属性,因为它不适用。我想要的是一个数据库,我可以动态地向产品添加属性。我唯一可以想出的是:



一个产品表,一个ID,一个名称和一个描述。



具有ID,Product_ID,属性和值的一个属性表。



这样我可能会得到一个巨大的,我认为不是效率,属性表。这已经很麻烦了我很久了。有没有人知道我的问题更好的解决方案?

解决方案

这正是向第六正常表格移动,只是那些没有学术或经验背景的人不知道(a)它的名称和(b)规则和注意事项。这些人已经实现了通常被认为是实体属性值或EAV。如果做得好,就可以了,在这样的表格里面有数以千计的医疗系统带有诊断和剂量信息。如果不是,那就是一只狗的早餐使用和维护。


  1. 首先确保你有产品的真实和完整的5NF。


  2. 始终使用完整的声明参照完整性; CHECK 约束和规则


  3. 从不将所有内容放在一个表中,其中包含$ code> VARCHAR()。始终使用正确的(适用的)数据类型。这意味着你将有几个表,每个DataType有一个表,并且没有失去控制或完整性。


  4. 同样,任何关联表对另一个表[例如供应商]的多次引用必须是分开的。




    • 我提供了一个完整的控制讨论的数据模型;它包括一个简单的目录,可用于验证以及导航。您需要添加每个 CHECK 约束和 RULE ,以确保数据和引用完整性不会丢失。这意味着,例如:

      • for CPUSpeed 列,它存储在 ProductDecimal CHECK 它的值在适当的值范围内

      • 为每个子 - 产品 CHECK DataType对于 ProductType-ColumnNo 组合是正确的


    • 这种结构比大多数EAV都好,而不是完整的6NF。



  5. 将所有必填栏保留在 Product 中;仅使用子产品表作为可选列。


  6. 对于每个这样的(例如 Product )表,您需要创建一个View(虚线),它将从EAV / 6NF表中构造5NF行。您可能有几个视图: Product_CPU Product_Disk


  7. 不要通过视图进行更新。将所有更新事务保存在存储过程中,并插入或更新每个列(即产品子产品表格,适用于每个特定的 ProductType )。


  8. 巨大?商业数据库(而不是免费软件)对于大型表或连接没有任何问题。这实际上是一个非常有效的结构,并允许非常快速的搜索,因为这些表实际上是面向列的(不是面向行的)。如果人口是巨大的,那么它是巨大的,做你自己的算术。


  9. 你需要一个表,一个查找表属性(或属性)。这是目录的一部分,基于 ProductType


更好的解决方案是要获得正式的第六正式表格。如果您只有一个或几个需要可选列的表,则不需要。



要清楚:




  • 第六种正常表格是该行由主键和最多一个属性组成。


  • 这是6NF(至少为产品表集群) ,然后通过DataType再次归一化(不是正常格式),以减少表的数量(否则每个属性将有一个表格)。


  • 这保留了完整的Rdb控制(FK,约束等);而普通的EAV类型并不会影响DRI和控制。


  • 这也是目录的基础。




链接到产品群集数据模型



链接到IDEF1X符号 。 >

更新



您可能对此 ▶5NF 6NF讨论◀ 。我会在某些时候写出来。


I have a kind of theoretical question about databases. To make it more concrete I have thought up an example case.

Suppose I have a store with products. I have lots of different products. Not every product has the same applicable properties. For instance, I could define the size of a harddisk in gigabytes, but cannot use that same property on a CPU, simply because it does not apply. What I want is a database where I can add properties to products dynamically. The only thing I can come up with is the following:

One product table with an ID, a Name and a Description.

One properties table with an ID, Product_ID, Property and a Value.

This way I would potentially get a gigantic, I-suppose-not-so-efficient, table of properties. This has been bugging me for a long time now. Does anyone know a better solution to my problem?

解决方案

This is actually moving towards Sixth Normal Form, it is just that people like you who do not have the academic or experiential background do not know the (a) name for it and (b) the rules and the caveats. Such people have implemented what is commonly know as Entity-Attribute-Value or EAV. If it is done properly, it is fine, and there are many thousands of medical system out there carrying diagnostic and dosage info in such tables. If it is not, then it is one dog's breakfast to use and maintain.

  1. First make sure you have Product in true and full 5NF.

  2. Always use full Declarative Referential Integrity; CHECK constraints and RULES.

  3. Never put all that into one table with a VARCHAR() for Value. Always use the correct (applicable) DataTypes. That means you will have several tables, one each per DataType, and there is no loss of control or integrity.

  4. Likewise any Associative tables (where there is a multiple reference to another table [eg. Vendor] ) must be separate.

    • I am providing a Data Model which has the full control discussed; it includes a simple catalogue which can be used for verification as well as navigation. You need to add every CHECK Constraint and RULE to ensure that the data and referential Integrity is not lost. That means, eg:
      • for the CPUSpeed column, which is stored in ProductDecimal, CHECK that it is in the proper range of values
      • for each sub-Product table CHECK that the DataType is correct for the ProductType-ColumnNo combination
    • This structure is way better than most EAV, and not quite the full 6NF.
      .
  5. Keep all the mandatory columns in Product; use the sub-Product tables for optional columns only.

  6. For each such (eg Product) table, you need to create a View (dotted line), which will construct the 5NF rows from the EAV/6NF tables. You may have several Views: Product_CPU, Product_Disk.

  7. Do not update via the View. Keep all your updates transactional, in a stored proc, and insert or update each of the columns (ie. the Product and sub-Product tables which are applicable, for each particular ProductType) together.

  8. Gigantic ? Commercial databases (not the freeware) have no problems with large tables or joins. This is actually a very efficient structure, and allows very fast searches, because the tables are in fact column-oriented (not row-oriented). If the population is gigantic, then it is gigantic, do your own arithmetic.

  9. You need one more table, a Lookup table for Property (or Attribute). That is part of the catalogue, and based on ProductType

The better solution is to go for full, formal Sixth Normal Form. Not necessary if you have only one or a few tables that require optional columns.

To be clear:

  • Sixth Normal Form is The Row consists of the Primary Key and, at most, one Attribute.

  • This is 6NF (for at least the Product table cluster), then Normalised again (Not in the Normal Form sense) by DataType, to reduce the no of tables (otherwise you would have one table per Attribute).

  • This retains full Rdb control (FKs, constraints, etc); whereas the common EAV types don't bother with the DRI and control.

  • This also has the rudiments of a catalogue.

Link to Product Cluster Data Model

Link to IDEF1X Notation for those who are unfamiliar with the Relational Modelling Standard.

Update

You might be interested in this ▶5NF 6NF Discussion◀. I will write it up at some point.

这篇关于数据库模式相关的问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆