数据库设计:第一正常表格和重复组 [英] DB Design: 1st Normal Form and Repeating Groups

查看:220
本文介绍了数据库设计:第一正常表格和重复组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

要遵守第一正常形式,您必须避免的其中一件事是重复组。如下:

  CustID名称地址Phone1 Phone2 Phone3 

102 Jerry 234 East .. 555 -2342 555-9854 555-2986

您应该创建一个第二个电话号码表,然后在一个联接你会得到:

  CustID名称地址电话

102 Jerry 234 East .. 555-2342
102 Jerry 234 East .. 555-9854
102 Jerry 234 East .. 555-2986

有时,这有点更加含糊,很难说一组列头是否符合条件。例如,假设你现在有两个测试,每个硬件上运行。而您的第一个DB设计产生了最为横向的方法:



设计1

  SN Test1_Max Test1_Min Test1_Mean Test2_Max Test2_Min Test2_Mean 
2093 23 2 15 54 -24 45

显然,这是一个重复的组,可以更容易地被表示为(在零件和测试之间的连接)中:



设计2

  SN测试最大最小平均值
2093 1 23 2 15
2093 2 54 -24 45

然而,你可以走得更垂直:



设计3

  SN测试统计值
2093 1最高23
2093 1最小2
2093 1平均15
2093 2最大54
2093 2 Min -24
2093 2平均45

设计3是否必要?你如何决定如何垂直呢?设计2和3之间的优缺点是什么?看起来两者都可以使用SQL来选择或者很容易地连接起来,并且给予了设计3的优势,因为你可以轻松添加一个新的统计数据,而不需要修改表格结构。



但是在任何人都说,越垂直越好,有时候它更加模糊。喜欢:



设计4

  SN平均电流(mA)电池容量(mA)
2093 200 540

可以改为: / p>

设计5

  SN mA_Measuremnt Value 
2093 AverageCurrent 200
2093 BatteryCapacity 540

虽然这两个属性是一样的域(mA),它们代表与组件有很大不同的事情。在这种情况下,设计4是否更好,因为它不是严格的重复组?我想我正在寻找的是一些标准,知道什么时候将它分解成更多的表,从而使它更垂直。为了总结这个荒谬的长时间的问题,你应该删除并规范化重复的组,如果它们是相同的域,而且有完全相同的含义?。如果是这样的话,那么真正只有电话的例子,可能是设计1中的两个测试才符合这个标准。虽然设计3和5似乎可能具有设计优势,尽管设计3的统计数字严格来说具有不同的含义,而平均电流和电池容量在设计5中绝对具有不同的含义。

解决方案

设计2和设计4是最好的方法,只要结果不总是存在(也称为Desigin 1中的空值)。如果他们总是被拿走,那么第一个设计是好的。



我相信在SQL中重复的​​组实际上是如果你有一个填充了add'l值的列, Phone_Number包含123-444-4444,123-333-3334等。



无论如何,后来的设计是次优的 - 你继续把它放到最后一级并拥有一个真正的查找表 http://www.dbazine.com/ofinterest/ oi-articles / celko22 或实体属性值 http://tonyandrews.blogspot.com/2004/10/otlt-and-eav-two-big-design-mistakes.html



http:// asktom。 oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056



无论哪种方式,几乎总是坏事。尽管它们可能共享一个通用的数据类型/域,但意义不同,因此它们应该保留个别属性(maxtemp,mintemp等)。


To adhere to 1st normal form, one of the things you must avoid is repeating groups. As in instead of:

    CustID  Name  Address       Phone1      Phone2       Phone3

     102    Jerry  234 East..   555-2342   555-9854     555-2986

You should create a second Telephone Number table and then on a join you would get:

CustID  Name     Address       Phone

102 Jerry    234 East..   555-2342
102 Jerry    234 East..   555-9854
102 Jerry    234 East..   555-2986

Sometimes, it's a little more ambiguous and it's hard to tell when a group of column headers qualify. For instance, let's say you have, at the moment, two tests you run on every piece of hardware. And your first DB design yields the most horizontal approach:

Design 1

SN     Test1_Max   Test1_Min    Test1_Mean  Test2_Max   Test2_Min    Test2_Mean
2093      23          2            15         54          -24           45  

Obviously, this is a repeating group, that could much more easily be represented as (on a join between "Parts" and "Tests"):

Design 2

SN     Test      Max    Min    Mean     
2093    1        23     2      15       
2093    2        54     -24     45      

However, you could go even more vertical:

Design 3

SN     Test    Statistic    Value
2093    1        Max          23
2093    1        Min          2
2093    1        Mean         15       
2093    2        Max          54
2093    2        Min         -24
2093    2        Mean         45  

Is Design 3 necessary? How do you decide how vertical to make it? What are the pros and cons between Design 2 and 3? It seems that both could be selected or joined easily with SQL, with the advantage given to Design 3 because you could easily add a new Statistic without actually modifying the table structure.

But before anyone goes and says that the more vertical the better, there are times where it's more ambiguous. Like:

Design 4

SN      AverageCurrent (mA)    BatteryCapacity (mA)  
2093          200                    540  

Could instead be:

Design 5

SN      mA_Measuremnt       Value
2093    AverageCurrent      200 
2093    BatteryCapacity     540 

While both attributes are of the same domain (mA), they represent very different things in regards to the component. In this case, is Design 4 better since it's not strictly a repeating group? I guess what I'm looking for is some criteria to knowing when to break it down into more tables and thus make it more vertical.

To sum up this ridiculously long question, should you only remove and normalize repeating groups if they are exacly the same domain and have the exact same meaning?. If that is the case, then really only the telephone example and probably the two tests in Design 1 meet this criteria. Though it seems like there might be design benefits to Design 3 and 5, even though the statistics of Design 3 have different meanings strictly speaking, and AverageCurrent and BatteryCapacity definitely have different meanings in Design 5.

解决方案

Design 2 and Design 4 are the best ways to go provided the results will not always be present (aka NULLs in Desigin 1). If they always are taken, then the first design is fine.

I believe repeating groups in SQL would actually be if you have a column stuffed with add'l values e.g. Phone_Number contains "123-444-4444,123-333-3334" etc.

Anyway, the later designs are suboptimal -- you continue to take that to the final level and have the "One True Lookup Table" http://www.dbazine.com/ofinterest/oi-articles/celko22 or Entity Attribute Value http://tonyandrews.blogspot.com/2004/10/otlt-and-eav-two-big-design-mistakes.html

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056

Either way, it's almost always a bad thing. Although they may share a common datatype/domain, the meaning differs -- thus they should remain individual attributes (maxtemp, mintemp, etc.)

这篇关于数据库设计:第一正常表格和重复组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆