在规范化过程中创建非规范化表的现有做法/准则? [英] Exist practices/guidelines for creation of non normalized tables during the normalization process?

查看:122
本文介绍了在规范化过程中创建非规范化表的现有做法/准则?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我开始自学数据库的基础知识,目前我正在研究1.到3.普通形式.到目前为止,我所了解的是希望删除冗余以使我的数据库在数据更改阶段不太容易出现不一致的情况,并希望通过消除尽可能多的重复项来节省空间.

例如,如果我们有一个包含以下列的表格:

  1. CD_ID
  2. 标题
  3. 艺术家

,然后将设计更改为具有多个表,其中第一个(CD)包含:

  1. CD_ID
  2. 标题
  3. artist_ID

第二个(艺术家)包含:

  1. artist_ID
  2. 艺术家

我看到在原始表中,年份通过艺术家临时取决于ID.因此,我们想摆脱这一点,并为艺术家创建一张桌子,使我们的新CD桌子现在处于第三普通形式.

但是为此,我创建了另一个表(艺术家表),据我所知,该表又不是第三范式,因为我们具有与之前在另一个表中相同的传递依赖类型.

这是正确的吗?如果是的话,我还应该将artist表归一化为3 NF吗?我什么时候停止?

解决方案

TL; DR 您需要遵循已发布的算法来分解为给定的标准格式.

PS因为引入了新的专栏,所以您没有通过归一化从原始CD中获得Artist.但是假设表Artist具有明显的意义.为什么您认为它据我所知仍然不是第三范式"?如果原始CD中的artist-> year,则在Artist中也是如此.但是随后{artist}与{artist_id}一起是Artist的CK(候选密钥),而Artist位于3NF(和5NF)中.


从问题的原始版本到当前版本,您将获得一个建议的基本表CD,其中的列cd_id,title,group&年,持有元组,其中由 year 年形成的组 group 制作了名为 title 的cd cd_id .列cd_id是唯一的,因此是CK. FD {group}->年也成立.

规范化不会引入新的列名.它用其他人替换了一个提议的基表,每个基表都有其列的较小子集,这些基表始终连接到其原本的值.到BCNF的归一化基于FD(功能依赖性),FD也决定了基表的CK.因此,您的问题不包含分解.可能使您联想到您的问题的分解(可能具有或不具有任何特定的属性)将是具有列集{cd_id,title,group}和{group,year}的表.

其他FD保留在原件中.有些人因为列是什么而持有;由于CK而有些搁置;因为{group}-> year成立,所以有些持有;通常,某些人持有,因为这三个人都这样做.也许其他人之所以持有,是因为应该将哪些元组纳入该关系中以及可能出现什么情况.您需要确定每个可能的FD是否成立.

当然,您可能已经被告知,在这种情况下,唯一持有的是那些持有的股票.但是您不会被告知,拥有的唯一FD是{group}->年,因为其中有 个琐碎的FD和 CK的每个超集从功能上确定每组列.

3NF的一个定义是,关系位于2NF中,并且没有非素数列在功能上依赖于任何CK. (请注意,每个条件都涉及其他定义.)如果要使用它来确定您的关系是否在3NF中,则接下来需要找出CK的 all .您可以通过适当的算法最快地执行此操作,但是您只能看到哪些列集在功能上确定了每一列,但其中不包含更小的此类集,因为它们是CK.然后检查定义中的两个条件.

如果要归一化为3NF,则需要遵循一种算法分解为3NF.您没有解释您认为应该遵循的流程.但是,如果您未遵循经过验证的算法,那么您选择的任何组件可能会或可能不会始终加入原始组件,并且可能会或可能不会都以任何特定的较高范式形式存在.请注意,您看到的分解示例不是分解算法的表示形式.


NF(标准格式)定义给出了关系必须满足的条件才能包含在该NF中.他们没有告诉您如何对较高NF的关系进行无损分解(尽可能保留FD).人们已经研究出了生成特定NF分解的算法. (而且分解为给定的NF通常不涉及先分解为较低的NF.经过较低的NF实际上可以防止在按较高的NF进行分解时生成原始的较高NF的良好分解.)

您可能还没有意识到,某些FD持有时,某些其他FD必须持有.后者可以通过前者的阿姆斯特朗公理来确定.因此,仅仅因为您分解摆脱了某个FD,而该FD的存在违反了特定的NF,并不意味着就不会有您不曾处理过的其他违反它的FD.它们可以存在于新组件中.否则它们可能不会以有问题的方式出现,从而使您无法在可能的情况下保留"它们,从而导致设计不佳.

在大学/大学教科书/课程/演示文稿中,了解特定的NF算法,以及有关NF和规范化本身的信息.许多在线.

I started to teach myself the basics of databases and i am currently working through 1. to 3. normal forms. What i understand until now is the wish to remove redundancy to make my databases less prone to inconsistency during phases of data-change as well as saving space by eliminating as much duplicates as possible.

For example if we have a table with the following columns:

  1. CD_ID
  2. title
  3. artist
  4. year

and change the design to have multiple tables where the first (CD) contains:

  1. CD_ID
  2. title
  3. artist_ID

the second (artist) contains:

  1. artist_ID
  2. artist
  3. year

I see that in the original table the year is transitively dependent on the ID via the artist. So we wanna get rid of that and create a table for the artists so our new CD table is now in third normal form.

But to do so i created another table (the artist table) which again is not in third normal form as far as I understand it, as we have the same type of transitive dependency like before just in another table.

Is this correct and if yes should i also normalize the artist table to be in 3rd NF? When do I stop?

解决方案

TL;DR You need to follow a published algorithm to decompose to a given normal form.

PS You didn't get Artist from the original CD via normalization, since you introduced a new column. But assume table Artist has the obvious meaning. Why do you think it "again is not in third normal form as far as I understand it"? If artist -> year in the original CD then it also does in Artist. But then {artist} is, with {artist_id}, a CK (candidate key) of Artist, and Artist is in 3NF (and 5NF).


From your question's original version plus the current one, you have a proposed base table CD with columns cd_id, title, group & year, holding tuples where cd cd_id titled title was made by group group that formed in year year. Column cd_id is unique, hence is a CK. FD {group} -> year also holds.

Normalization does not introduce new column names. It replaces a proposed base table by others, each with a smaller subset of its columns, that always join to what its value would have been. Normalization up to BCNF is based on FDs (functional dependencies), which are also what determine the CKs of a base table. So your question does not contain a decomposition. A possible decomposition reminiscent of your question, which might or might not have any particular properties, would be to tables with column sets {cd_id, title, group} and {group, year}.

Other FDs hold in the original. Some hold because of what the columns are; some hold because of the CK; some hold because {group} -> year holds; in general, certain ones hold because all three of those do. And maybe others hold because of what tuples are supposed to go into the relation and what situations can arise. You need to decide for every possible FD whether it holds.

Of course, you might have been told that the only ones that hold are the ones that have to hold under those circumstances. But you won't have been told that the only FD that holds is {group} -> year, because there are trivial FDs and every superset of a CK functionally determines every set of columns.

One definition of 3NF is that a relation is in 2NF and no non-prime column is transitively functionally dependent on any CK. (Notice each condition involves other definitions.) If you want to use this to find out whether your relation is in 3NF then you next need to find out what all the CKs are. You can do this fastest via an appropriate algorithm, but you can just see which sets of columns functionally determine every column but don't contain a smaller such set, since those are the CKs. Then check the two conditions in the definition.

If you want to normalize to 3NF then you need to follow an algorithm for decomposing to 3NF. You don't explain what process you think you should follow. But if you aren't following a proven algorithm then whatever components you pick might or might not always join to the original and might or might not each be in any particular higher normal form. Note that examples of decompositions you have seen are not presentations of decomposition algorithms.


The NF (normal form) definitions give conditions that a relation must meet to be in that NF. They don't tell you how to nonloss decompose (preserving FDs when possible) to relations in higher NFs. People have worked out algorithms for producing decompositions to particular NFs. (And decomposing to a given NF doesn't in general involve first decomposition to lower NFs. Going through lower NFs can actually prevent good higher-NF decompositions of the original from being generated when you get to decomposing per a higher NF.)

You may also not realize that when some FDs hold, certain other ones must hold. The latter can be determined via Armstrong's axioms from the former. So just because you decomposed to get rid of a particular FD whose presence violates a particular NF doesn't mean there weren't a bunch of other ones that violated it that you didn't deal with. They can be present in the new components. Or they can be not present in problematic ways, so that you have not "preserved" them when you could have, leading to poor designs.

Learn about specific NF algorithms, and for that matter NFs and normalization itself, in a college/university textbook/course/presentation. Many are online.

这篇关于在规范化过程中创建非规范化表的现有做法/准则?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆