How to set up Primary Keys in a Relation?

Views: 122. Published: 2017/3/22 1:12:50. Tags: database, database-design, relational-database, database-schema, entity-relationship


Problem description

I wish to know how to correctly set up Primary Keys in a Relation. E.g. we have ER-diagram which contain elements:


Key attributes
Weak key attributes
Identifying relationships
Associative entities


In order to translate it into Relational Model we should do some tricks. All elements above deal with Primary Keys of relations but they all are Natural Keys - so we can leave them as is or replace with Surrogate Keys.

Consider some cases.  

Case 1

Key Attribute is a name - so it must be of type CHAR or VARCHAR. Generally names become Key Attributes.

Case 2

Two (or more) Identifying Relationships become a Composite Primary Key of a relation (which is made of Foreign Keys).

Case 3

Identifying Relationship(s) with Weak Key Attribute(s) also become a Composite Primary Key.

Case 4

Associative entities usually have two or more Identifying Relationships so they are to be Junction Relations (Junction Tables).  


How to set up primary keys for Relations in order to handle all above cases (perhaps some more cases which I did not mention)?
How to avoid using surrogate keys and in which cases are they necessary?
How to set up datatypes for primary keys?
If a composite primary key has to be passed into child relation, shall it be replaced with a surrogate?


Advantages and disadvantages of using surrogate keys in my view:

Advantages  


They're compact (usually of type INT) and are sometimes good replacement for Composite Keys
They're illustrative when they're in Foreign Keys
They're painlessly indexed


Disadvantages


They're numbers and meaningless. E.g. if I wish to fill in a Junction Table from my interface application, I will be left with no other choice but to relate bare numbers
They're redundant
They're confusing


As for setting up datatypes, there must be more tricks, just as there are for setting up primary keys as a whole.

Update

I should have given an example initially, but I did not. So here's an example.
Consider that we have two main entities which interact with each other (I still don't know how to illustrate such things as diagrams here, so I'll show them as tables, which are meant to demonstrate an International Space Station crew rotation system):

SpaceShip
╔════════════════╤════════════════╗
║ ShipName       │ ShipType       ║ ShipName - Primary Key
╟────────────────┼────────────────╢ ShipType - Foreign Key (but it is
║ Soyuz TMA-14   │ Soyuz          ║   not being considered here)
║ Endeavour      │ Space Shuttle  ║
║ Soyuz TMA-15M  │ Soyuz          ║
║ Atlantis       │ Space Shuttle  ║
║ Soyuz TM-31    │ Soyuz          ║
║ ...            │ ...            ║
╚════════════════╧════════════════╝
And the Crew
╔════════╤══════════╗
║ CrewId │ CallSign ║ CrewId - Primary Key (an Id is used because a crew is
╟────────┼──────────╢   usually shown as its crew members - it has no particular
║ 4243   │ Astreus  ║   name)
║ 4344   │ Altair   ║ CallSign - attribute (it may not be assigned or
║ 4445   │ ...      ║   explicitly shown - i.e. it can be NULL)
║ ...    │ ...      ║
╚════════╧══════════╝
These two entities interact via Flight. Each flight delivers one crew to the ISS and returns another crew, or the same one. Obviously the relationship between Flight and Crew is many-to-many, and it needs a junction relation (table). But we cannot just relate SpaceShip and Crew directly, because a spaceship can be reusable (returnable), as the Space Shuttles were.

So the Flight should look like:
╔═══════════════╤════════════╤═══════════════╤═════╗
║ ShipName      │ FlightName │ ShipFlightNum │ ... ║ ShipName, FlightName
╟───────────────┼────────────┼───────────────┼─────╢   are composite PK
║ Soyuz TM-31   │ NULL       │ 1             │ ... ║ ShipFlightNum
║ Atlantis      │ STS-117    │ 28            │ ... ║   depends on whole
║ Soyuz TMA-14  │ NULL       │ 1             │ ... ║   Composite PK
║ Endeavour     │ STS-126    │ 22            │ ... ║ ... - other
║ Soyuz TMA-15M │ NULL       │ 1             │ ... ║   attributes which
║ Endeavour     │ STS-111    │ 18            │ ... ║   depend on PK
║ Atlantis      │ STS-122    │ 29            │ ... ║
║ ...           │ ...        │ ...           │ ... ║
╚═══════════════╧════════════╧═══════════════╧═════╝
So Flight has a Composite Primary Key (for a Soyuz vehicle the flight name is the same as the spacecraft's name, but it differs for reusable spacecraft such as the Space Shuttle), and it needs to be related to Crew as many-to-many. Here is part of my complex question: should this composite Natural Primary Key be replaced with a Surrogate one?

And if we keep working with Natural Keys, then the new Junction Relation (Associative Entity) should look like:

Designation (Crew is Designated to the Flight)
╔═══════════════╤════════════╤════════╤══════════╗
║ ShipName      │ FlightName │ CrewId │ CrewType ║
╟───────────────┼────────────┼────────┼──────────╢
║ Soyuz TMA-15M │ NULL       │ 4243   │ Deliver  ║
║ Soyuz TMA-15M │ NULL       │ 4243   │ Return   ║
║ Soyuz TMA-15M │ NULL       │ 4445   │ Backup   ║
║ Soyuz TMA-16M │ NULL       │ 4344   │ Deliver  ║
║ Soyuz TMA-17M │ NULL       │ 4445   │ Deliver  ║
║ Soyuz TMA-18M │ NULL       │ 4344   │ Return   ║
║ Endeavour     │ STS-111    │ 55     │ Deliver  ║
║ Endeavour     │ STS-111    │ 44     │ Return   ║
║ Endeavour     │ STS-113    │ 55     │ Return   ║
║ ...           │ ...        │ ...    │ ...      ║
╚═══════════════╧════════════╧════════╧══════════╝
Here we have a four-column Composite Primary Key which is made up of four Foreign Keys (CrewType also has an FK constraint). If we use Surrogates instead of Naturals then the result will be more compact but hard to fill in (in my view).

One more update

Another case for table (relation) TypeCrew:
╔══════════╗
║ CrewType ║
╟──────────╢
║ Deliver  ║
║ Return   ║
║ Backup   ║
║ ...      ║
╚══════════╝
Everything would be fine if only we did not have to use these values in our queries (WHERE CrewType LIKE 'Backup'). What if these values are replaced with equivalents in other languages, or even with symbols, e.g. >, < and ^ for Deliver, Return and Backup respectively (WHERE CrewType LIKE '^')? Adding a numerical Surrogate Key will not help much, as its values may not line up with TypeName from one version of the table to the next (WHERE TypeId=2):
╔════════╤══════════╗    ╔════════╤══════════╗    ╔════════╤══════════╗
║ TypeId │ TypeName ║    ║ TypeId │ TypeName ║    ║ TypeId │ TypeName ║
╟────────┼──────────╢    ╟────────┼──────────╢    ╟────────┼──────────╢
║ 0      │ Deliver  ║    ║ 0      │ Backup   ║    ║ 0      │ >        ║
║ 1      │ Return   ║    ║ 1      │ Deliver  ║    ║ 1      │ <        ║
║ 2      │ Backup   ║    ║ 2      │ Return   ║    ║ 2      │ ^        ║
║ ...    │ ...      ║    ║ ...    │ ...      ║    ║ ...    │ ...      ║
╚════════╧══════════╝    ╚════════╧══════════╝    ╚════════╧══════════╝
Perhaps this is not a question of the Relational Model? Perhaps it's just bad design? But I could not devise a better one.
Solution
Position

Any practice that is not based on solid theory is not worthy of consideration.  I am a strict Relational Model practitioner, with a strong grounding in the theory.  The Relational Model is based on solid theory, and has never been refuted¹.  There is nothing solid in what passes for "relational theory": I have taken its proponents on, and refuted their notions in their own space.  Further, Relational Database design is a science, not magic, not art², therefore I can provide evidence for any of the propositions or charges that I make. My answers are from that position.

^{1. There are non-science articles, and masses of opinions from those who do not understand the science, yes, but no scientific refutation.  Much like pygmies arguing that man cannot fly, it is "true" for them, but not true for mankind, it is based on a complete inability to understand the principle of flight.}

^{2. There is some art in the presentations of high-end practitioners, yes, but that does not make the science an art.  It is a science, and only a science, and over and above that, it can be artfully delivered, in models and databases.}

"Relational Theory"


  I wish to know how to correctly set up Primary Keys in a Relation. E.g. we have ER-diagram which contain elements:
If it was an ERD, then you wouldn't be looking at "relations", you would be looking at entities (if the diagram was early) or tables (if it were progressed).  "Relations" are a wonderful abstraction which have nothing to do with an implementation.  An ERD or a Data Model means an implementation (non-abstract, real) is intended, the intention to the physical leaves the abstract world of theory behind, and enters the physical world, where idiotic abstractions get destroyed.  

Further the "theoreticians" who allege to be serving the database space cannot differentiate between base relations and derived relations: while that might be acceptable in the abstract context, it is dead wrong in the implementation context.  Eg. base relations are tables, and they need to be Normalised; derived relations are, well, derived, views, of base relations, which by definition are flattened views (not "denormalised", which means something slightly different) of base relations.  As such, they need not be Normalised. 


But the "theoreticians" try to "normalise" derived relations.  And the most damaged two are trying to have the definition of 1NF, that we have had for forty five years, that is fundamental and rock solid, that they themselves have supported, changed, so that their derived relations, which do not need "normalisation", can be classified as "normalised".  It would be hilarious if it were not so sad.


One marvellous quality of objective truth, of science, is that it does not change. Subjective "truth", non-science, changes all the time.  One can be relied upon, it must be understood before a practice is undertaken, the other is not worth reading about.

Isolation

They live in a world of their own, isolated from the reality of Relational Databases, specifically the Relational Model, and the industry that they allege to serve.  In forty five years since the RM came out, they have done nothing to progress the RM or Relational databases. 


Mind you, they have been progressing all sorts of notions, which are outside the Relational Model. 
The progress of the RM (completion of what the Neanderthals suggest was "incomplete") has happened solely due to the standardisation (R Brown and others working with Codd, resulting in the IDEF1X Standard for Modelling Relational Databases), and the efforts of high-end SQL vendors and their customers. 
That is, the commercial RDBMS vendors, who were already established in the 1980s, not the Non-sql freeware/shareware/vapourware groups of the last decade, who pass off their wares as "sql", which gets you good and glued to their "platform", non-portable.


The worst part is, they publish books about their non-relational concepts, and fraudulently label them as "relational".  And "professors" blindly "teach" this nonsense, like parrots, without ever understanding either the nonsense, or the Relational Model that it is supposed to explore.


If you are trying to find answers to some "educational" project, sorry, I cannot provide that, because the "education", as you can see, is totally confused, and has non-relational requirements.
I can however, provide direct answers to the question, governed by science, the Relational Model, the laws of physics, etc.


The point to take from this is, while Relational Theory and Practice were very close after Dr E F Codd published his seminal work, and during the time that the SQL Platforms were developed by the vendors, in the post-Codd era, what passes for "relational theory" is completely divorced from that original Relational Theory. 


I can enumerate the differences, but not here.  Note that if you read my posts that touch on this subject, you can gather those particulars, and enumerate them yourself.  Or else ask a new question.


The Question


  I wish to know how to correctly set up Primary Keys in a Relation. E.g. we have ER-diagram which contain elements:
There is no ERD to examine.  Ok, in the Update you have an example.  Perfect for your questions, because it is a set of user views of the data, and the modelling can now begin.  But note, that is not an ERD or a Model.  We rely on understanding the data; analysing it; classifying it, not on looking at the data values with a microscope.  I realise that that is what you have been taught to do.

  In order to translate it into Relational Model
Yes, that is the stated goal.  The word "translate" is incorrect, because the RM is not merely a flat or fixed set of criteria that one "satisfies" or fits into (as it is known to the "theoreticians"), it also provides specific Methods and Rules.  Therefore, we will be Modelling, according to the Relational Model.

  we should do some tricks.
We don't need tricks, we use science, and only science.  The "theoreticians" and the "professors" who follow them, need tricks, and practice non-science.  I can't help in that regard.  Further, the tricks they use, are usually to circumvent and subvert the Relational Model, so watch out for them.

Surrogate


  All elements above deal with Primary Keys of relations but they all are Natural Keys - so we can leave them as is or replace with Surrogate Keys.
Well, there it is, your "teacher's" first trick is exposed.

Surrogates are physical Record (not row) pointers, they are not logical.
There is no such thing as a "surrogate key", the two words contradict each other. 


A Key has a specific definition in the RM, it has to be made up from the data.  A surrogate isn't made up from the data, it is manufactured, a meaningless number generated by the system.  Therefore it is not a Key or a "key".  
A Key in the RM has a number of Relational qualities, which makes Keys very powerful.  Since a surrogate is not a Key, it does not have any of those qualities, it has no Relational power.
Therefore, "surrogate" and Key each have specific meanings, and they are quite fine as separate terms, but together, they are self-contradictory, because they are opposites.
When people use the term "surrogate key", they naturally expect some, if not all, of the qualities of a Key.  But they will not obtain any of them.  Therefore they are defrauded.

The Relational Model (the one that the theoreticians know nothing about) has a specific Access Path Independence Rule.  As long as Relational Keys are used, this rule is maintained.  It provides Relational Integrity¹. 


The use of a surrogate violates this rule.  The consequence² is, Relational Integrity and Relational Navigation³ are both lost.  
The consequence of that is, many more joins are required to get at the same data (not fewer, as the lovers of mythology and magic keep parroting).
Therefore surrogates are not permitted, on another, separate count.
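The join argument above can be sketched with the question's Flight and Designation tables, built once with migrated natural keys and once with a surrogate. This is only an illustration under assumptions: the `Flight_s` and `Designation_s` tables, the `FlightId` column, and the sample rows are hypothetical and not part of the original answer, and SQLite (via Python's standard `sqlite3` module) stands in for whatever platform is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Natural-key variant: the parent key (ShipName, FlightName) migrates into the child.
CREATE TABLE Flight (
    ShipName   TEXT NOT NULL,
    FlightName TEXT NOT NULL,
    PRIMARY KEY (ShipName, FlightName)
);
CREATE TABLE Designation (
    ShipName   TEXT    NOT NULL,
    FlightName TEXT    NOT NULL,
    CrewId     INTEGER NOT NULL,
    CrewType   TEXT    NOT NULL,
    PRIMARY KEY (ShipName, FlightName, CrewId, CrewType),
    FOREIGN KEY (ShipName, FlightName) REFERENCES Flight (ShipName, FlightName)
);
-- Surrogate variant (hypothetical): the child carries only a meaningless FlightId.
CREATE TABLE Flight_s (
    FlightId   INTEGER PRIMARY KEY,
    ShipName   TEXT NOT NULL,
    FlightName TEXT NOT NULL
);
CREATE TABLE Designation_s (
    FlightId INTEGER NOT NULL REFERENCES Flight_s (FlightId),
    CrewId   INTEGER NOT NULL,
    CrewType TEXT    NOT NULL
);
INSERT INTO Flight VALUES ('Endeavour', 'STS-111'), ('Endeavour', 'STS-113');
INSERT INTO Designation VALUES
    ('Endeavour', 'STS-111', 55, 'Deliver'),
    ('Endeavour', 'STS-111', 44, 'Return'),
    ('Endeavour', 'STS-113', 55, 'Return');
INSERT INTO Flight_s VALUES (1, 'Endeavour', 'STS-111'), (2, 'Endeavour', 'STS-113');
INSERT INTO Designation_s VALUES (1, 55, 'Deliver'), (1, 44, 'Return'), (2, 55, 'Return');
""")

# Natural keys: the child row already carries the ship name, so no join is needed.
natural = conn.execute(
    "SELECT CrewId FROM Designation WHERE ShipName = ? ORDER BY CrewId",
    ("Endeavour",),
).fetchall()

# Surrogate: the ship name lives only in the parent, so the same question needs a join.
surrogate = conn.execute(
    """SELECT d.CrewId
       FROM Designation_s AS d
       JOIN Flight_s AS f ON f.FlightId = d.FlightId
       WHERE f.ShipName = ?
       ORDER BY d.CrewId""",
    ("Endeavour",),
).fetchall()
```

Both queries return the same crews; the only difference is how many tables have to be visited to answer the question.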

Since you are in the modelling stage, either conceptual or logical, and Keys are Logical, and surrogates are physical, surrogates should not come into the picture.  (They come into the picture, if at all, for consideration, only when the logical model is complete, and the physical model is being considered.)  You are nowhere near completion of the Logical, so the introduction of a surrogate should raise a red flag.  

The "teacher", and the author of the "textbook" that he is using, are frauds, on two separate counts: 


They are introducing a physical field, into the Logical exercise, which should not concern itself with physical aspects of the database. 
But in so doing, the effect they have is that they establish the surrogate, the physical thing, as a logical thing.  Thus they poison the mind.

There, straight science, pure logic, uncontaminated by insane thinking, and thus immune to the frauds.  No surrogates at the Logical stage.

^{1. Relational Integrity (which the Relational Model provides) is distinctly different to Referential Integrity (which SQL provides, and Record Filing Systems might have).  If you do not understand this, please open a new question "What is the difference ..." and ping me.}

^{2. Breaking any rule always has undesirable consequences, beyond the act itself.}

^{3. If you do not understand this, please open a new question "What is the Relational Navigation ..." and ping me.}

So the final answer to your question:

  All elements above deal with Primary Keys of relations but they all are Natural Keys - so we can leave them as is or replace with Surrogate Keys.
In the conceptual and logical exercise, we deal with Logical Keys only.  Physical concepts such as a surrogate are illegal.  The replacement of a Logical Key with a physical creature, in the Logical exercise is rejected.  Use the Keys you have, which are from the data, and natural.

Not a "Replacement"

There is one more point.  The term "replacement" is incorrect.  A surrogate is never a replacement or substitute for a Natural Key.


One of the many qualities that a natural Key provides, is row uniqueness, and that too, is demanded in the Relational Model, duplicate rows are not permitted.
Since a surrogate is not a Key to a row (it is a physical pointer to a record), it cannot provide the required row uniqueness.  If you do not fully understand what I am saying, please read this Answer, from the top to False Teachers.  Do test the given code exercises.
Therefore, a surrogate, even if considered, at the physical modelling stage, is always an additional column and index.  It is not a replacement for a natural Relational Key.
And conversely, if the surrogate is implemented as a replacement, the consequence is duplicate rows, a non-relational file, not a Relational table.
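The two situations in the list above can be tried out in a small sketch. It assumes the question's SpaceShip table, a hypothetical `ShipId` surrogate column, and SQLite through Python's standard `sqlite3` module; the table names with suffixes are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Surrogate used as a replacement: only the generated number is unique.
CREATE TABLE SpaceShip_replaced (
    ShipId   INTEGER PRIMARY KEY,
    ShipName TEXT NOT NULL,
    ShipType TEXT NOT NULL
);
-- Surrogate as an additional column: the natural Key is still declared.
CREATE TABLE SpaceShip_additional (
    ShipName TEXT    NOT NULL PRIMARY KEY,
    ShipType TEXT    NOT NULL,
    ShipId   INTEGER NOT NULL UNIQUE
);
""")

def store_same_ship_twice(table: str) -> int:
    """Try to store the same ship under two surrogate values; return rows kept."""
    for ship_id in (1, 2):
        try:
            conn.execute(
                f"INSERT INTO {table} (ShipId, ShipName, ShipType) VALUES (?, ?, ?)",
                (ship_id, "Atlantis", "Space Shuttle"),
            )
        except sqlite3.IntegrityError:
            pass  # the natural Key on ShipName rejected the duplicate row
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

rows_replaced = store_same_ship_twice("SpaceShip_replaced")      # duplicate is accepted
rows_additional = store_same_ship_twice("SpaceShip_additional")  # duplicate is rejected
```

With the surrogate as the only declared key, the same ship is stored twice; with the natural Key declared and the surrogate added alongside it, the second row is refused.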


Case 1


  Key Attribute is a name - so it must be of type CHAR or VARCHAR. Generally names become Key Attributes.
Yes.  

Often they are codes (users do use codes).  Often Codes jump out at you (you have a very good example in your One More Update).  { D | R | B } would do just as well as { < | ^ | > }.  This is of course towards the end of the logical model stage, when the model is stable, and one is finalising the Keys and optimising them.  For any stage earlier than that, the wide Natural Keys stand.

The idea is to keep it meaningful.


Keys have meaning (surrogates have no meaning).  One of the qualities of a Relational Key is, that that meaning is carried, wherever the Key is migrated as a Foreign Key.
And as per your example, wherever it is used.  Including program code.  Writing:
 IF CrewType = "Backup"  -- meaningful but fixes a value
 IF CrewType = 1         -- meaningless
is just plain wrong.  Because (a) that is not really a Key, and (b) the user may well change the value of that datum from Backup to Reserve, etc.  Never write code that addresses a data value, a descriptor.  So the fact is, Backup is the projection of the Key, the exposition, and the code is the Key.  That resolves to CrewType.Name, and the Key is CrewTypeCode.
     IF CrewTypeCode = "B"   -- Key, meaningful, not fixed
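As a rough sketch of the difference between addressing the descriptor and addressing the Key, the following assumes a CrewType table with the `CrewTypeCode` and `Name` columns named above and the D/R/B codes suggested earlier. The helper function names are hypothetical, and SQLite via Python's standard `sqlite3` module is used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CrewType (
    CrewTypeCode TEXT NOT NULL PRIMARY KEY,  -- the Key: short, meaningful, stable
    Name         TEXT NOT NULL UNIQUE        -- the descriptor shown to users
);
INSERT INTO CrewType VALUES ('D', 'Deliver'), ('R', 'Return'), ('B', 'Backup');
""")

def find_by_descriptor():
    # Code that addresses a data value: breaks when the user edits the value.
    return conn.execute(
        "SELECT CrewTypeCode FROM CrewType WHERE Name = 'Backup'"
    ).fetchall()

def find_by_key():
    # Code that addresses the Key: unaffected by changes to the descriptor.
    return conn.execute(
        "SELECT Name FROM CrewType WHERE CrewTypeCode = 'B'"
    ).fetchall()

before = (find_by_descriptor(), find_by_key())

# A user renames the descriptor, as in the Backup -> Reserve example.
conn.execute("UPDATE CrewType SET Name = 'Reserve' WHERE CrewTypeCode = 'B'")

after = (find_by_descriptor(), find_by_key())
```

After the rename, the lookup by descriptor silently finds nothing, while the lookup by Key still finds the row and simply reports the new name.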
While we are on Keys, please note:

In the Relational Model, we have Primary Keys, Alternate Keys, and Foreign Keys (migrated Primary Keys).
We do not have "candidate keys", no such thing is defined in the RM.  It is something manufactured outside the RM.  It is therefore non-relational.  

Worse, they are used by people who implement surrogates as "primary keys"^a.
A physical consideration ^b, but one that should be understood and applied throughout the exercise.  When the data is understood and known, the columns will be fixed length.  When they are unknown, they might be variable.  For Keys, given that they will be indexed, at least on the Primary side, they should never be variable, because that requires unpacking on every access.
^{a. The use of the SQL keyword PRIMARY KEY does not magically transform a surrogate into a PK.  If one follows the RM, one (a) determines the possible Keys (no surrogates), and then (b) chooses one as Primary, which (c) means the election is over, therefore (d) the nominated candidates can no longer be called "candidates", the event is history, therefore (e) the remainder, the non-primary Keys, are Alternate Keys.}

^{"Candidate key" is a refusal to conform to the RM and nominate a PK, therefore, in and of itself, it is non-relational.  Separate to the fact that they have a surrogate as "primary key", which is a second non-relational item.}

^{b.  For those non-technical people who believe that no technical knowledge and foresight, no physical considerations at all, should be evaluated during the logical, that's fine, evaluate them at the physical. Since I am not addressing the physical here, I am just making a note for Umbra.}

Magicians rely on their tricks, to make bunny rabbits look like lions.  Scientists do not need them.

Case 2


  Two (or more) Identifying Relationships become a Composite Primary Key of a relation (which is made of Foreign Keys).
I think you have the right idea, but the wording is incorrect for the generic case.


That wording is correct for an Associative Table, which has two Foreign Keys.  Yes, in that case, the two FKs form the PK, which is all that is needed for row uniqueness.  Nothing can better that.  The addition of a Record ID is superfluous.
For the generic case, for any table:


An Identifying Relationship¹ causes the FK (migrated parent PK) to be part of the PK in the child.  Hence the name, the parent Identifies the child.
That makes the child a Dependent¹ table, meaning that the child rows can exist only in the context of a parent row.  Such tables form the intermediate and leaf nodes in the Data Hierarchies, they are the majority of tables in a Relational database.
If the row can exist independently, the table is Independent¹.  Such tables form the top of each Data Hierarchy, there are very few in a Relational database.
A Non-identifying Relationship¹ is one where the FK (migrated parent PK), is not used to form the child PK.
Compound or Composite Keys are simply made up of more than one column, they are standard fare in Relational databases.  Every table except the top of each Data Hierarchy will have a Compound Key.  If you do not have any, the database is not Relational.
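An Identifying Relationship as described in the list above might be declared like this. The sketch assumes the question's SpaceShip and Flight tables with (ShipName, FlightName) as the Flight key; the `rejected` helper and the sample rows are hypothetical, and SQLite via Python's standard `sqlite3` module is just one convenient platform for trying it out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to
conn.executescript("""
CREATE TABLE SpaceShip (                 -- Independent: top of the hierarchy
    ShipName TEXT NOT NULL PRIMARY KEY
);
CREATE TABLE Flight (                    -- Dependent: identified by its parent
    ShipName   TEXT NOT NULL REFERENCES SpaceShip (ShipName),
    FlightName TEXT NOT NULL,
    PRIMARY KEY (ShipName, FlightName)   -- migrated parent PK plus own column
);
INSERT INTO SpaceShip VALUES ('Atlantis');
INSERT INTO Flight VALUES ('Atlantis', 'STS-117'), ('Atlantis', 'STS-122');
""")

def rejected(sql: str, params: tuple) -> bool:
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False

# A child row cannot exist without its parent row ...
orphan_rejected = rejected("INSERT INTO Flight VALUES (?, ?)", ("Columbia", "STS-1"))
# ... and the compound Key gives row uniqueness within the parent.
duplicate_rejected = rejected("INSERT INTO Flight VALUES (?, ?)", ("Atlantis", "STS-117"))
flight_count = conn.execute("SELECT COUNT(*) FROM Flight").fetchone()[0]
```

The parent's key appears in the child's key, so the child is both constrained to an existing parent and unique within it, without any extra Record ID.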



Please read my IDEF1X Introduction carefully.

^{1. The "theoreticians" do not differentiate Identifying vs Non-identifying, or Dependent vs Independent: all their files are Independent; all their "relationships" between record pointers are Non-identifying.  It is a regression to the pre-1970's ISAM Record Filing Systems, devoid of Relational Integrity, power, and speed.  That is all they understand, that is all they can teach.  Fraudulently labelled as "relational".}

Case 3


  Identifying Relationship(s) with Weak Key Attribute(s) also become a Composite Primary Key.
The term "weak" with or without a relationship to "key" is not defined in the Relational Model.  It is a fiction of the "theoreticians".  Thus I cannot answer that question.


I do note that some of the "theoretical" papers present strong Keys (normal English word, describing the fact that the Key has been established previously) as "weak", and weak "keys" (normal English word, describing the fact that the "key" has not been established previously) as "strong".  Such is the nature of schizophrenia.
Therefore I suspect that it is part and parcel of their evidenced attempt to confuse the science with non-science, and to undermine the Relational Model.  In the old days, when such people were locked up, humanity was healthy.  Now they write books and teach in colleges.


Case 4


  Associative entities usually have two or more Identifying Relationships
Yes.  Two is correct.

If you have more than two, then that is not fully Normalised.  Codd gives an explicit method to Normalise that, such that there will be two (or more) Associative entities, of exactly two Identifying relationships each.


"... therefore, all n-ary (more than two) relations ... can be ... and should be, resolved to binary (two) relations."

(paraphrased for this context)



  so they are to be Junction Relations (Junction Tables).
No.  "Junction" relations and "junction" tables are not defined in the Relational Model, therefore they are non-relational.  

Associative Entities in the logical become Associative Tables in the physical.

Answer Too Long

The completion of the answer exceeded the limit for SO answers.  Therefore I have placed the Answer in a single document, and provided a link.  Splitting the Answer at this point proved to be a sin, thus the document contains the entire answer, with consistent formatting, etc:

Complete Answer


To continue from this point (ie. the SO Answer text, above), simply scroll down to the Case 4 heading.
There is a value in retaining the above SO Answer text, not only for historical purposes, but for text searches, etc.

