How to reference groups of records in relational databases?


Question

Suppose we have the following table structures:

Humans


| HumanID | FirstName | LastName | Gender |
|---------+-----------+----------+--------|
|       1 | Isaac     | Newton   | M      |
|       2 | Marie     | Curie    | F      |
|       3 | Tim       | Duncan   | M      |

Animals


| AnimalID | Species | NickName |
|----------+---------+----------|
|        4 | Tiger   | Ronnie   |
|        5 | Dog     | Snoopy   |
|        6 | Dog     | Bear     |
|        7 | Cat     | Sleepy   |

I wonder how to reference a group of records in other tables.

For example, I have a Foods table and an EatenBy column.

Foods


| FoodID | FoodName | EatenBy |
|--------+----------+---------|
|      8 | Rice     | ???     |

What I want to store in EatenBy may be


  1. a single record in the Humans or Animals tables (e.g. Tim Duncan)
  2. a group of records in a table (e.g. all dogs, all males, all females)
  3. a whole table (all humans, for example)

A simple solution is to use a concatenated string, which includes primary keys from different tables and special strings such as 'Humans', 'M'. The application could parse the concatenated string and do things accordingly.

Foods


| FoodID | FoodName | EatenBy      |
|--------+----------+--------------|
|      8 | Rice     | Humans, 6, 7 |


I know using a concatenated string is a bad idea from the perspective of relational database design.

Another option is to add another table and use a foreign key.

Foods


| FoodID | FoodName |
|--------+----------|
|      8 | Rice     |

EatenBy


| FoodID | EatenBy |
|--------+---------|
|      8 |  Humans |
|      8 |       6 |
|      8 |       7 |

I think it's better than the first solution. The problem is that the EatenBy field stores values of different meanings. Is that a problem?

What is the best way to model this requirement? How to achieve 3NF in this case?

The example tables here are a bit contrived, but I do run into situations like this at work. I have seen quite a few tables just use a concatenated string. I think it is bad but can't think of a more relational way to deal with it.

For those who want the answer to this problem quickly, here are the basic ideas:


  1. Concatenated strings, CSVs, repeating groups should not be used in the Relational Model. So we need to normalize the table, eliminating repeating groups.
  2. In this problem, the concatenated string contains values of different meanings.
    Relational databases store facts about the real world. Each table should store facts about ONE subject and each column should store ONE fact.
    Thus, we need one associative table for each fact to fix "EatenBy".
    e.g.
    Food_Human (FoodID, HumanID)
    Food_Animal (FoodID, AnimalID)
    Food_Species (FoodID, SpeciesID)
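
A minimal sketch of the associative-table idea, using Python's built-in `sqlite3` module. The `Species` table, its `SpeciesID` values, and the helper `animals_eating` are assumptions made for illustration; they are not part of the original question. The other IDs reuse the sample data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Human   (HumanID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, Gender TEXT);
CREATE TABLE Species (SpeciesID INTEGER PRIMARY KEY, SpeciesName TEXT NOT NULL UNIQUE);
CREATE TABLE Animal  (AnimalID INTEGER PRIMARY KEY,
                      SpeciesID INTEGER NOT NULL REFERENCES Species(SpeciesID),
                      NickName TEXT);
CREATE TABLE Food    (FoodID INTEGER PRIMARY KEY, FoodName TEXT NOT NULL);

-- One associative table per fact; every column has exactly one meaning.
CREATE TABLE Food_Human   (FoodID  INTEGER NOT NULL REFERENCES Food(FoodID),
                           HumanID INTEGER NOT NULL REFERENCES Human(HumanID),
                           PRIMARY KEY (FoodID, HumanID));
CREATE TABLE Food_Animal  (FoodID   INTEGER NOT NULL REFERENCES Food(FoodID),
                           AnimalID INTEGER NOT NULL REFERENCES Animal(AnimalID),
                           PRIMARY KEY (FoodID, AnimalID));
CREATE TABLE Food_Species (FoodID    INTEGER NOT NULL REFERENCES Food(FoodID),
                           SpeciesID INTEGER NOT NULL REFERENCES Species(SpeciesID),
                           PRIMARY KEY (FoodID, SpeciesID));

INSERT INTO Human   VALUES (1,'Isaac','Newton','M'), (2,'Marie','Curie','F'), (3,'Tim','Duncan','M');
INSERT INTO Species VALUES (1,'Tiger'), (2,'Dog'), (3,'Cat');   -- hypothetical IDs
INSERT INTO Animal  VALUES (4,1,'Ronnie'), (5,2,'Snoopy'), (6,2,'Bear'), (7,3,'Sleepy');
INSERT INTO Food    VALUES (8,'Rice');

INSERT INTO Food_Human   VALUES (8, 3);   -- Rice is eaten by Tim Duncan
INSERT INTO Food_Animal  VALUES (8, 7);   -- ... by Sleepy the cat
INSERT INTO Food_Species VALUES (8, 2);   -- ... and by all dogs
""")


def animals_eating(conn, food_id):
    """Return the AnimalIDs that eat a food, named directly or via their species."""
    rows = conn.execute(
        """
        SELECT AnimalID FROM Food_Animal WHERE FoodID = ?
        UNION
        SELECT a.AnimalID
        FROM Food_Species fs
        JOIN Animal a ON a.SpeciesID = fs.SpeciesID
        WHERE fs.FoodID = ?
        """,
        (food_id, food_id),
    ).fetchall()
    return sorted(r[0] for r in rows)


print(animals_eating(conn, 8))  # [5, 6, 7]
```

Because each associative table carries real foreign keys, the DBMS rejects a row such as `Food_Human (8, 99)` on its own, which is exactly the integrity a concatenated string cannot give.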


Caveat

The ideas for solving the problem are good. However, the data model here is awful.

Obvious problems:

  1. Humans are Animals
  2. Using surrogates (IDs) during data modeling is bad practice


Recommended answer


  1. This Answer is laid out in chronological order. The Question progressed in terms of detail, noted as Updates. There is a series of matching Responses.

The progression from the initial question to the final answer stands as a learning experience, especially for OO/ORM types. Major headings mark Responses, minor headings mark subjects.

The Answer exceeds the maximum length. I provide the parts as links in order to overcome that.



Response to Initial Question

You might have seen something like that at work, but that doesn't mean it was right, or acceptable. CSVs break 2NF. You can't search that field easily. You can't update that field easily. You have to manage the content (eg. avoid duplicates; ordering) manually, via code. You don't have a database or anything resembling one, you have a grand Record Filing System that you have to write mountains of code to "process". Just like the bad old days of the 1970's ISAM data processing.


  1. The problem is, that you seem to want a relational database. Perhaps you have heard of the data integrity, the relational power (Join power for you, at this stage), and speed. A Record Filing System has none of that.

If you want a Relational database, then you are going to have to:


  • think about the data relationally, and apply Relational Database Methods, such as modelling the data, as data, and nothing but data (not as data values).

  • Then classifying the data (no relation whatever to the OO class or classifier concept).

  • Then relating the classified data.

The second problem is, and this is typical of OO types, they concentrate on, obsess on, the data values, rather than on the meaning of the data; how it is classified; how it relates to other data; etc.

No question, you did not think that concept up yourself, your "teachers" fed it to you, I see it all the time. And they love the Record Filing Systems. Notice, instead of giving table definitions, you state that you give "structure", but instead you list data values.


  • In case you don't appreciate what I am saying, let me assure you that this is a classic problem in the OO world, and the solution is easy, if you apply the principles. Otherwise it is an endless mess in the OO stack. Recently I completely eliminated an OO proposal + solution that a very well known mathematician, who supports the OO monolith, proposed. It is a famous paper.

I relationalised the data (ie. I simply placed the data in the Relational context: modelled and Normalised it, which took a grand total of ten minutes), and the problem disappeared, the proposal + solution was not required. Read the Hidders Response. Note, I was not attempting to destroy the paper, I was trying to understand the data, which was presented in schizophrenic form, and the easiest way to do that is to erect a Relational data model. That simple act destroyed the paper.

Please note that the link is an extract of a formal report of a paid assignment for a customer, a large Australian bank, who has kindly given me permission to publish the extract with a view to educating the public about the dangers of ignoring Relational database principles, especially by OO proponents.

The exact same process happened with a second, more famous paper Kohler Response. This response is much smaller, less formal, it was not paid work for a customer. That author was theorising about yet another abnormal "normal form".

Therefore, I would ask you to:


  • forget about "table structures" or definitions

  • forget about what you want

  • forget about implementation options

  • forget about ID columns, completely and totally

  • forget about EatenBy

  • think about what you have in terms of data, the meaning of the data, not as data values or example data, not as what you want to do with it

  • think about how that data is classified, and how it can be classified

  • think about how the data relates to other data. (You may think that your EatenBy is that, but it isn't, because the data has no organisation yet to form relationships upon.)

If I look at my crystal ball, most of it is dark, but from the little flecks of light that I can see, it looks like you want:


  • Things

  • ThingGroups

  • Relationships between Things and ThingGroups

The Things are nouns, subjects. Eventually we will be doing something between those subjects; those will be verbs or action statements. That will form Predicates (First Order Logic). But not now; for now, we want only the Things.

Now if you can modify your question and tell me more about your Things, and what they mean, I can give you a complete data model.

If you want a Relational Database, you need Relational Keys, not Record IDs. Additionally, starting the Data Modelling exercise with an ID stamped on every file cripples the exercise.

Please read this Answer.

If you want a full discourse, please ask a new question. Here is a quick summary.

Hierarchies occur naturally in the world, they are everywhere. That results in hierarchies being implemented in many databases. The Relational Model was founded on, and is a progression of, the Hierarchical Model. It supports hierarchies brilliantly. Unfortunately the famous writers do not understand the RM, they teach only pre-1970s Record Filing Systems badged as "relational". Likewise, they do not understand hierarchies, let alone hierarchies as supported in the RM, so they suppress it.

The result is that the hierarchies that are everywhere, and that have to be implemented, are not recognised as such, and thus they are implemented in a grossly incorrect and massively inefficient manner.

Conversely, if the hierarchy that occurs in the data that is being modelled, is modelled correctly, and implemented using genuine Relational constructs (Relational Keys, Normalisation, etc) the result is an easy-to-use and easy-to-code database, as well as being devoid of data duplication (in any form) and extremely fast. It is quite literally Relational at its best.

There are three types of Hierarchies that occur in data.


  1. Hierarchy Formed in Sequence of Tables

This requirement, the need for Relational Keys, occurs in every database, and conversely, the lack of it cripples the database and produces a Record Filing System, with none of the integrity, power or speed of a Relational Database.

The hierarchy is plainly visible in the form of the Relational Key, which progresses in compounding, in any sequence of tables: father, son, grandson, etc. This is essential for ordinary Relational data integrity, the kind that Hidders and 95% of the database implementations do not have.

The Hidders Response has a great example of Hierarchies:

a. that exist naturally in the data

b. that OO types are blind to [as Hidders evidently is]

c. that they implement as an RFS with no integrity, and then try to "fix" in the object layers, adding even more complexity.

Whereas I implemented the hierarchy in a classic Relational form, and the problem disappeared entirely, eliminating the proposed "solution", the paper. Relational-isation eliminates theory.

The two hierarchies in those four tables are:

    Domain::Animal::Harvest

    Domain::Activity::Harvest

Note that Hidders is ignorant of the fact that the data is an hierarchy; that his RFS doesn't have integrity precisely because it is not Relational; that placing the data in the Relational context provides the very integrity he is seeking outside it; that the Relational Model eliminates all such "problems", and makes all such "solutions" laughable.

Here's another example, although the modelling is not yet complete. Please make sure to examine the Predicates, and page 2 for the actual Keys. The hierarchies are:

    Subject::CategorySubject::ExaminationResult

    Category::CategorySubject::ExaminationResult

    Person::Registrant::Candidate::ExaminationResult

Note that last one is a progression of state of the business instrument, thus the Key does not compound.
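
To make "the Key progresses in compounding" concrete, here is a small sketch with Python's built-in `sqlite3`. The `Customer`, `SalesOrder` and `OrderItem` tables and both helper functions are hypothetical; they are not the tables from the linked data models, only an assumed father, son, grandson sequence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
-- Father
CREATE TABLE Customer (
    CustomerCode TEXT PRIMARY KEY
);
-- Son: the key is the father's key plus one more column
CREATE TABLE SalesOrder (
    CustomerCode TEXT    NOT NULL REFERENCES Customer(CustomerCode),
    OrderNo      INTEGER NOT NULL,
    PRIMARY KEY (CustomerCode, OrderNo)
);
-- Grandson: the key is the son's key plus one more column
CREATE TABLE OrderItem (
    CustomerCode TEXT    NOT NULL,
    OrderNo      INTEGER NOT NULL,
    ItemNo       INTEGER NOT NULL,
    PRIMARY KEY (CustomerCode, OrderNo, ItemNo),
    FOREIGN KEY (CustomerCode, OrderNo)
        REFERENCES SalesOrder(CustomerCode, OrderNo)
);

INSERT INTO Customer   VALUES ('ACME'), ('GLOBEX');
INSERT INTO SalesOrder VALUES ('ACME', 1), ('ACME', 2), ('GLOBEX', 1);
INSERT INTO OrderItem  VALUES ('ACME', 1, 1), ('ACME', 1, 2), ('ACME', 2, 1), ('GLOBEX', 1, 1);
""")


def count_items(conn, customer_code):
    """Count a customer's items from the grandson table alone, with no joins."""
    return conn.execute(
        "SELECT COUNT(*) FROM OrderItem WHERE CustomerCode = ?",
        (customer_code,),
    ).fetchone()[0]


def orphan_rejected(conn):
    """Try to insert an item for an order that does not exist."""
    try:
        conn.execute("INSERT INTO OrderItem VALUES ('ACME', 9, 1)")
    except sqlite3.IntegrityError:
        return True
    return False


print(count_items(conn, "ACME"))   # 3
print(orphan_rejected(conn))       # True
```

Since the grandson carries the whole ancestry in its key, a question about the grandfather can be answered from the grandson table directly, and an item can never be attached to an order that belongs to a different customer.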

  2. Hierarchy of Rows within One Table

Typically a tree structure of some sort, there are literally millions of them. For any given Node, this supports a single ancestor or parent, and unlimited children. Done properly, there is no limit to the number of levels, or the height of the tree (ie. unlimited ancestor and progeny generations).


  • The terms ancestor and descendant used here are plain technical terms; they do not have the OO connotations and limitations.

You do need recursion in the server, in order to traverse the tree structure, so that you can write simple procs and functions that are recursive.
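
As a rough illustration of server-side recursion over a single-parent tree, the sketch below uses a recursive common table expression in SQLite (via Python's `sqlite3`) in place of a recursive stored procedure, since SQLite has no procs. The `Node` table, its columns, and the `descendants` helper are assumptions made for this example, not the linked Message model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Node (
    NodeName   TEXT PRIMARY KEY,
    ParentName TEXT REFERENCES Node(NodeName)   -- NULL only for a root
);
INSERT INTO Node VALUES
    ('root', NULL),
    ('a', 'root'), ('b', 'root'),
    ('a1', 'a'), ('a2', 'a'),
    ('a1x', 'a1');
""")


def descendants(conn, name):
    """Return (NodeName, Level) for every descendant of `name`, nearest first."""
    return conn.execute(
        """
        WITH RECURSIVE Descendant(NodeName, Level) AS (
            -- anchor: the direct children
            SELECT NodeName, 1 FROM Node WHERE ParentName = ?
            UNION ALL
            -- recursive step: children of rows already found
            SELECT n.NodeName, d.Level + 1
            FROM Node n
            JOIN Descendant d ON n.ParentName = d.NodeName
        )
        SELECT NodeName, Level FROM Descendant ORDER BY Level, NodeName
        """,
        (name,),
    ).fetchall()


print(descendants(conn, "root"))
# [('a', 1), ('b', 1), ('a1', 2), ('a2', 2), ('a1x', 3)]
```

Nothing in the table limits the height of the tree; the recursion simply continues until no further children are found.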

Here is one for Messages. Please read both the question and the Answer, and visit the linked Message Data Model. Note that the seeker did not mention Hierarchy or tree, because the knowledge of Hierarchies in Relational Databases is suppressed, but (from the comments) once he saw the Answer and the Data Model, he recognised it for the hierarchy that it is, and that it suited him perfectly. The hierarchy is:

    Message::Message[Message]::Message[::Message[Message]] ...


  3. Hierarchy of Rows within One Table, via an Associative Table

    This hierarchy provides an ancestor/descendant structure for multiple ancestors or parents. It requires two relationships, therefore an additional Associative Table is required. This is commonly known as the Bill of Materials structure. Unlimited height, recursively traversed.

    The Bill of Materials Problem was a limitation of Hierarchical DBMS, that we overcame partially in Network DBMS. It was a burning issue at the time, and one of IBM's specific problems that Dr E F Codd was explicitly charged to overcome. Of course he met those goals, and exceeded them spectacularly.

    Here is a Bill of Materials hierarchy, modelled and implemented correctly.

    • Please excuse the preamble, it is from an article, skip the top two rows, look at the bottom row.

    Person::Progeny is also given.

    The hierarchies are:

    Part[Assembly]::Part[Component] ...
    
    Part[Component]::Part[Assembly] ...
    
    Person[Parent]::Person[Child] ...
    
    Person[Child]::Person[Parent] ...
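
A minimal sketch of the Bill of Materials structure, assuming a `Part` table and an associative table named `Assembly` with two foreign keys back to `Part` (all names, the sample parts, and both helpers are hypothetical). It uses Python's built-in `sqlite3` and recursive common table expressions to walk the hierarchy in both directions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Part (PartCode TEXT PRIMARY KEY);

-- Associative table: two relationships to Part, one per role.
CREATE TABLE Assembly (
    AssemblyCode  TEXT    NOT NULL REFERENCES Part(PartCode),
    ComponentCode TEXT    NOT NULL REFERENCES Part(PartCode),
    Quantity      INTEGER NOT NULL,
    PRIMARY KEY (AssemblyCode, ComponentCode)
);

INSERT INTO Part VALUES ('bike'), ('cart'), ('wheel'), ('frame'), ('rim'), ('spoke');
INSERT INTO Assembly VALUES
    ('bike',  'wheel', 2), ('bike',  'frame', 1),
    ('cart',  'wheel', 4),                         -- wheel has two parents
    ('wheel', 'rim',   1), ('wheel', 'spoke', 32);
""")


def components(conn, part):
    """Part[Assembly]::Part[Component]: everything that goes into `part`."""
    rows = conn.execute(
        """
        WITH RECURSIVE Explosion(PartCode) AS (
            SELECT ComponentCode FROM Assembly WHERE AssemblyCode = ?
            UNION
            SELECT a.ComponentCode
            FROM Assembly a JOIN Explosion e ON a.AssemblyCode = e.PartCode
        )
        SELECT PartCode FROM Explosion ORDER BY PartCode
        """,
        (part,),
    ).fetchall()
    return [r[0] for r in rows]


def where_used(conn, part):
    """Part[Component]::Part[Assembly]: everything that `part` goes into."""
    rows = conn.execute(
        """
        WITH RECURSIVE Usage(PartCode) AS (
            SELECT AssemblyCode FROM Assembly WHERE ComponentCode = ?
            UNION
            SELECT a.AssemblyCode
            FROM Assembly a JOIN Usage u ON a.ComponentCode = u.PartCode
        )
        SELECT PartCode FROM Usage ORDER BY PartCode
        """,
        (part,),
    ).fetchall()
    return [r[0] for r in rows]


print(components(conn, "bike"))    # ['frame', 'rim', 'spoke', 'wheel']
print(where_used(conn, "spoke"))   # ['bike', 'cart', 'wheel']
```

The same two-foreign-key shape would serve for `Person[Parent]::Person[Child]`; only the table and role names change.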
    




    Ignorance Of Hierarchy

    Separate to the fact that hierarchies commonly exist in the data, that they are not recognised as such, due to the suppression, and that therefore they are not implemented as hierarchies, when they are recognised, they are implemented in the most ridiculous, ham-fisted ways.


    • Adjacency List

    The suppressors hilariously state that "the Relational Model doesn't support hierarchies", in denial that it is founded on the Hierarchical Model (each of which provides plain evidence that they are ignorant of the basic concepts in the RM, which they allege to be postulating about). So they can't use the name. This is the stupid name they use.

    Generally, the implementation will have recognised that there is an hierarchy in the data, but the implementation will be very poor, limited by physical Record IDs, etc, absent of Relational Integrity, etc.

    And they are clueless as to how to traverse the tree, that one needs recursion.

    • Nested Sets

    An abortion, straight from hell. A Record Filing System within a Record Filing system. Not only does this generate masses of duplication and break Normalisation rules, this fixes the records in the filing system in concrete.

    Moving a single node requires the entire affected part of the tree to be re-written. Beloved of the Date, Darwen and Celko types.

    The MS HIERARCHYID Datatype does the same thing. Gives you a mass of concrete that has to be jack-hammered and poured again, every time a node changes.

    Ok, so it was not so short.
