如何在表上建立多个m:n关系的数据库 [英] How to model a database with many m:n relations on a table

查看:308
本文介绍了如何在表上建立多个m:n关系的数据库的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我目前正在建立一个数据库,它具有大量的多对多关系。每个关系都通过链接表建模。例如:



一个人有一些工作,工作由一些人履行。一个人有一些房屋,房屋被一些人占用。一个人有一些他喜欢的餐馆,餐厅有很多人喜欢这家餐厅。



首先我设计如下:



表:人,工作,房屋,餐厅,Person_Job,Person_House,Person_Restaurant。



关系1 - n:个人 - > Person_Job, - > Person_House,Person - > Person_Restaurant,Job - > Person_Job,House - > Person_House,Restaurant - > Person_Restaurant。



这很快导致拥挤而复杂的ER模型



试图简化这个我建模如下:



表格:人,工作,房子,餐厅,Person_Attributes



关系1 - n:Person - > Person_Attributes,Job - > Person_Attributes,House - > Person_Attributes,Restaurant - > Person_Attributes



Person_Attributes表应如下所示:
personId
jobId
houseId
restaurantId



如果一个人 - 工作关系存在,我会添加一个条目,如下所示:



P1,J1,NULL,NULL



如果存在一个人与房屋关系,我将添加一个条目,如下所示:



P1,NULL,H1,NULL



所以第二个示例中的属性表将具有与添加的第一个示例的链接表相同的条目数。



简单地说,ER模型很多,只要我为personId + jobId,personId + houseId和personId + restaurantId创建索引,我认为不会有太多的性能影响。



我的问题是:
第二种方法是正确的建模方式吗?如果没有,为什么?对于性能影响我是对的吗?如果没有,为什么?为什么?



MySQL Workbench的例子我可以在这里找到:



http://www.2shared.com/file/3GBnodEZ/example.html

解决方案

您的设计违反了第四正常表格。您正试图将多个事实存储在一个表中,并导致异常。


Person_Attributes表应该是这样的:personId jobId houseId restaurantId


所以如果我联系一个工作,一个房子,但是两个餐馆,我存储以下? / p>

  personId jobId houseId restaurantId 
1234 42 87 5678
1234 42 87 9876

如果我添加了第三家餐厅,我会复制其他列?

  personId jobId houseId restaurantId 
1234 123 87 5678
1234 123 87 9876
1234 42 87 13579

完成!哦,等等,发生了什么事?在添加新餐厅的同时我改变了工作。现在我不正确地关联了两个工作,但没有办法区分正确的 正好与两个作业相关联。



另外,即使与两个作业相关联也是正确的,数据看起来不是这样吗?

  personId jobId houseId restaurantId 
1234 123 87 5678
1234 123 87 9876
1234 123 87 13579
1234 42 87 5678
1234 42 87 9876
1234 42 87 13579

它开始看起来像一个笛卡尔乘积。事实上,这是因为这个表试图存储多个独立的事实。



正确的关系设计需要一个单独的交叉表,用于每对多对多的关系。对不起,你还没有找到一个快捷方式。



(关于规范化的许多文章都表示,过去3NF的更高级正常格式是深奥的,而且从来没有必要担心4NF或5NF。让这个例子反驳这个说法。)






重新发表关于使用NULL的评论:那么你有一个强制执行唯一性的问题,因为一个 PRIMARY KEY 约束要求所有列不为空。

  personId jobId houseId restaurantId 
1234 123 87 5678
1234 NULL NULL 9876
1234 NULL NULL 13579

另外,如果我在上表中添加了第二个房子或第二个jobId,我将其放在哪一行?你可以这样做:

  personId jobId houseId restaurantId 
1234 123 87 5678
1234 NULL NULL 9876
1234 42 NULL 13579

现在如果我取消关联restaurantId 9876,我可以将其更新为空值。但是,这留下了一排NULL,我真的应该删除。

  personId jobId houseId restaurantId 
1234 123 87 5678
1234 NULL NULL NULL
1234 42 NULL 13579

而如果我已经解除了餐厅13579的关联,我可以将其更新为NULL,并将该行放在适当位置。

  personId jobId houseId restaurantId 
1234 123 87 5678
1234 NULL NULL 9876
1234 42 NULL NULL

但是,不应该整合行,将jobId移动到另一行,只要有空位在该列中?

  personId jobId houseId restaurantId 
1234 123 87 5678
1234 42 NULL 9876

麻烦的是,现在添加或删除关联越来越复杂,需要多个SQL语句进行更改。您将不得不编写大量繁琐的应用程序代码来处理这种复杂性。



但是,如果您定义一个表,所有各种更改都很容易



向餐厅添加关联只是Person_Restaurant表的 INSERT 。删除该关联只是一个 DELETE 。关于工作或房屋的协会有多少关系。并且您可以在每个这些交集表中定义主键约束以强制执行唯一性。


I am currently setting up a database which has a large number of many-to-many relations. Every relationship was modeled via a link table. Example:

A person has a number of jobs, jobs are fulfilled by a number of persons. A person has a number of houses, houses are occupied by a number of persons. A person has a number of restaurants he likes, restaurants have a number of persons who like the restaurant.

First I designed this as follows:

Tables: Person, Job, House, Restaurant, Person_Job, Person_House, Person_Restaurant.

Relationships 1 - n: Person -> Person_Job, Person -> Person_House, Person -> Person_Restaurant, Job -> Person_Job, House -> Person_House, Restaurant -> Person_Restaurant.

This leads pretty quickly to a crowded and complex ER model.

Trying to simplify this I modeled it as follows:

Tabels: Person, Job, House, Restaurant, Person_Attributes

Relationships 1 - n: Person -> Person_Attributes, Job -> Person_Attributes, House -> Person_Attributes, Restaurant -> Person_Attributes

The Person_Attributes table should look something like this: personId jobId houseId restaurantId

If a person - job relationship exists, I'll add an entry looking like:

P1, J1, NULL, NULL

If a person - house relationship exists, I'll add an entry looking like:

P1, NULL, H1, NULL

So the attributes table in the second example will have the same number of entries as the link tables of the first examples added up.

This simplyfies the ER Model a lot, and as long as I build indexes for personId + jobId, personId + houseId and personId + restaurantId, there won't be a lot of performance impact, I think.

My questions are: Is the second method a correct way of modelling this? If not, why? Am I right about performance impact? If not, why?

MySQL Workbench example of what I mean can be found here:

http://www.2shared.com/file/3GBnodEZ/example.html

解决方案

Your design violates Fourth Normal Form. You're trying to store multiple "facts" in one table, and it leads to anomalies.

The Person_Attributes table should look something like this: personId jobId houseId restaurantId

So if I associate with one job, one house, but two restaurants, do I store the following?

personId jobId houseId restaurantId
    1234    42      87         5678
    1234    42      87         9876

And if I add a third restaurant, I copy the other columns?

personId jobId houseId restaurantId
    1234   123      87         5678
    1234   123      87         9876
    1234    42      87        13579 

Done! Oh, wait, what happened there? I changed jobs at the same time as adding the new restaurant. Now I'm incorrectly associated with two jobs, but there's no way to distinguish between that and correctly being associated with two jobs.

Also, even if it is correct to be associated with two jobs, shouldn't the data look like this?

personId jobId houseId restaurantId
    1234   123      87         5678
    1234   123      87         9876
    1234   123      87        13579 
    1234    42      87         5678
    1234    42      87         9876
    1234    42      87        13579 

It starts looking like a Cartesian product of all distinct values of jobId, houseId, and restaurantId. In fact, it is -- because this table is trying to store multiple independent facts.

Correct relational design requires a separate intersection table for each many-to-many relationship. Sorry, you have not found a shortcut.

(Many articles about normalization say the higher normal forms past 3NF are esoteric, and one never has to worry about 4NF or 5NF. Let this example disprove that claim.)


Re your comment about using NULL: Then you have a problem enforcing uniqueness, because a PRIMARY KEY constraint requires that all columns be NOT NULL.

personId jobId houseId restaurantId
    1234   123      87         5678
    1234  NULL    NULL         9876
    1234  NULL    NULL        13579 

Also, if I add a second house or a second jobId to the above table, which row do I put it in? You could end up with this:

personId jobId houseId restaurantId
    1234   123      87         5678
    1234  NULL    NULL         9876
    1234    42    NULL        13579 

Now if I disassociate restaurantId 9876, I could update it to NULL. But that leaves a row of all NULLs, which I really should just delete.

personId jobId houseId restaurantId
    1234   123      87         5678
    1234  NULL    NULL         NULL
    1234    42    NULL        13579 

Whereas if I had disassociated restaurant 13579, I could update it to NULL and leave the row in place.

personId jobId houseId restaurantId
    1234   123      87         5678
    1234  NULL    NULL         9876
    1234    42    NULL         NULL 

But shouldn't I consolidate rows, moving the jobId to another row, provided there's a vacancy in that column?

personId jobId houseId restaurantId
    1234   123      87         5678
    1234    42    NULL         9876

The trouble is, now it's getting more and more complex to add or remove associations, requiring multiple SQL statements for changes. You're going to have to write a lot of tedious application code to handle this complexity.

However, all the various changes are easy if you define one table per many-to-many relationship. You do need the complexity of having that many more tables, but by doing that you will simplify your application code.

Adding an association to a restaurant is simply an INSERT to the Person_Restaurant table. Removing that association is simply a DELETE. It doesn't matter how many associations there are to jobs or houses. And you can define a primary key constraint in each of these intersection tables to enforce uniqueness.

这篇关于如何在表上建立多个m:n关系的数据库的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆