MySQL - Should I use multi-column primary keys on every child table?


Problem description

Setup:

I was trying to understand the difference between identifying and non-identifying relationships when I found this great article on Stack Exchange: "What's the difference between identifying and non-identifying relationships?"

After I read a few of the comments, another question came to mind about a problem I have been having.


Question:

Should I use multi-column primary keys on every child table, and what are the advantages and disadvantages of doing so?

To better illustrate my question I have created an example below. I also included the comments that caused me to ask this question.


Example:

In my situation, I know the building_id and I need to get bed.data.

#1 - My current DB structure:

TABLE { FIELDS }
-----------------------------------------------------------------------
building { id, data } 
floor { id, building_id, data }
room { id, floor_id, data }
bed { id, room_id, data }

This type of table structure would require me to use a few joins to get the data I need. Not a big deal but kind of a pain since I run into this situation a lot.
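
To make the join cost concrete, here is a minimal sketch of structure #1 and the lookup described above. It is only an illustration: SQLite (through Python's standard `sqlite3` module) stands in for MySQL so that it runs anywhere, and the sample rows and the helper name `beds_in_building` are assumptions, not part of the original question.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE building (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE floor (id INTEGER PRIMARY KEY,
                        building_id INTEGER NOT NULL REFERENCES building(id),
                        data TEXT);
    CREATE TABLE room (id INTEGER PRIMARY KEY,
                       floor_id INTEGER NOT NULL REFERENCES floor(id),
                       data TEXT);
    CREATE TABLE bed (id INTEGER PRIMARY KEY,
                      room_id INTEGER NOT NULL REFERENCES room(id),
                      data TEXT);

    -- Hypothetical sample data: two buildings, one floor and one room each.
    INSERT INTO building VALUES (1, 'north wing'), (2, 'south wing');
    INSERT INTO floor VALUES (10, 1, 'ground'), (20, 2, 'ground');
    INSERT INTO room VALUES (100, 10, 'room A'), (200, 20, 'room B');
    INSERT INTO bed VALUES (1000, 100, 'bed A1'), (1001, 100, 'bed A2'),
                           (2000, 200, 'bed B1');
""")


def beds_in_building(conn, building_id):
    """Return bed.data for one building by walking bed -> room -> floor."""
    rows = conn.execute(
        """
        SELECT bed.data
        FROM bed
        JOIN room  ON room.id  = bed.room_id
        JOIN floor ON floor.id = room.floor_id
        WHERE floor.building_id = ?
        ORDER BY bed.id
        """,
        (building_id,),
    ).fetchall()
    return [data for (data,) in rows]


print(beds_in_building(conn, 1))
```

Note that the building table itself never needs to be joined, because floor.building_id already carries the building id; the lookup costs two joins.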

#2 - My interpretation of Bill Karwin's suggested DB structure (see article comments below):

TABLE { FIELDS }
-----------------------------------------------------------------------
building { id, data } 
floor { id, building_id, data }
room { id, building_id, floor_id, data }
bed { id, building_id, floor_id, room_id, data }

This table structure seems to eliminate the need for joins in my situation. So what are the disadvantages to this table structure? I really like the idea of not doing so many join statements.
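
For comparison, this is roughly what the same lookup looks like against structure #2. Again a sketch only: SQLite stands in for MySQL, only the bed table is created, and the index `bed_by_building`, the sample rows, and the helper name are assumptions added for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only the bed table is needed here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bed (id INTEGER PRIMARY KEY,
                      building_id INTEGER NOT NULL,
                      floor_id INTEGER NOT NULL,
                      room_id INTEGER NOT NULL,
                      data TEXT);

    -- Hypothetical extra index so the lookup does not scan the whole table.
    CREATE INDEX bed_by_building ON bed (building_id);

    -- Hypothetical sample data; the ancestor ids are repeated on every row.
    INSERT INTO bed VALUES (1000, 1, 10, 100, 'bed A1'),
                           (1001, 1, 10, 100, 'bed A2'),
                           (2000, 2, 20, 200, 'bed B1');
""")


def beds_in_building_denormalized(conn, building_id):
    """Return bed.data for one building straight from the bed table, no joins."""
    rows = conn.execute(
        "SELECT data FROM bed WHERE building_id = ? ORDER BY id",
        (building_id,),
    ).fetchall()
    return [data for (data,) in rows]


print(beds_in_building_denormalized(conn, 1))
```

The query is shorter, but every bed row now repeats the building and floor of its room, and nothing in this table alone guarantees that those copies agree with the room table.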


Comments From Article:

What's the difference between identifying and non-identifying relationships?

@hobodave: It's the "convention over configuration" argument. Some schools of thought are that every table should define its primary key for a single-column pseudokey named id that auto-generates its values. Application frameworks like Rails have popularized this as a default. They treat natural keys and multi-column keys as divergent from their conventions, needed when using "legacy" databases. Many other frameworks have followed this lead. – Bill Karwin Mar 10 '10 at 23:06

It seems like "properly" constructing identifying relationships would lead to obnoxiously huge primary keys. e.g. Building has Floor has Room has Bed. The PK for Bed would be (bed_id, floor_id, room_id, building_id). It seems strange that I've never seen this in practice, nor heard it suggested as a way to do anything. That's a lot of redundant data in the PK. – hobodave Mar 10 '10 at 23:34

@hobodave: I have seen multi-column primary keys that are even larger. But I take your point. Consider that multi-column primary keys convey more information; you can query the Beds table for all beds in a specific building without doing any joins. – Bill Karwin Mar 11 '10 at 1:00

Solution

Model A: this data is normalized

TABLE { FIELDS }
-----------------------------------------------------------------------
building { id, data }
floor { id, building_id, data }
room { id, floor_id, data }
bed { id, room_id, data }

Model B: this data is not (bad idea)

TABLE { FIELDS }
-----------------------------------------------------------------------
building { id, data }
floor { id, building_id, data }
room { id, building_id, floor_id, data }
bed { id, building_id, floor_id, room_id, data }

  1. In the first (good) model you do not have unneeded duplicated data.
  2. Inserts in the first model will be much faster.
  3. The tables of the first model will fit more easily in memory, speeding up your queries.
  4. InnoDB is optimized with the first model (A) in mind, not with the second (B).
  5. The latter (bad) model has duplicated data; if that gets out of sync, you will have a mess. Model A is much harder to get out of sync, because the data is only listed once.
  6. If I want to combine data from the building, floor, room and bed, I will need to join all four tables in model A as well as in model B, so how are you saving time here?
  7. InnoDB stores each secondary index in its own B-tree, separate from the table rows. If a query selects only indexed columns, the table rows themselves will never be accessed. So why are you duplicating the columns? With a suitable index, MySQL will never need to read the main table anyway.
  8. InnoDB stores the PK in each and every secondary index. With a composite and thus long PK, you are slowing down every select that uses an index and ballooning the file size, for no gain whatsoever.
  9. Do you have a serious speed problem? If not, why are you denormalizing your tables?
  10. Don't even think about using MyISAM. It suffers less from these issues, but it is not optimized for multi-join databases, does not support referential integrity or transactions, and is a poor match for this workload.
  11. When using a composite key you can only ever use a leftmost prefix of the key, i.e. you cannot use floor_id in table bed other than by using id+building_id+floor_id. This means that you may have to use much more key space than needed in model A. Either that or you need to add an extra index (which will drag around a full copy of the PK).
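
Point 5 can be sketched as follows. This is an illustration under stated assumptions, not the schema from the question: SQLite stands in for MySQL, every copied id gets its own single-column foreign key, and the sample rows and the helper name `beds_with_conflicting_building` are hypothetical. Each foreign key is satisfied on its own, so the database accepts a bed row whose building_id contradicts the building that its room actually belongs to.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; foreign keys are enforced explicitly.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE building (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE floor (id INTEGER PRIMARY KEY,
                        building_id INTEGER NOT NULL REFERENCES building(id),
                        data TEXT);
    CREATE TABLE room (id INTEGER PRIMARY KEY,
                       building_id INTEGER NOT NULL REFERENCES building(id),
                       floor_id INTEGER NOT NULL REFERENCES floor(id),
                       data TEXT);
    CREATE TABLE bed (id INTEGER PRIMARY KEY,
                      building_id INTEGER NOT NULL REFERENCES building(id),
                      floor_id INTEGER NOT NULL REFERENCES floor(id),
                      room_id INTEGER NOT NULL REFERENCES room(id),
                      data TEXT);

    INSERT INTO building VALUES (1, 'north wing'), (2, 'south wing');
    INSERT INTO floor VALUES (10, 1, 'ground'), (20, 2, 'ground');
    INSERT INTO room VALUES (100, 1, 10, 'room A'), (200, 2, 20, 'room B');

    -- Room 100 is on floor 10 of building 1, but this row claims building 2.
    -- Building 2, floor 10 and room 100 all exist, so no constraint fires.
    INSERT INTO bed VALUES (1000, 2, 10, 100, 'bed A1');
""")


def beds_with_conflicting_building(conn):
    """Return (bed id, building on the bed row, building via room and floor)."""
    return conn.execute(
        """
        SELECT bed.id, bed.building_id, floor.building_id
        FROM bed
        JOIN room  ON room.id  = bed.room_id
        JOIN floor ON floor.id = room.floor_id
        WHERE bed.building_id <> floor.building_id
        ORDER BY bed.id
        """
    ).fetchall()


print(beds_with_conflicting_building(conn))
```

In model A this kind of contradiction cannot be stored at all, because a bed only records its room_id and the building is always derived through the joins.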

In short
I see absolutely zero benefit and a whole lot of drawbacks in Model B, never use it!
