How's my data structure?
Problem description
I've put together a simple database for storing information on awards and nominations. I've tried to remove as much data redundancy as possible. Here's how it's presently looking:
The reason for the Nominated table is that I realised that one nomination would have many nominees. For example, the award Best Screenplay could go to Ken Levine and David Isaacs, or Woody Allen, or Joss Whedon, Andrew Stanton, Joel Cohen and Alec Sokolow.
Note: Award.name is the name of the award, e.g. Best Actor.
Thanks for pointing out any possible improvements.
Recommended answer
Minor notes
I prefer singular table and column names, so Nominees becomes Nominee, Awards becomes Award, etc.
Rename the Award table to AwardCategory, as @wildplasser suggested in the comments.

Major notes
As @Olivier points out, m::n relationship intermediate tables, like the Nominated one, will have a UNIQUE constraint on the compound (NomineeId, NominationId). So it's better to drop the auto-generated (surrogate) key and make the compound key the PRIMARY KEY. This is the natural key of the relation, and there are several advantages to using it as the primary key. The surrogate key serves no purpose at all in this case, except for making the rows wider and adding one more useless index. The two parts of the natural key will be used for joining anyway.
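A minimal sketch of that junction table, run through Python's sqlite3 module so it can be tried without a server. The column types, the sample rows, the id value 10 and the helper name link are assumptions made for illustration; at this point Nomination still carries its original surrogate NominationId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE Nominee    (NomineeId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Nomination (NominationId INTEGER PRIMARY KEY);

-- No surrogate key: the pair of foreign keys is the primary key.
CREATE TABLE Nominated (
    NomineeId    INTEGER NOT NULL REFERENCES Nominee (NomineeId),
    NominationId INTEGER NOT NULL REFERENCES Nomination (NominationId),
    PRIMARY KEY (NomineeId, NominationId)
);

-- Sample rows (assumed): one nomination shared by two nominees.
INSERT INTO Nominee VALUES (1, 'Ken Levine'), (2, 'David Isaacs');
INSERT INTO Nomination VALUES (10);
INSERT INTO Nominated VALUES (1, 10), (2, 10);
""")


def link(nominee_id, nomination_id):
    """Link a nominee to a nomination; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Nominated VALUES (?, ?)",
                (nominee_id, nomination_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling link(1, 10) a second time is rejected by the primary key itself, so no separate UNIQUE constraint or extra index is needed.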
The same thing applies to the Nomination table! The compound (FilmId, AwardCategoryId, EventId) will be a UNIQUE key, to ensure that no film gets 2 nominations for the same award category at the same event, so it's again better to drop the surrogate key and make this compound the primary key. On reflection, we may have 2 nominations for the same AwardCategory for the same Film, say two for 'Best Supporting Actor', so we add a NominationNo to the primary key (this can be handy later if we want to restrict the number of nominations, for a certain category or for all of them, to a constant such as 5).
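One possible way to express this in SQLite, as a sketch rather than the definitive design. The primary key follows the text; the CHECK on NominationNo and the extra UNIQUE constraint are my assumption about how the "at most 5 per category" restriction could be enforced, and foreign keys are left out to keep the example short. The helper name add_nomination is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Nomination (
    EventId         INTEGER NOT NULL,
    AwardCategoryId INTEGER NOT NULL,
    FilmId          INTEGER NOT NULL,
    -- Assumed rule: nomination slots are numbered 1..5.
    NominationNo    INTEGER NOT NULL CHECK (NominationNo BETWEEN 1 AND 5),
    PRIMARY KEY (EventId, AwardCategoryId, FilmId, NominationNo),
    -- Assumed rule: each slot is used once per category per event,
    -- which caps a category at 5 nominations.
    UNIQUE (EventId, AwardCategoryId, NominationNo)
);
""")


def add_nomination(event_id, category_id, film_id, nomination_no):
    """Insert a nomination; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Nomination VALUES (?, ?, ?, ?)",
                (event_id, category_id, film_id, nomination_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout the same film can hold slots 1 and 2 of one category, while a sixth slot, or a reused slot number, is refused by the database.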
Now, the (funny and interesting) thing is that the Nominated table has to be re-examined and given a compound (NomineeId, FilmId, AwardCategoryId, EventId) primary key, with just these 4 columns as attributes.
I'm not sure exactly what the Event and Ceremony tables are meant to store, but let's assume that the Ceremony table stores information about different ceremonies (e.g. 'Oscar Awards', 'Strawberry Awards') and the Event table stores information about one year's ceremony (e.g. ('Oscar', 2011), ('Oscar', 2012), ('Strawberry Awards', 2012)). So I'll move the Year to the Event table and make (CeremonyId, EventYear) the primary key of Event. (I could very well be wrong about this; you know your data better.)
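Under that assumption, the two tables could look like the sketch below. CeremonyTitle is the column name used in the query later in this answer; the column types, the UNIQUE on the title and the helper name add_event are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Ceremony (
    CeremonyId    INTEGER PRIMARY KEY,
    CeremonyTitle TEXT NOT NULL UNIQUE
);

-- One row per yearly edition of a ceremony; the pair is the natural key.
CREATE TABLE Event (
    CeremonyId INTEGER NOT NULL REFERENCES Ceremony (CeremonyId),
    EventYear  INTEGER NOT NULL,
    PRIMARY KEY (CeremonyId, EventYear)
);

INSERT INTO Ceremony VALUES (1, 'Oscar Awards'), (2, 'Strawberry Awards');
INSERT INTO Event VALUES (1, 2011), (1, 2012), (2, 2012);
""")


def add_event(ceremony_id, year):
    """Insert a yearly event; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO Event VALUES (?, ?)", (ceremony_id, year))
        return True
    except sqlite3.IntegrityError:
        return False


# List every event together with the title of its ceremony.
events = conn.execute("""
    SELECT ce.CeremonyTitle, ev.EventYear
    FROM Event AS ev
    JOIN Ceremony AS ce ON ce.CeremonyId = ev.CeremonyId
    ORDER BY ce.CeremonyTitle, ev.EventYear
""").fetchall()
```

A second (1, 2011) row is refused by the primary key, and an event for a ceremony that does not exist is refused by the foreign key.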
So Nomination.EventId is replaced by CeremonyId and EventYear, and the primary keys of both Nomination and Nominated get even longer! (That's one drawback of using natural keys as primary keys.) Let's see what we've got so far:
Database design: http://img594.imageshack.us/img594/9592/oscarw.png
You can easily add a NominationWinner table (with a 1:1 relationship to Nomination) to store which nomination won which category. A UNIQUE constraint on (CeremonyId, EventYear, AwardCategoryId) would enforce that there is at most one winner per category per event. The design would look like this:
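A sketch of how NominationWinner might be declared. The design images are not reproduced here, so the exact column list is an assumption inferred from the columns the query in this answer joins on; the sample rows and the helper name declare_winner are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Nomination (
    CeremonyId      INTEGER NOT NULL,
    EventYear       INTEGER NOT NULL,
    AwardCategoryId INTEGER NOT NULL,
    NominationNo    INTEGER NOT NULL,
    FilmId          INTEGER NOT NULL,
    PRIMARY KEY (CeremonyId, EventYear, AwardCategoryId, NominationNo, FilmId)
);

CREATE TABLE NominationWinner (
    CeremonyId      INTEGER NOT NULL,
    EventYear       INTEGER NOT NULL,
    AwardCategoryId INTEGER NOT NULL,
    NominationNo    INTEGER NOT NULL,
    FilmId          INTEGER NOT NULL,
    PRIMARY KEY (CeremonyId, EventYear, AwardCategoryId, NominationNo, FilmId),
    -- At most one winning nomination per category per event.
    UNIQUE (CeremonyId, EventYear, AwardCategoryId),
    -- 1:1 with Nomination: a winner must be an existing nomination.
    FOREIGN KEY (CeremonyId, EventYear, AwardCategoryId, NominationNo, FilmId)
        REFERENCES Nomination
            (CeremonyId, EventYear, AwardCategoryId, NominationNo, FilmId)
);

-- Sample rows (assumed): two nominations in the same category and year.
INSERT INTO Nomination VALUES (2, 1955, 7, 1, 100), (2, 1955, 7, 2, 101);
""")


def declare_winner(*key):
    """Record a winning nomination; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO NominationWinner VALUES (?, ?, ?, ?, ?)", key)
        return True
    except sqlite3.IntegrityError:
        return False
```

Once one nomination in a category has been declared the winner, declaring a second one for the same ceremony, year and category fails on the UNIQUE constraint.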
Database design: http://img845.imageshack.us/img845/2108/oscar3x.png
Having such complex primary keys may look clumsy, but it helps when joining tables. Imagine you want to find all winners of the 'Strawberry Awards' in the 50s and 60s, only for the 'Actress' categories, and also show which film each award was for. You don't have to join all the intermediate tables. Instead, you can retrieve the data using only the NominationWinner, Nominee, Ceremony, Film and AwardCategory tables (and only the Nominated intermediate table):
SELECT ne.Name               AS Winner
     , wi.EventYear          AS Year
     , aw.AwardCategoryTitle AS Category
     , fm.Title              AS FilmTitle
FROM
    NominationWinner AS wi
  JOIN
    Ceremony AS ce
      ON ce.CeremonyId = wi.CeremonyId
  JOIN
    AwardCategory AS aw
      ON aw.AwardCategoryId = wi.AwardCategoryId
  JOIN
    Film AS fm
      ON fm.FilmId = wi.FilmId
  JOIN
    Nominated AS nd                                  -- match on the full nomination key
      ON  nd.CeremonyId = wi.CeremonyId
      AND nd.EventYear = wi.EventYear
      AND nd.AwardCategoryId = wi.AwardCategoryId
      AND nd.NominationNo = wi.NominationNo
      AND nd.FilmId = wi.FilmId
  JOIN
    Nominee AS ne
      ON ne.NomineeId = nd.NomineeId
WHERE
      ce.CeremonyTitle = 'Strawberry Awards'
  AND wi.EventYear BETWEEN 1950 AND 1969             -- the 50s and 60s
  AND aw.AwardCategoryTitle LIKE '%Actress%' ;
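To try this query without a full database, here is a small self-contained sketch using Python's sqlite3 module. The table definitions are trimmed to the columns the query touches, the category join column is assumed to be AwardCategoryId, and every row (names, titles, ids) is made up for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Ceremony      (CeremonyId INTEGER PRIMARY KEY, CeremonyTitle TEXT NOT NULL);
CREATE TABLE AwardCategory (AwardCategoryId INTEGER PRIMARY KEY, AwardCategoryTitle TEXT NOT NULL);
CREATE TABLE Film          (FilmId INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE Nominee       (NomineeId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Nominated (
    NomineeId INTEGER NOT NULL, CeremonyId INTEGER NOT NULL,
    EventYear INTEGER NOT NULL, AwardCategoryId INTEGER NOT NULL,
    NominationNo INTEGER NOT NULL, FilmId INTEGER NOT NULL,
    PRIMARY KEY (NomineeId, CeremonyId, EventYear, AwardCategoryId, NominationNo, FilmId)
);
CREATE TABLE NominationWinner (
    CeremonyId INTEGER NOT NULL, EventYear INTEGER NOT NULL,
    AwardCategoryId INTEGER NOT NULL, NominationNo INTEGER NOT NULL,
    FilmId INTEGER NOT NULL,
    PRIMARY KEY (CeremonyId, EventYear, AwardCategoryId, NominationNo, FilmId),
    UNIQUE (CeremonyId, EventYear, AwardCategoryId)
);

-- Made-up rows, for illustration only.
INSERT INTO Ceremony      VALUES (1, 'Oscar Awards'), (2, 'Strawberry Awards');
INSERT INTO AwardCategory VALUES (1, 'Best Actress'), (2, 'Best Screenplay');
INSERT INTO Film          VALUES (100, 'Film A'), (101, 'Film B');
INSERT INTO Nominee       VALUES (1, 'Alice Example'), (2, 'Bob Example');
INSERT INTO Nominated VALUES
    (1, 2, 1955, 1, 1, 100),   -- Alice, Best Actress, 1955
    (2, 2, 1955, 2, 1, 101),   -- Bob, Best Screenplay, 1955
    (1, 2, 1975, 1, 1, 101);   -- Alice, Best Actress, 1975 (outside the range)
INSERT INTO NominationWinner VALUES
    (2, 1955, 1, 1, 100), (2, 1955, 2, 1, 101), (2, 1975, 1, 1, 101);
""")

# Same joins and filters as the query above, without the column aliases.
winners = conn.execute("""
    SELECT ne.Name, wi.EventYear, aw.AwardCategoryTitle, fm.Title
    FROM NominationWinner AS wi
    JOIN Ceremony      AS ce ON ce.CeremonyId = wi.CeremonyId
    JOIN AwardCategory AS aw ON aw.AwardCategoryId = wi.AwardCategoryId
    JOIN Film          AS fm ON fm.FilmId = wi.FilmId
    JOIN Nominated     AS nd ON  nd.CeremonyId = wi.CeremonyId
                             AND nd.EventYear = wi.EventYear
                             AND nd.AwardCategoryId = wi.AwardCategoryId
                             AND nd.NominationNo = wi.NominationNo
                             AND nd.FilmId = wi.FilmId
    JOIN Nominee       AS ne ON ne.NomineeId = nd.NomineeId
    WHERE ce.CeremonyTitle = 'Strawberry Awards'
      AND wi.EventYear BETWEEN 1950 AND 1969
      AND aw.AwardCategoryTitle LIKE '%Actress%'
""").fetchall()

print(winners)
```

Only the 1955 Best Actress winner passes all three filters; the screenplay award fails the category filter and the 1975 award falls outside the year range.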