Surrogate Key vs Natural Key for EF
Question
My co-worker and I are trying to decide which is a better way to design the schema and keys for two database tables. One is a lookup-table that rarely changes. It has about 700 rows. The other table references the lookup-table. This table will have many thousand rows over time. In Design B, the lookup table has a primary key consisting of 3 varchars. The other table has a primary key consisting of the same 3 varchars with the addition of two date fields. In Design A, the 3 varchars are replaced with a surrogate key. The 3 varchars have a unique constraint (UC) on them.
Which is the better design? My co-worker says that if we have a surrogate key, the joins between the tables will be very slow when we need to display data to the users. He also argues that a key that exists only to make the row unique is wasteful. My argument is that joins are fast, and that storing the 3 varchars in both tables is the wasteful option because it duplicates the data.
We are using this in a WPF desktop application with EF 5, on SQL Server 2008. Surrogate key or natural key? The attached image shows the two different designs.
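Since the image is not reproduced here, the two designs can be sketched roughly as follows. This is only an approximation under stated assumptions: SQLite (through Python's standard library) stands in for SQL Server, the lookup table name IndexLookup and the surrogate column IndexLookupId are hypothetical, the varchar(30) lengths are a guess, and the primary key chosen for PriceIndex in Design A is an assumption. The other column names are the ones mentioned in the answer.

```python
import sqlite3

# Design A: surrogate key on the lookup table, unique constraint on the 3 varchars.
# IndexLookup and IndexLookupId are hypothetical names; varchar(30) is assumed.
DESIGN_A = """
CREATE TABLE IndexLookup (
    IndexLookupId INTEGER PRIMARY KEY,           -- surrogate key
    IndexCode     VARCHAR(30) NOT NULL,
    HubCode       VARCHAR(30) NOT NULL,
    TimeFrame     VARCHAR(30) NOT NULL,
    UNIQUE (IndexCode, HubCode, TimeFrame)       -- the UC from the question
);
CREATE TABLE PriceIndex (
    IndexLookupId INTEGER NOT NULL REFERENCES IndexLookup (IndexLookupId),
    StartDate     DATE NOT NULL,
    EndDate       DATE NOT NULL,
    Price         NUMERIC NOT NULL,
    PRIMARY KEY (IndexLookupId, StartDate, EndDate)  -- assumed PK
);
"""

# Design B: the 3 varchars are the primary key and are repeated in PriceIndex.
DESIGN_B = """
CREATE TABLE IndexLookup (
    IndexCode VARCHAR(30) NOT NULL,
    HubCode   VARCHAR(30) NOT NULL,
    TimeFrame VARCHAR(30) NOT NULL,
    PRIMARY KEY (IndexCode, HubCode, TimeFrame)
);
CREATE TABLE PriceIndex (
    IndexCode VARCHAR(30) NOT NULL,
    HubCode   VARCHAR(30) NOT NULL,
    TimeFrame VARCHAR(30) NOT NULL,
    StartDate DATE NOT NULL,
    EndDate   DATE NOT NULL,
    Price     NUMERIC NOT NULL,
    PRIMARY KEY (IndexCode, HubCode, TimeFrame, StartDate, EndDate),
    FOREIGN KEY (IndexCode, HubCode, TimeFrame)
        REFERENCES IndexLookup (IndexCode, HubCode, TimeFrame)
);
"""


def build(ddl: str) -> sqlite3.Connection:
    """Create an in-memory database holding one of the two designs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ddl)
    return conn


def table_names(conn: sqlite3.Connection) -> list:
    """Return the user table names in alphabetical order."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in conn.execute(query)]


conn_a = build(DESIGN_A)
conn_b = build(DESIGN_B)
```

Both designs contain the same two tables; the difference is only in which columns carry the relationship between them.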
With only a few thousand rows in the tables, I don't think you will notice any difference. And even if one of the tables grows to millions of rows, the other will still have, as you say, only about 700. SQL Server is designed to do joins efficiently, so your co-worker is not correct when he claims that a join to a rather small (700-row) table will hurt performance.
One respect in which design A is better than B is that the bigger table (PriceIndex) will be narrower, and so will the indexes used for the join. A key of 4 bytes instead of 90 can benefit performance a lot. Every other composite index you may need that includes the surrogate key will also be narrower in design A than in B.
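To put rough numbers on the "4 bytes instead of 90" point, here is a back-of-the-envelope calculation. It is only a sketch: the row count is hypothetical, and it ignores the date columns, per-row overhead, and the fact that varchar columns store their actual length rather than the declared maximum.

```python
def key_bytes(rows: int, bytes_per_key: int) -> int:
    """Bytes spent on the lookup-key part of every row (or index entry)."""
    return rows * bytes_per_key


ROWS = 1_000_000          # hypothetical size of PriceIndex
SURROGATE_KEY_BYTES = 4   # a 4-byte integer surrogate key
NATURAL_KEY_BYTES = 90    # upper bound for the three varchars, per the figure above

surrogate_total = key_bytes(ROWS, SURROGATE_KEY_BYTES)
natural_total = key_bytes(ROWS, NATURAL_KEY_BYTES)
ratio = natural_total / surrogate_total

print(f"surrogate key: {surrogate_total:,} bytes")
print(f"natural key:   {natural_total:,} bytes (up to {ratio}x wider)")
```

The same factor applies again to every additional index that contains the key columns.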
A situation where design B will be more efficient than A is queries that GROUP BY columns from both tables. If, for example, you have a query with GROUP BY Price, HubCode, then in design B you can add a composite index on these 2 columns, while in design A the columns will be in separate tables and you can't have an index with columns from 2 tables.
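The GROUP BY case can be illustrated with a small sketch. The assumptions are the same as before: SQLite stands in for SQL Server, and the names PriceIndexA, PriceIndexB, LookupA, LookupId and the index name are hypothetical (the two designs are given different table names only so they can live in one database). The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Design B: Price and HubCode sit in the same table, so one index covers both.
CREATE TABLE PriceIndexB (
    IndexCode TEXT, HubCode TEXT, TimeFrame TEXT,
    StartDate TEXT, EndDate TEXT, Price NUMERIC,
    PRIMARY KEY (IndexCode, HubCode, TimeFrame, StartDate, EndDate)
);
CREATE INDEX IX_PriceIndexB_Price_HubCode ON PriceIndexB (Price, HubCode);

-- Design A: HubCode lives in the lookup table, Price in PriceIndex.
CREATE TABLE LookupA (
    LookupId INTEGER PRIMARY KEY,
    IndexCode TEXT, HubCode TEXT, TimeFrame TEXT,
    UNIQUE (IndexCode, HubCode, TimeFrame)
);
CREATE TABLE PriceIndexA (
    LookupId INTEGER REFERENCES LookupA (LookupId),
    StartDate TEXT, EndDate TEXT, Price NUMERIC,
    PRIMARY KEY (LookupId, StartDate, EndDate)
);
""")

# Made-up sample data, loaded into both designs.
rows = [
    ("IDX1", "HUB1", "Daily", "2013-01-01", "2013-01-02", 10),
    ("IDX1", "HUB1", "Daily", "2013-01-02", "2013-01-03", 10),
    ("IDX2", "HUB2", "Daily", "2013-01-01", "2013-01-02", 12),
]
conn.executemany("INSERT INTO PriceIndexB VALUES (?, ?, ?, ?, ?, ?)", rows)
for index_code, hub_code, time_frame, start, end, price in rows:
    natural_key = (index_code, hub_code, time_frame)
    conn.execute(
        "INSERT OR IGNORE INTO LookupA (IndexCode, HubCode, TimeFrame) VALUES (?, ?, ?)",
        natural_key,
    )
    (lookup_id,) = conn.execute(
        "SELECT LookupId FROM LookupA WHERE IndexCode = ? AND HubCode = ? AND TimeFrame = ?",
        natural_key,
    ).fetchone()
    conn.execute("INSERT INTO PriceIndexA VALUES (?, ?, ?, ?)", (lookup_id, start, end, price))

# Design B: single table, both grouping columns available to one composite index.
group_b = conn.execute(
    "SELECT Price, HubCode, COUNT(*) FROM PriceIndexB "
    "GROUP BY Price, HubCode ORDER BY Price, HubCode"
).fetchall()

# Design A: the grouping columns come from two tables, so a join is required.
group_a = conn.execute(
    "SELECT p.Price, l.HubCode, COUNT(*) FROM PriceIndexA AS p "
    "JOIN LookupA AS l ON l.LookupId = p.LookupId "
    "GROUP BY p.Price, l.HubCode ORDER BY p.Price, l.HubCode"
).fetchall()
```

Both queries return the same groups; the point is only that the design B query touches one table and can be served by a single composite index, whereas the design A query has to join first.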
Another aspect is whether there are other tables with these columns as primary keys, say if you have another table with (HubCode) as the PK, another with (HubCode, TimeFrame), another with (IndexCode, HubCode), and maybe another with (IndexCode, HubCode, TimeFrame, StartDate, EndDate, CustomerID). With design B (all tables having natural keys), several complex queries involving joins across multiple tables can be more efficient, as some intermediate joins can be eliminated. With design A (surrogate keys), intermediate joins cannot be skipped, and the lookup costs can grow quite large when the (intermediate) tables are large.
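A minimal sketch of the skipped intermediate join, again using SQLite in place of SQL Server. Everything beyond the column names already mentioned is hypothetical: the Hub table, its HubName column, the HubId and LookupId surrogate columns, and the A/B table-name suffixes used to keep both variants in one database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Natural keys everywhere (design B style): HubCode is present in PriceIndexB.
CREATE TABLE HubB (HubCode TEXT PRIMARY KEY, HubName TEXT);
CREATE TABLE LookupB (
    IndexCode TEXT, HubCode TEXT REFERENCES HubB (HubCode), TimeFrame TEXT,
    PRIMARY KEY (IndexCode, HubCode, TimeFrame)
);
CREATE TABLE PriceIndexB (
    IndexCode TEXT, HubCode TEXT, TimeFrame TEXT,
    StartDate TEXT, EndDate TEXT, Price NUMERIC,
    PRIMARY KEY (IndexCode, HubCode, TimeFrame, StartDate, EndDate),
    FOREIGN KEY (IndexCode, HubCode, TimeFrame)
        REFERENCES LookupB (IndexCode, HubCode, TimeFrame)
);

-- Surrogate keys everywhere (design A style): HubId is reachable only via LookupA.
CREATE TABLE HubA (HubId INTEGER PRIMARY KEY, HubCode TEXT UNIQUE, HubName TEXT);
CREATE TABLE LookupA (
    LookupId INTEGER PRIMARY KEY,
    IndexCode TEXT, HubId INTEGER REFERENCES HubA (HubId), TimeFrame TEXT,
    UNIQUE (IndexCode, HubId, TimeFrame)
);
CREATE TABLE PriceIndexA (
    LookupId INTEGER REFERENCES LookupA (LookupId),
    StartDate TEXT, EndDate TEXT, Price NUMERIC,
    PRIMARY KEY (LookupId, StartDate, EndDate)
);

-- Made-up sample data, identical in both variants.
INSERT INTO HubB VALUES ('HUB1', 'First hub');
INSERT INTO LookupB VALUES ('IDX1', 'HUB1', 'Daily');
INSERT INTO PriceIndexB VALUES ('IDX1', 'HUB1', 'Daily', '2013-01-01', '2013-01-02', 10);

INSERT INTO HubA VALUES (1, 'HUB1', 'First hub');
INSERT INTO LookupA VALUES (1, 'IDX1', 1, 'Daily');
INSERT INTO PriceIndexA VALUES (1, '2013-01-01', '2013-01-02', 10);
""")

# Natural keys: PriceIndexB joins straight to HubB, skipping the lookup table.
two_table = conn.execute(
    "SELECT h.HubName, p.Price FROM PriceIndexB AS p "
    "JOIN HubB AS h ON h.HubCode = p.HubCode"
).fetchall()

# Surrogate keys: the lookup table must be visited to get from price to hub.
three_table = conn.execute(
    "SELECT h.HubName, p.Price FROM PriceIndexA AS p "
    "JOIN LookupA AS l ON l.LookupId = p.LookupId "
    "JOIN HubA AS h ON h.HubId = l.HubId"
).fetchall()
```

The results are identical; the natural-key variant simply reaches them with one join fewer, which is the saving described above when the intermediate table is large.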
In the end, nothing matters more than testing with your own data, at the sizes you expect your tables to grow to, and with the types of queries you expect to run.