EF 4.0 Guid or Int as a Primary Key


Question



I am implementing a custom ASPNetMembership using EF 4.0.

Is there any reason why I should use a Guid as the primary key in the User tables?

As far as I know, an Int as a PK on SQL Server performs better than a string.

An Int is also easier to iterate over. And for security purposes, if I need to pass an int id somewhere, e.g. in a URL, I can encrypt it somehow and pass it as a string with no problems.

But if I want to use an auto-generated Guid on the SQL Server side with EF 4.0, I need to use this trick: http://leedumond.com/blog/using-a-guid-as-an-entitykey-in-entity-framework-4/

I can't see any case where I should use a Guid as the PK. Maybe only one: if the system is going to have millions and millions of users. But even then, theoretically, a Guid could be duplicated at some point, couldn't it?

Anyway, the maximum Int32 value is 2,147,483,647, which is plenty even for a very, very big system. If that is still not enough, I can go with Int64, in which case I can have 9,223,372,036,854,775,807 rows. Plenty, huh?
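To put rough numbers on the two claims above (the integer ranges and the "a Guid could be duplicated" worry), here is a minimal Python sketch. It assumes the GUIDs are random version-4 values with 122 random bits, which is an assumption of this example and not something the question states, and it uses the standard birthday-bound approximation. The function name is made up for illustration.

```python
import math

INT32_MAX = 2**31 - 1   # largest value of a signed 32-bit integer
INT64_MAX = 2**63 - 1   # largest value of a signed 64-bit integer

# Assumption: random (version-4) GUIDs, which carry 122 random bits.
RANDOM_BITS = 122


def guid_collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / (2 * 2**bits))."""
    exponent = -n * (n - 1) / (2 * 2**bits)
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(exponent)


if __name__ == "__main__":
    print(f"Int32 max: {INT32_MAX:,}")
    print(f"Int64 max: {INT64_MAX:,}")
    for users in (10**6, 10**9):
        p = guid_collision_probability(users)
        print(f"{users:,} random GUIDs -> collision probability ~ {p:.3e}")
```

Under these assumptions a duplicate is theoretically possible but far too unlikely to matter at the scale of a user table, so uniqueness alone does not decide between Int and Guid.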

On the other hand, Microsoft uses Guids as the PK in their ASPNetMembership implementation: [aspnetdb].[aspnet_Users] -> PK UserId, type uniqueidentifier. There must be some reason or explanation for why they did it.

Maybe someone has ideas or experience with this?

Solution

I agree with you 100%: using an INT IDENTITY is much better!

GUIDs may seem to be a natural choice for your primary key, and if you really must, you could probably argue for using one as the PRIMARY KEY of the table. What I'd strongly recommend against is using the GUID column as the clustering key, which SQL Server does by default unless you specifically tell it not to.

You really need to keep two issues apart:

1) The primary key is a logical construct: one of the candidate keys that uniquely and reliably identifies every row in your table. This can be anything, really (an INT, a GUID, a string), so pick what makes the most sense for your scenario.

2) The clustering key (the column or columns that define the clustered index on the table) is a physical, storage-related thing. Here a small, stable, ever-increasing data type is your best pick, with INT or BIGINT as your default option.

By default, the primary key on a SQL Server table is also used as the clustering key, but it doesn't need to be that way! I've personally seen massive performance gains from breaking up a GUID-based primary / clustered key into two separate keys: the primary (logical) key on the GUID, and the clustering (ordering) key on a separate INT IDENTITY(1,1) column.

As Kimberly Tripp, the Queen of Indexing, and others have stated a great many times, a GUID as the clustering key isn't optimal: because of its randomness, it leads to massive page and index fragmentation and to generally bad performance.
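The fragmentation effect can be illustrated with a toy model. The Python sketch below is not how SQL Server stores data; it is a simplified simulation under stated assumptions: leaf pages hold a fixed, hypothetical number of keys, a full page that receives a key in the middle of the index is split in half, and a key that lands past the end of the index simply starts a new page. Random 128-bit integers stand in for random GUIDs, and an increasing counter stands in for INT IDENTITY.

```python
import bisect
import random

PAGE_CAPACITY = 100  # hypothetical number of keys that fit on one leaf page


def simulate_inserts(keys, capacity=PAGE_CAPACITY):
    """Insert unique keys into sorted leaf pages.

    Returns (page_splits, average_fill), where average_fill is the
    fraction of page slots in use after all inserts.
    """
    pages = [[]]   # each page is a sorted list of keys
    lows = []      # lows[i] is the lower boundary key of pages[i + 1]
    splits = 0
    for key in keys:
        idx = bisect.bisect_right(lows, key)  # page whose range covers key
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            # Insert at the very end of the index: start a fresh page,
            # no existing rows have to move.
            pages.append([key])
            lows.append(key)
            continue
        # Insert into a full page in the middle: split it in half.
        mid = len(page) // 2
        right = page[mid:]
        del page[mid:]
        pages.insert(idx + 1, right)
        lows.insert(idx, right[0])
        splits += 1
        bisect.insort(page if key < right[0] else right, key)
    fill = sum(len(p) for p in pages) / (len(pages) * capacity)
    return splits, fill


if __name__ == "__main__":
    n = 20_000
    rng = random.Random(42)  # fixed seed so runs are repeatable
    random_keys = [rng.getrandbits(128) for _ in range(n)]  # GUID stand-ins
    sequential_keys = range(n)                              # IDENTITY stand-in

    for label, keys in (("random", random_keys), ("sequential", sequential_keys)):
        splits, fill = simulate_inserts(keys)
        print(f"{label:>10}: {splits} page splits, average fill {fill:.0%}")
```

In this model the ever-increasing key never splits a page and leaves the pages full, while the random key splits pages constantly and leaves them partly empty, which is the pattern described above.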

Yes, I know, there's newsequentialid() in SQL Server 2005 and up, but even that is not truly and fully sequential, so it suffers from the same problems as a random GUID, just a bit less prominently.

Then there's another issue to consider: the clustering key on a table is added to each and every entry in each and every non-clustered index on that table as well, so you really want to make sure it's as small as possible. Typically, an INT with its 2+ billion values should be sufficient for the vast majority of tables, and compared to a GUID as the clustering key, it can save you hundreds of megabytes of storage on disk and in server memory.

Quick calculation, using INT vs. GUID as primary and clustering key, and counting only the bytes of the key itself (4 for an INT, 16 for a GUID):

• Base table with 1,000,000 rows (3.8 MB vs. 15.26 MB)
• 6 nonclustered indexes (22.89 MB vs. 91.55 MB)

TOTAL: about 26.7 MB vs. 106.8 MB, and that's just on a single table!
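The arithmetic behind those figures can be reproduced with a short Python sketch. It counts only the key bytes (4 for INT, 16 for UNIQUEIDENTIFIER) and ignores row and page overhead, so the results are a lower bound for illustration, not a prediction of actual index sizes. The function name and the 1 MB = 1024 * 1024 bytes convention are choices of this example.

```python
INT_BYTES = 4      # size of a SQL Server INT
GUID_BYTES = 16    # size of a SQL Server UNIQUEIDENTIFIER
MIB = 1024 * 1024  # bytes per megabyte, as used in the figures above


def key_storage_mb(rows, key_bytes, nonclustered_indexes):
    """Return (base_table_mb, indexes_mb, total_mb) for the key bytes alone.

    The clustering key is stored once per row in the base table and once
    per row in every nonclustered index.
    """
    base = rows * key_bytes / MIB
    indexes = base * nonclustered_indexes
    return base, indexes, base + indexes


if __name__ == "__main__":
    for label, size in (("INT", INT_BYTES), ("GUID", GUID_BYTES)):
        base, indexes, total = key_storage_mb(1_000_000, size, 6)
        print(f"{label:>4}: base {base:.2f} MB, indexes {indexes:.2f} MB, "
              f"total {total:.2f} MB")
```

Changing the row count or the number of nonclustered indexes shows how quickly the gap grows on wider, more heavily indexed tables.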

Some more food for thought: there is excellent material on this by Kimberly Tripp. Read it, read it again, digest it! It's the SQL Server indexing gospel, really.
