Guid Primary/Foreign Key dilemma in SQL Server


Problem description


I am facing the dilemma of changing my primary keys from int identities to Guid. I'll put my problem straight up. It's a typical retail management app with POS and back-office functionality. It has about 100 tables. The database synchronizes with other databases and receives/sends new data.

Most tables don't have frequent inserts, updates or select statements executing on them. However, some do have frequent inserts and selects, e.g. the products and orders tables.

Some tables have up to 4 foreign keys in them. If I changed my primary keys from 'int' to 'Guid', would there be a performance issue when inserting or querying data from tables that have many foreign keys? I know people have said that indexes will be fragmented and that 16 bytes is an issue.

Space wouldn't be an issue in my case, and apparently index fragmentation can also be taken care of using the 'NEWSEQUENTIALID()' function. Can someone tell me, from their experience, whether Guid will be problematic in tables with many foreign keys?

I'll be much appreciative of your thoughts on it...

Solution

GUIDs may seem to be a natural choice for your primary key - and if you really must, you could probably argue for using one as the PRIMARY KEY of the table. What I'd strongly recommend against is using the GUID column as the clustering key, which SQL Server does by default unless you specifically tell it not to.

You really need to keep two issues apart:

1) The primary key is a logical construct - one of the candidate keys that uniquely and reliably identifies every row in your table. This can be anything, really - an INT, a GUID, a string - pick what makes most sense for your scenario.

2) The clustering key (the column or columns that define the "clustered index" on the table) is a physical, storage-related thing, and here a small, stable, ever-increasing data type is your best pick - INT or BIGINT as your default option.

By default, the primary key on a SQL Server table is also used as the clustering key - but it doesn't need to be that way! I've personally seen massive performance gains when breaking up a previous GUID-based primary/clustered key into two separate keys - the primary (logical) key on the GUID, and the clustering (ordering) key on a separate INT IDENTITY(1,1) column.
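As a rough illustration of that split, here is a minimal T-SQL sketch. The table, column, and constraint names (`dbo.Orders`, `dbo.OrderLines`, `OrderGuid`, and so on) are hypothetical and not taken from the question; the point is only that `PRIMARY KEY NONCLUSTERED` keeps the GUID as the logical key while a separate unique clustered index on an `INT IDENTITY` column decides the physical order.

```sql
-- Hypothetical schema for illustration; all names are assumptions.
CREATE TABLE dbo.Orders
(
    OrderId   INT IDENTITY(1,1) NOT NULL,          -- clustering (ordering) key
    OrderGuid UNIQUEIDENTIFIER  NOT NULL
        CONSTRAINT DF_Orders_OrderGuid DEFAULT NEWSEQUENTIALID(),
    OrderDate DATETIME          NOT NULL,
    -- Logical primary key on the GUID, explicitly NOT clustered.
    CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED (OrderGuid)
);

-- Physical ordering by the small, ever-increasing INT column.
CREATE UNIQUE CLUSTERED INDEX CIX_Orders_OrderId
    ON dbo.Orders (OrderId);

-- A child table can still reference the GUID primary key.
CREATE TABLE dbo.OrderLines
(
    OrderLineId INT IDENTITY(1,1) NOT NULL,
    OrderGuid   UNIQUEIDENTIFIER  NOT NULL
        CONSTRAINT FK_OrderLines_Orders
        REFERENCES dbo.Orders (OrderGuid),
    Quantity    INT               NOT NULL,
    CONSTRAINT PK_OrderLines PRIMARY KEY CLUSTERED (OrderLineId)
);
```

With this layout, other databases can keep exchanging rows by `OrderGuid`, while every nonclustered index on `dbo.Orders` carries the 4-byte `OrderId` rather than the 16-byte GUID.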

As Kimberly Tripp - the Queen of Indexing - and others have stated a great many times, a GUID as the clustering key isn't optimal: because of its randomness, it leads to massive page and index fragmentation and to generally bad performance.
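If you want to see how fragmented your existing indexes actually are, one option is the `sys.dm_db_index_physical_stats` dynamic management function. The query below is only a sketch: the 100-page cutoff is an arbitrary assumption to skip tiny indexes, and `'LIMITED'` is chosen because it is the cheapest scan mode.

```sql
-- Lists indexes in the current database, most fragmented first.
SELECT
    OBJECT_NAME(ips.object_id)        AS table_name,
    i.name                            AS index_name,
    ips.avg_fragmentation_in_percent,
    ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON  i.object_id = ips.object_id
    AND i.index_id  = ips.index_id
WHERE ips.index_id > 0          -- skip heaps
  AND ips.page_count > 100      -- assumed cutoff: ignore very small indexes
ORDER BY ips.avg_fragmentation_in_percent DESC;
```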

Yes, I know - there's newsequentialid() in SQL Server 2005 and up - but even that is not truly and fully sequential, and thus it suffers from the same problems as a random GUID, just a bit less prominently so.

Then there's another issue to consider: the clustering key on a table is added to each and every entry in each and every non-clustered index on that table as well, so you really want to make sure it's as small as possible. Typically an INT, with room for more than 2 billion rows, should be sufficient for the vast majority of tables - and compared to a GUID as the clustering key, you can save yourself hundreds of megabytes of storage on disk and in server memory.

Quick calculation - using INT vs. GUID as primary and clustering key, counting only the bytes taken by the key itself (4 vs. 16 bytes per row):

• Base table with 1,000,000 rows (3.8 MB vs. 15.26 MB)
• 6 nonclustered indexes (22.89 MB vs. 91.55 MB)

TOTAL: about 26.7 MB vs. 106.8 MB - and that's just on a single table!
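The arithmetic behind those figures can be reproduced with a short T-SQL query. This is a sketch under the same simplifying assumption as the list above: it counts only the key bytes (4 for INT, 16 for UNIQUEIDENTIFIER) and ignores row and page overhead. The variable and alias names are made up for the example.

```sql
DECLARE @rows       BIGINT = 1000000;  -- rows in the table
DECLARE @nc_indexes INT    = 6;        -- nonclustered indexes on the table

SELECT
    k.key_type,
    -- key bytes stored once per row in the clustered index (base table)
    CAST(@rows * k.key_bytes / 1048576.0 AS DECIMAL(10, 2))
        AS base_table_mb,
    -- the clustering key is repeated in every nonclustered index entry
    CAST(@rows * k.key_bytes * @nc_indexes / 1048576.0 AS DECIMAL(10, 2))
        AS nonclustered_mb,
    CAST(@rows * k.key_bytes * (1 + @nc_indexes) / 1048576.0 AS DECIMAL(10, 2))
        AS total_mb
FROM (VALUES ('INT', 4), ('UNIQUEIDENTIFIER', 16)) AS k (key_type, key_bytes);
```

Summing the two components this way gives roughly 26.7 MB for INT and 106.8 MB for GUID, so the GUID costs four times as much for the key alone.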

Some more food for thought: Kimberly Tripp has written excellent material on this - read it, read it again, digest it! It's the SQL Server indexing gospel, really.

So if you really must change your primary keys to GUIDs, try to make sure the primary key isn't the clustering key, and that you still have an INT IDENTITY field on the table that is used as the clustering key. Otherwise, your performance is sure to take a severe hit.
