没有主键的表 [英] Tables with no Primary Key

查看:112
本文介绍了没有主键的表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有几个表,其唯一的唯一数据是uniqueidentifier(Guid)列。因为guid是非顺序的(并且它们是客户端生成的所以我不能使用newsequentialid()),所以我在这个ID字段上创建了一个非主要的非聚集索引,而不是给表一个聚簇的主要关键。

I have several tables whose only unique data is a uniqueidentifier (a Guid) column. Because guids are non-sequential (and they're client-side generated so I can't use newsequentialid()), I have made a non-primary, non-clustered index on this ID field rather than giving the tables a clustered primary key.

我想知道这种方法的性能影响。我看到有人建议表应该有一个自动递增(identity)int作为集群主键,即使它没有任何意义,因为这意味着数据库引擎本身可以快速使用该值查找一行而不必使用书签。

I'm wondering what the performance implications are for this approach. I've seen some people suggest that tables should have an auto-incrementing ("identity") int as a clustered primary key even if it doesn't have any meaning, as it means that the database engine itself can use that value to quickly look up a row instead of having to use a bookmark.

我的数据库在一堆服务器上进行合并复制,所以我避开了标识int列他们在复制方面有点毛茸茸。

My database is merge-replicated across a bunch of servers, so I've shied away from identity int columns as they're a bit hairy to get right in replication.

你有什么想法?表应该有主键吗?或者,如果没有合理的列来索引那么没有任何聚簇索引可以吗?

What are your thoughts? Should tables have primary keys? Or is it ok to not have any clustered indexes if there are no sensible columns to index that way?

推荐答案

处理索引时,你必须确定你的桌子将用于什么。如果您主要是每秒插入1000行而不进行任何查询,那么聚簇索引会影响性能。如果您每秒进行1000次查询,那么没有索引将导致非常糟糕的性能。尝试调优查询/索引时,最好的办法是在SQL Server中使用查询计划分析器和SQL事件探查器。这将显示您遇到昂贵的表扫描或其他性能阻止程序的位置。

When dealing with indexes, you have to determine what your table is going to be used for. If you are primarily inserting 1000 rows a second and not doing any querying, then a clustered index is a hit to performance. If you are doing 1000 queries a second, then not having an index will lead to very bad performance. The best thing to do when trying to tune queries/indexes is to use the Query Plan Analyzer and SQL Profiler in SQL Server. This will show you where you are running into costly table scans or other performance blockers.

至于GUID vs ID参数,您可以在网上找到两者都发誓的人。我一直被教导使用GUID,除非我有充分的理由不这样做。 Jeff发表了一篇很好的文章,讨论了使用GUID的原因: http://www.codinghorror .com / blog / archives / 000817.html

As for the GUID vs ID argument, you can find people online that swear by both. I have always been taught to use GUIDs unless I have a really good reason not to. Jeff has a good post that talks about the reasons for using GUIDs: http://www.codinghorror.com/blog/archives/000817.html.

与大多数与开发相关的事情一样,如果您希望提高性能,那么就没有一个,单个正确答案。这实际上取决于您要完成的工作以及如何实施解决方案。唯一真正的答案是再次针对性能指标进行测试,测试和测试,以确保您达到目标。

As with most anything development related, if you are looking to improve performance there is not one, single right answer. It really depends on what you are trying to accomplish and how you are implementing the solution. The only true answer is to test, test, and test again against performance metrics to ensure that you are meeting your goals.


@Matt,在对GUID / ID辩论做了更多研究后,我发现了这篇文章。就像我之前提到的那样,没有一个真正的正确或错误的答案。这取决于您的具体实施需求。但这些是使用GUID作为主键的一些非常有效的理由:

@Matt, after doing some more research on the GUID/ID debate I came across this post. Like I mentioned before, there is not a true right or wrong answer. It depends on your specific implementation needs. But these are some pretty valid reasons to use GUIDs as the primary key:


例如,存在称为热点的问题,表中的某些数据页面处于相对较高的货币争用中。基本上,会发生什么事情是桌子上的大部分流量(以及因此页面级锁定)发生在桌子的一小块区域上,朝向末端。新记录将始终转到此热点,因为IDENTITY是序列号生成器。这些插入很麻烦,因为它们需要在添加到页面上的Exlusive页面锁定(热点)。由于页面锁定机制,这有效地将所有插入序列化到表中。另一方面,NewID()不受热点影响。使用NewID()函数生成的值仅对插入的短突发(其中函数被非常快速地调用,例如在多行插入期间)顺序,这导致插入的行在整个表的数据页中随机扩散最后的所有内容 - 从而消除了插入的热点。

For example, there is an issue known as a "hotspot", where certain pages of data in a table are under relatively high currency contention. Basically, what happens is most of the traffic on a table (and hence page-level locks) occurs on a small area of the table, towards the end. New records will always go to this hotspot, because IDENTITY is a sequential number generator. These inserts are troublesome because they require Exlusive page lock on the page they are added to (the hotspot). This effectively serializes all inserts to a table thanks to the page locking mechanism. NewID() on the other hand does not suffer from hotspots. Values generated using the NewID() function are only sequential for short bursts of inserts (where the function is being called very quickly, such as during a multi-row insert), which causes the inserted rows to spread randomly throughout the table's data pages instead of all at the end - thus eliminating a hotspot from inserts.

此外,由于插入是随机分布的,因此页面拆分的可能性大大降低。虽然页面在这里拆分并且不太糟糕,但效果会很快加起来。使用IDENTITY,页面填充因子作为调整机制是非常无用的,并且可能也设置为100% - 行永远不会插入任何页面而是最后一页。使用NewID(),您实际上可以使用填充因子作为性能启用工具。您可以将填充因子设置为接近索引重建之间的估计体积增长的级别,然后使用dbcc reindex在非高峰时段安排重建。这有效地延迟了页面拆分的性能命中,直到非高峰时间。

Also, because the inserts are randomly distributed, the chance of page splits is greatly reduced. While a page split here and there isnt too bad, the effects do add up quickly. With IDENTITY, page Fill Factor is pretty useless as a tuning mechanism and might as well be set to 100% - rows will never be inserted in any page but the last one. With NewID(), you can actually make use of Fill Factor as a performance-enabling tool. You can set Fill Factor to a level that approximates estimated volume growth between index rebuilds, and then schedule the rebuilds during off-peak hours using dbcc reindex. This effectively delays the performance hits of page splits until off-peak times.

如果您甚至,您可能需要为表启用复制有问题 - 那么你不妨让PK成为一个uniqueidentifier并将guid字段标记为ROWGUIDCOL。复制将需要具有此属性的唯一值guid字段,如果不存在,则将添加一个。如果存在合适的字段,那么它将只使用那里的字段。

If you even think you might need to enable replication for the table in question - then you might as well make the PK a uniqueidentifier and flag the guid field as ROWGUIDCOL. Replication will require a uniquely valued guid field with this attribute, and it will add one if none exists. If a suitable field exists, then it will just use the one thats there.

使用GUID进行PK的另一个巨大好处是这个值确实保证唯一 - 不仅是服务器生成的所有值,还包括所有计算机生成的所有值 - 无论是您的数据库服务器,Web服务器,应用服务器还是客户端计算机。几乎所有现代语言都能够生成有效的guid - 在.NET中,您可以使用System.Guid.NewGuid。特别是在处理缓存的主 - 细节数据集时,这非常方便。您不必采用疯狂的临时键控方案,只是为了在提交之前将您的记录关联在一起。您只需从操作系统中为记录创建时的每个新记录的永久密钥值获取一个完全有效的新Guid。

Yet another huge benefit for using GUIDs for PKs is the fact that the value is indeed guaranteed unique - not just among all values generated by this server, but all values generated by all computers - whether it be your db server, web server, app server, or client machine. Pretty much every modern language has the capability of generating a valid guid now - in .NET you can use System.Guid.NewGuid. This is VERY handy when dealing with cached master-detail datasets in particular. You dont have to employ crazy temporary keying schemes just to relate your records together before they are committed. You just fetch a perfectly valid new Guid from the operating system for each new record's permanent key value at the time the record is created.

http://forums.asp。 net / t / 264350.aspx

这篇关于没有主键的表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆