自然键vs代理键一个innodb外键 [英] natural key vs surrogate key an innodb foreign key

查看:380
本文介绍了自然键vs代理键一个innodb外键的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一个问题:



我有两个表:

 产品
id INT
name VARCHAR(64)
something TEXT
else INT
完全BOOL

 成分
id INT
名称VARCHAR(64)
description TEXT

现在我还有一个链接表

  Products_Ingredients 
product_id INT
ingredient_id INT

为我的多对多关系。



现在,产品和配料都有唯一的名称。所以我可以使用名称作为自然键...但这将是一个好主意?



说我有一个产品: Paint Thinner Supreme
含有成分: butylonitrotetrocycline



在链接表中使用这些名称作为复合键是一个好主意吗?



尽管我了解使用自然键超越代理的想法,但我不能停止使用简单的整数作为主键(和外部键)快多了。



你的意见是什么?



我在PostgreSQL上使用自然键和代理来实现这个功能。我使用30万总产品,180种成分,并填充两个产品成分表,每个产品3到17个成分,100,000个随机选择的产品(1053462行)。



使用自然键选择单个产品的所有成分,返回0.067毫秒。使用代理,0.199毫秒。



使用在0.145 ms内返回的自然键返回单个产品的所有非ID列。使用代理,0.222毫秒



所以自然键比这个数据集快大约2到3倍。



自然键不需要任何连接来返回此数据。代理键需要两个连接。



实际的性能差异取决于表的宽度,行数,页面大小和名称长度等。将有一个点,其中代理键开始超过自然键,但很少有人试图测量。



当我为我的雇主的操作数据库设计数据库时,我使用围绕自然密钥设计的表以及围绕id号设计的表构建了一个测试平台。这两个模式都有超过1300万行计算机生成的样本数据。在少数情况下,对id号模式的查询优于自然键模式50%。 (因此,使用id数字花费20秒的复杂查询需要30秒的自然键)。但是80%的测试查询比自然键模式具有更快的SELECT性能。有时候它的速度快得惊人 - 差别为30比1。



我们预计自然钥匙在未来的数据库中将超越代理。 (除非我们将某些表移动到SSD,在这种情况下,自然键可能永远胜过代替)。


A question:

I have 2 tables:

Product
id INT
name VARCHAR(64)
something TEXT
else INT
entirely BOOL

and

Ingredient
id INT
name VARCHAR(64)
description TEXT

Now I also have a link table

Products_Ingredients
product_id INT
ingredient_id INT

for my many to many relation.

Now both products and ingredients will have unique names. So I can use names as natural keys... however will that be a good idea?

Say I have a product: Paint Thinner Supreme with ingredient: Butylonitrotetrocycline

Will that be a good idea to use those names as composite key in the link table?

As much as I understand idea behind using natural keys over the surrogates, I kinda can't stop thinking that using simple integers as primary keys (and foreign ones) will be much faster. Will there be a difference in a way in which MySQL server digests those different keys?

What is your opinion?

解决方案

Opinions don't matter when you can measure.

I implemented this on PostgreSQL using both natural keys and surrogates. I used 300,000 total products, 180 ingredients, and populated two "product ingredient" tables with 3 to 17 ingredients per product, for 100,000 randomly selected products (1053462 rows).

Selecting all the ingredients for a single product using natural keys returned in 0.067 ms. Using surrogates, 0.199ms.

Returning all the non-id columns for a single product using natural keys returned in 0.145 ms. Using surrogates, 0.222 ms

So natural keys were about 2 to 3 times faster on this data set.

Natural keys don't require any joins to return this data. Surrogate keys require two joins.

The actual performance difference depends on the width of your tables, number of rows, page size, and length of names, and things like that. There will be a point where surrogate keys start outperforming natural keys, but few people try to measure that.

When I was designing the database for my employer's operational database, I built a testbed with tables designed around natural keys and with tables designed around id numbers. Both those schemas have more than 13 million rows of computer-generated sample data. In a few cases, queries on the id number schema outperformed the natural key schema by 50%. (So a complex query that took 20 seconds with id numbers took 30 seconds with natural keys.) But 80% of the test queries had faster SELECT performance against the natural key schema. And sometimes it was staggeringly faster--a difference of 30 to 1.

We expect natural keys to outperform surrogates in our database for years to come. (Unless we move certain tables over to an SSD, in which case natural keys will probably outperform surrogates forever.)

这篇关于自然键vs代理键一个innodb外键的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆