自然键与代理键一个innodb外键 [英] natural key vs surrogate key an innodb foreign key

查看:37
本文介绍了自然键与代理键一个innodb外键的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一个问题:

我有 2 张桌子:

Product
id INT
name VARCHAR(64)
something TEXT
else INT
entirely BOOL

Ingredient
id INT
name VARCHAR(64)
description TEXT

现在我也有一个链接表

Products_Ingredients
product_id INT
ingredient_id INT

对于我的多对多关系.

现在产品和成分都将拥有唯一的名称.所以我可以使用名称作为自然键......但这会是个好主意吗?

Now both products and ingredients will have unique names. So I can use names as natural keys... however will that be a good idea?

假设我有一个产品:Paint Thinner Supreme含成分:丁基硝基四环素

Say I have a product: Paint Thinner Supreme with ingredient: Butylonitrotetrocycline

将这些名称用作链接表中的复合键是个好主意吗?

Will that be a good idea to use those names as composite key in the link table?

尽管我了解在代理项上使用自然键背后的想法,但我还是忍不住想,使用简单整数作为主键(和外键)会快得多.MySQL 服务器消化这些不同密钥的方式会有所不同吗?

As much as I understand idea behind using natural keys over the surrogates, I kinda can't stop thinking that using simple integers as primary keys (and foreign ones) will be much faster. Will there be a difference in a way in which MySQL server digests those different keys?

你的意见是什么?

推荐答案

什么时候可以衡量,意见并不重要.

Opinions don't matter when you can measure.

我使用自然键和代理在 PostgreSQL 上实现了这一点.我总共使用了 300,000 种产品、180 种成分,并为 100,000 种随机选择的产品(1053462 行)填充了两个产品成分"表,每种产品包含 3 到 17 种成分.

I implemented this on PostgreSQL using both natural keys and surrogates. I used 300,000 total products, 180 ingredients, and populated two "product ingredient" tables with 3 to 17 ingredients per product, for 100,000 randomly selected products (1053462 rows).

使用自然键选择单个产品的所有成分,返回时间为 0.067 毫秒.使用代理,0.199ms.

Selecting all the ingredients for a single product using natural keys returned in 0.067 ms. Using surrogates, 0.199ms.

使用在 0.145 毫秒内返回的自然键返回单个产品的所有非 ID 列.使用代理,0.222 ms

Returning all the non-id columns for a single product using natural keys returned in 0.145 ms. Using surrogates, 0.222 ms

所以自然键在这个数据集上的速度大约快 2 到 3 倍.

So natural keys were about 2 to 3 times faster on this data set.

自然键不需要任何连接即可返回此数据.代理键需要两个连接.

Natural keys don't require any joins to return this data. Surrogate keys require two joins.

实际的性能差异取决于表的宽度、行数、页面大小和名称的长度,等等.会有一个点,代理键开始优于自然键,但很少有人尝试衡量这一点.

The actual performance difference depends on the width of your tables, number of rows, page size, and length of names, and things like that. There will be a point where surrogate keys start outperforming natural keys, but few people try to measure that.

当我为雇主的运营数据库设计数据库时,我构建了一个测试平台,其中包含围绕自然键设计的表格和围绕 ID 数字设计的表格.这两种模式都有超过 1300 万行计算机生成的样本数据.在少数情况下,对 id 编号模式的查询比自然键模式高出 50%.(因此,使用 id 号花费 20 秒的复杂查询使用自然键需要 30 秒.)但是 80% 的测试查询对自然键模式具有更快的 SELECT 性能.有时它的速度快得惊人——相差 30 比 1.

When I was designing the database for my employer's operational database, I built a testbed with tables designed around natural keys and with tables designed around id numbers. Both those schemas have more than 13 million rows of computer-generated sample data. In a few cases, queries on the id number schema outperformed the natural key schema by 50%. (So a complex query that took 20 seconds with id numbers took 30 seconds with natural keys.) But 80% of the test queries had faster SELECT performance against the natural key schema. And sometimes it was staggeringly faster--a difference of 30 to 1.

我们预计在未来几年内,自然键的性能将优于我们数据库中的代理.(除非我们将某些表移至 SSD,在这种情况下,自然键可能永远胜过代理.)

We expect natural keys to outperform surrogates in our database for years to come. (Unless we move certain tables over to an SSD, in which case natural keys will probably outperform surrogates forever.)

这篇关于自然键与代理键一个innodb外键的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆