Index for multiple columns in ActiveRecord

Question

In ActiveRecord there are two ways to declare indexes for multiple columns:

add_index :classifications, [:species, :family, :trivial_names]

add_index :classifications, :species
add_index :classifications, :family
add_index :classifications, :trivial_names

Is there any difference between the first approach and the second one? If so, when should I use the first and when the second?

Recommended answer

You are comparing a composite index with a set of independent indices. The two behave differently.

Think of it this way: a composite index gives you rapid look-up of the first field in a nested set of fields, followed by rapid look-up of the second field within ONLY the records already selected by the first field, followed by rapid look-up of the third field, again only within the records selected by the previous two fields.
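The nesting described above can be sketched in plain Ruby. This is only a toy model, assuming the index behaves like a sorted list of (species, family, trivial_names) entries searched by binary search; real database indexes are more elaborate. The sample rows and the names SAMPLE_INDEX and lookup are made up for illustration.

```ruby
# Toy model of a composite index on (species, family, trivial_names):
# entries are kept sorted by all three columns, in that order.
# The sample values below are invented for illustration only.
SAMPLE_INDEX = [
  ["nigra", "Salicaceae", "black poplar"],
  ["alba",  "Salicaceae", "white willow"],
  ["rubra", "Fagaceae",   "red oak"],
  ["alba",  "Pinaceae",   "silver fir"],
  ["nigra", "Pinaceae",   "black pine"],
  ["alba",  "Salicaceae", "white poplar"]
].sort

# Returns every entry whose leading columns equal `prefix`.
# `prefix` may cover one, two, or all three columns, but it must
# start at the first column of the index.
def lookup(index, prefix)
  n = prefix.length
  # Binary search for the first entry whose leading n columns are >= prefix.
  start = index.bsearch_index { |row| (row.first(n) <=> prefix) >= 0 }
  return [] if start.nil?
  # Matching entries sit next to each other, so read forward
  # until the prefix stops matching.
  index.drop(start).take_while { |row| row.first(n) == prefix }
end

p lookup(SAMPLE_INDEX, ["alba"])
p lookup(SAMPLE_INDEX, ["alba", "Salicaceae"])
p lookup(SAMPLE_INDEX, ["nigra", "Salicaceae", "black poplar"])
```

Each extra column in the prefix narrows the search inside the entries already selected by the columns before it, which is the behavior the paragraph above describes.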

Let's take an example. If you are using an index, your database engine will take no more than 20 steps to locate a unique value within 1,000,000 records (log base 2 of 1,000,000, if memory serves). This is true whether you are using a composite or an independent index, but ONLY for the first field ("species" in your example, although I'd think you'd want Family, Species, and then Common Name).

Now, let's say that there are 100,000 matching records for this first field value. If you have only single-column indices, then any lookup within these records will take 100,000 steps: one for each record retrieved by the first index. This is because the second index will not be used (in most databases; this is a bit of a simplification) and a brute-force match must be used.

If you have a composite index, your search is much faster because the second-field search has an index within the first set of values. In this case you'll need no more than 17 steps to get to your first matching value on field 2 within the 100,000 matches on field 1 (log base 2 of 100,000).

So: the steps needed to find a unique record out of a database of 1,000,000 records using a composite index on 3 nested fields, where the first retrieves 100,000 and the second retrieves 10,000 = 20 + 17 + 14 = 51 steps.

Steps needed under the same conditions with just independent indices = 20 + 100,000 + 10,000 = 110,020 steps.
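The arithmetic behind both totals can be reproduced with a few lines of Ruby. This is a rough sketch under the same simplification as the text: an indexed look-up costs about log base 2 of the number of candidates, rounded up, and an unindexed look-up costs one step per candidate. The method names are made up for this example.

```ruby
# Worst-case number of steps for a binary search over n sorted entries.
def search_steps(n)
  Math.log2(n).ceil
end

# Composite index: every level is an indexed search within the
# records selected by the previous level.
def composite_steps(total, first_matches, second_matches)
  search_steps(total) + search_steps(first_matches) + search_steps(second_matches)
end

# Independent indices (simplified): only the first field uses an index,
# the remaining fields are matched by scanning record by record.
def independent_steps(total, first_matches, second_matches)
  search_steps(total) + first_matches + second_matches
end

puts composite_steps(1_000_000, 100_000, 10_000)    # 20 + 17 + 14
puts independent_steps(1_000_000, 100_000, 10_000)  # 20 + 100,000 + 10,000
```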

Quite a difference, isn't it?

Now, don't go nuts putting composite indices everywhere. First, they are expensive on inserts and updates. Second, they are only brought to bear if you are truly searching across nested data (for another example, I use them when pulling login data for a client over a given date range). Also, they are not worth it if you are working with relatively small data sets.
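To judge whether a query is "truly searching across nested data", it can help to check how many leading columns of the composite index the query actually filters on. The sketch below is a deliberately simplified model, assuming equality filters and the Database 101 behavior described in this answer; real query planners may do more. The names CLASSIFICATION_INDEX_COLUMNS and usable_prefix are hypothetical.

```ruby
# Column order of the composite index from the question.
CLASSIFICATION_INDEX_COLUMNS = [:species, :family, :trivial_names].freeze

# Returns the leading index columns that a query filtering on
# `filtered_columns` can take advantage of. The usable part stops at
# the first index column the query does not filter on.
def usable_prefix(index_columns, filtered_columns)
  index_columns.take_while { |column| filtered_columns.include?(column) }
end

p usable_prefix(CLASSIFICATION_INDEX_COLUMNS, [:species, :family])
p usable_prefix(CLASSIFICATION_INDEX_COLUMNS, [:species, :trivial_names])
p usable_prefix(CLASSIFICATION_INDEX_COLUMNS, [:family])
```

In this model, a query that filters only on family gets no help from the composite index, so a separate index on that column may still be worth having if such queries are common.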

Finally, check your database documentation. Databases have grown extremely sophisticated in their ability to use indices these days, and the Database 101 scenario I described above may not hold for some of them (although I always develop as if it does, just so I know what I am getting).
