How do I know when to index a column, and with what?

Question

In the docs for various ORMs, they always provide a way to create indexes, etc. They always mention to be sure to create the appropriate indexes for efficiency, as if that is inherent knowledge to a non-hand-written-SQLer who needs to use an ORM. My understanding of indexes (outside of PK) is basically: if you plan to do LIKE queries (i.e., search) based on the contents of a column, you should use a full text index for that column. What else should I know regarding indexes (mostly pertaining to efficiency)? I feel like there is a world of knowledge at my doorstep, but there's a huge folded mouse pad jammed up under it, so I can't get through (I don't know why I felt like I needed to say that, but thanks for providing the couch).

Recommended answer

Think of an index very roughly like the index in the back of a book. It's a totally separate area from the content of the book, where if you are seeking some specific value, you can go to the index and look it up (indexes are ordered, so finding things there is much quicker than scanning every page of the book).

The index entry has a page number, so you can then quickly go to the page that covers the topic you are seeking. A database index is very similar; it is an ordered list of the relevant information in your database (the field(s) included in the index), along with the information the database needs to find the records which match.
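To make the book analogy concrete, here is a minimal Python sketch of the idea, not how any particular DBMS stores its indexes. The `rows` list, the `index` list, and the `lookup` helper are all made up for illustration: the "index" is just a sorted list of (value, row position) pairs that can be binary-searched instead of reading every row.

```python
import bisect

# Hypothetical "table": each row is (id, last_name), stored in insertion order.
rows = [(1, "Smith"), (2, "Adams"), (3, "Jones"), (4, "Adams"), (5, "Brown")]

# The "index": (value, row position) pairs kept in sorted order,
# like the alphabetical index at the back of a book.
index = sorted((name, pos) for pos, (_id, name) in enumerate(rows))

def lookup(name):
    """Binary-search the index, then follow the 'page numbers' to the rows."""
    i = bisect.bisect_left(index, (name, -1))  # first entry for this name
    matches = []
    while i < len(index) and index[i][0] == name:
        matches.append(rows[index[i][1]])
        i += 1
    return matches

print(lookup("Adams"))
```

The table itself stays in its original order; only the separate `index` list is sorted, which is the same split the answer describes between the book's content and its index.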

So... you would create an index when you have information that you need to search on frequently. Normal indexes generally don't help with 'partial' seeks such as LIKE queries with a leading wildcard (many DBMSs can still use them for prefix patterns like 'abc%'), but any time you need to get a set of results where field X has certain value(s), they keep the DBMS from needing to 'scan' the whole table looking for matching values.
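One way to see the difference between a scan and an index lookup is to ask the database for its query plan. The sketch below uses SQLite through Python's built-in `sqlite3` module purely as a convenient stand-in; the `users` table, the `idx_users_email` index, and the `plan` helper are invented for the example, and other DBMSs have their own plan syntax and output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

def plan(sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

before = plan(query, args)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query, args)   # now the planner can search the index

print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should describe a scan of `users` and the second a search that names `idx_users_email`.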

They also help when you need to sort on a column.
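The sorting benefit can be observed the same way. This is a sketch using SQLite via Python's `sqlite3` module, with a made-up `events` table and `idx_events_created_at` index; without the index SQLite reports a temporary B-tree built just for the ORDER BY, and with it the rows can be read in index order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (created_at, payload) VALUES (?, ?)",
    [(f"2024-01-{(i % 28) + 1:02d}", f"event {i}") for i in range(500)],
)

def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT * FROM events ORDER BY created_at LIMIT 10"

before = plan(query)  # sorts with a temporary B-tree
conn.execute("CREATE INDEX idx_events_created_at ON events (created_at)")
after = plan(query)   # reads rows already ordered by the index

print("before:", before)
print("after: ", after)
```

Whether a given DBMS actually picks the index for sorting depends on its planner and on the query, so treat this as an illustration of the idea and check the plan on your own system.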

Another thing to keep in mind: if the DBMS allows you to create single indexes that have multiple fields, be sure to investigate the effects of doing so, specific to your DBMS. An index that includes multiple fields is likely to be fully useful only if all of those fields are used in a query (in many DBMSs it can still help when just the leading field(s) are used). Conversely, having multiple indexes for a single table, with one field per index, may not be of much (or any) help for queries that are filtering/sorting by multiple fields.
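As a rough illustration of how a multi-field index behaves, the sketch below again uses SQLite through Python's `sqlite3` module; the `orders` table and the `idx_orders_customer_status` index are hypothetical. It shows SQLite's behavior only, where the index is used when the leading field is part of the filter and is not used for a filter on the trailing field alone. Other DBMSs may behave differently, which is exactly why the documentation check is worth doing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status, total) VALUES (?, ?, ?)",
    [(i % 50, "paid" if i % 3 else "open", float(i)) for i in range(1000)],
)
# One index covering two fields; customer_id is the leading field.
conn.execute(
    "CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)"
)

def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

both_fields = plan(
    "SELECT * FROM orders WHERE customer_id = 7 AND status = 'paid'"
)
leading_only = plan("SELECT * FROM orders WHERE customer_id = 7")
trailing_only = plan("SELECT * FROM orders WHERE status = 'paid'")

print("both fields:  ", both_fields)
print("leading only: ", leading_only)
print("trailing only:", trailing_only)
```

The practical takeaway is that the order of fields in a multi-field index matters, so it is worth choosing it based on the queries you actually run.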

You mentioned Full Text indexes and PKs (Primary Keys). These are different than regular indexes, though they often serve similar purposes.

First, note that a Primary Key is usually an index (in MSSQL, a 'Clustered Index', in fact), but this does not need to be the case specifically. As an example, an MSSQL PK is a Clustered Index by default; clustered indexes are special in that they are not a separate bit of data stored elsewhere, but the data itself is arranged in the table in order by the Clustered Index. This is why a popular PK is an int value that is auto-generated with sequential, increasing values. So, a Clustered Index sorts the data in the table specifically by the field's value. Compare this to a traditional dictionary; the entries themselves are ordered by the 'key', which is the word being defined.
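MSSQL itself is not easy to demo in a few lines, so here is a loose analogue in SQLite (via Python's `sqlite3` module), offered as an assumption-laden sketch and not as a description of MSSQL internals. In an ordinary SQLite table, an `INTEGER PRIMARY KEY` column is the key of the table's own B-tree, so the rows are stored in key order and no separate index is involved. The `notes` table is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
# Insert out of key order on purpose.
conn.executemany(
    "INSERT INTO notes (id, body) VALUES (?, ?)",
    [(3, "third"), (1, "first"), (2, "second")],
)

def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Lookup by key goes straight to the table's own B-tree.
lookup_plan = plan("SELECT * FROM notes WHERE id = 2")
# Ordering by the key needs no extra sort step: the table is already in that order.
order_plan = plan("SELECT * FROM notes ORDER BY id")
ordered_ids = [row[0] for row in conn.execute("SELECT id FROM notes ORDER BY id")]

print(lookup_plan)
print(order_plan)
print(ordered_ids)
```

This mirrors the dictionary comparison: the entries themselves are laid out by the key, so there is nothing extra to consult.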

But in MSSQL (check your own DBMS documentation for the specifics), you can change the Clustered Index to be a different field, if you like. Sometimes this is done on datetime-based fields.

Full Text indexes are different kinds of beasts entirely. They use some of the same principles, but what they are doing isn't exactly the same as the normal indexes I have been describing. Also: in some DBMSs, LIKE queries do not use the full text index; special query operators are required.

These indexes are different because their intent is not to find/sort on the whole value of the column (a number, a date, a short bit of char data), but instead to find individual words/phrases within the text field(s) being indexed.
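A toy version of that idea is an inverted index, which maps each word to the records that contain it. The pure-Python sketch below is only meant to show the shape of the data structure; the `docs` dictionary and the `tokenize` and `search` helpers are invented for illustration, and real full text engines do considerably more.

```python
import re
from collections import defaultdict

# Hypothetical text column: record id -> text.
docs = {
    1: "Indexes speed up lookups on a column",
    2: "A full text index finds words inside text",
    3: "Clustered indexes order the table itself",
}

def tokenize(text):
    """Split text into lowercase words (a very crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())

# Inverted index: word -> set of record ids containing that word.
inverted = defaultdict(set)
for doc_id, text in docs.items():
    for word in tokenize(text):
        inverted[word].add(doc_id)

def search(*words):
    """Return ids of records that contain every one of the given words."""
    sets = [inverted.get(word.lower(), set()) for word in words]
    return sorted(set.intersection(*sets)) if sets else []

print(search("index"))
print(search("indexes"))
print(search("text", "words"))
```

Note that this toy treats "index" and "indexes" as unrelated words; matching word variants like that is one of the extra things a real full text index typically handles.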

They can also often enable searching for similar words, different tenses, common misspellings and the like, and typically ignore noise words. The different way in which they work is why they may also need different operators to use them. (Again, check the documentation for your DBMS!)
