What are some best practices and "rules of thumb" for creating database indexes?
Question
I have an app, which cycles through a huge number of records in a database table and performs a number of SQL and .Net operations on records within that database (currently I am using Castle.ActiveRecord on PostgreSQL).
I added some basic btree indexes on a couple of the fields and, as you would expect, the performance of the SQL operations increased substantially. To make the most of the DBMS's performance, I want to make better-educated choices about what I should index in all my projects.
I understand that there is a detriment to performance when doing inserts (as the database needs to update the index as well as the data), but what suggestions and best practices should I consider when creating database indexes? How do I best select the fields or combinations of fields for a set of database indexes (rules of thumb)?
Also, how do I best select which index to use as a clustered index? And when it comes to the access method, under what conditions should I use a btree over a hash or a gist or a gin (what are they anyway?).
Recommended answer
Some of my rules of thumb:
- Index ALL primary keys (I think most RDBMSs do this when the table is created).
- Index ALL foreign key columns.
- Create more indexes ONLY if:
  - Queries are slow.
  - You know the data volume is going to increase significantly.
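As a rough illustration of the foreign-key rule, here is a minimal sketch. It uses Python's built-in `sqlite3` module only as a convenient stand-in for a real server such as PostgreSQL, and the `customers`/`orders` tables and the index name are made up for the example. It asks the engine for its plan before and after indexing the foreign key column; the exact plan wording differs between engines and versions.

```python
import sqlite3


def plan(conn, sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # 4th column holds the description


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema, for illustration only.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL
    );
""")

query = "SELECT id, total FROM orders WHERE customer_id = 42"

# Without an index on the foreign key, the filter needs a full table scan.
before = plan(conn, query)

# Index the foreign key column, then ask for the plan again.
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")
after = plan(conn, query)

print("before:", before)
print("after: ", after)
```

On PostgreSQL the equivalent check would be running `EXPLAIN` on the query before and after `CREATE INDEX`.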
If a query is slow, look at the execution plan and:
- If the query for a table uses only a few columns, put all of those columns into an index; that helps the RDBMS answer the query from the index alone.
- Don't waste resources indexing tiny tables (hundreds of records).
- Index multiple columns in order from high cardinality to low: first the columns with more distinct values, followed by the columns with fewer distinct values.
- If a query needs to access more than 10% of the data, a full scan is normally better than an index.
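To make the "index-only" and column-order points a bit more concrete, here is a small sketch, again using Python's built-in `sqlite3` as a stand-in rather than PostgreSQL. The `events` table, its columns, and the index name are hypothetical. It compares the plans for a query the composite index fully covers, a query that still has to visit the table, and a query that filters only on the second index column.

```python
import sqlite3


def plan(conn, sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical table: user_id is assumed to have many distinct values,
# status only a handful, so user_id goes first in the index.
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT, payload TEXT)"
)
conn.execute("CREATE INDEX idx_events_user_status ON events (user_id, status)")

# Every column the query touches is in the index: index-only access.
covered = plan(conn, "SELECT user_id, status FROM events WHERE user_id = 7")

# payload is not in the index, so the table rows must be fetched as well.
not_covered = plan(conn, "SELECT payload FROM events WHERE user_id = 7")

# Filtering only on the second index column cannot seek on the index.
trailing_only = plan(conn, "SELECT payload FROM events WHERE status = 'open'")

for label, steps in [("covered", covered),
                     ("not covered", not_covered),
                     ("trailing column only", trailing_only)]:
    print(label, "->", steps)
```

The sketch does not measure cardinality itself; it only shows that the leading column of a composite index decides which filters can use it, which is one reason the column order deserves some thought.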