为PostgreSQL查询选择正确的索引 [英] Chosing the right index for PostgreSQL query

查看:896
本文介绍了为PostgreSQL查询选择正确的索引的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

简化表:

CREATE TABLE products (
product_no integer PRIMARY KEY,
sales integer,
status varchar(16),
category varchar(16));

CREATE INDEX index_products_sales ON products (sales);
CREATE INDEX index_products_status ON products (status);
CREATE INDEX index_products_category ON products (category);

PostgreSQL的版本是8.4。 状态和类别列

PostgreSQL version is 8.4. Columns 'status' and 'category'

有2000万种产品/行分布在15个类别中。

There are 20 million products/rows spread across 15 categories.

使用最频繁的查询之一是获取销量最高的三个产品,不包括 cat3和 cat7类别中的产品:

One of the most used queries is getting the three most sold products, excluding products in categories 'cat3' and 'cat7':

SELECT product_no, sales 
FROM products 
WHERE status = 'something' AND category NOT IN ('cat3', 'cat7') 
ORDER BY sales DESC 
LIMIT 3;

Limit  (cost=0.00..8833.39 rows=3 width=12) (actual time=9235.332..9356.284 rows=3 loops=1)
   ->  Index Scan using index_products_sales on products  (cost=0.00..68935806.85 rows=23412 width=12) (actual time=9235.327..9356.278 rows=3 loops=1)
     Filter: (((category)::text <> ALL ('{cat3,cat7}'::text[])) AND ((status)::text = 'something'::text))

使此特定查询运行更快的最佳索引是什么?

What would be the best index for making this specific query run faster?

推荐答案

使用以下特定排序顺序创建部分多列索引:

Create a partial, multicolumn index with this particular sort order:

CREATE INDEX products_status_sales_partial_idx ON products (status, sales DESC)
WHERE  category NOT IN ('cat3','cat7');

稍微修改您的查询:

SELECT product_no, sales 
FROM   products 
WHERE  status = 'something'
AND    category NOT IN ('cat3', 'cat7') 
ORDER  BY status, sales DESC 
LIMIT  3;

首先添加状态 ORDER BY 子句的元素似乎多余且毫无意义。但请尝试一下。

Adding status as first element of the ORDER BY clause seems redundant and pointless. But give it a try.

查询计划器的智能不足以理解

The query planner is not smart enough to understand, that with

WHERE  status = 'something' ...
ORDER  BY sales DESC

索引(状态,销售DESC)的排序顺序匹配。因此,它将读取所有个符合条件的行,并排序并选择前3个。

the sort order of the index (status, sales DESC) matches as a logical consequence. So it is going to read all qualifying rows, sort and pick the top 3.

通过添加状态 ORDER BY ,您可以使查询计划程序直接从索引中读取前3个条目。预计会加快几个数量级

By adding status to the ORDER BY you enable the query planner to read the top 3 entries from the index directly. Expect a speed-up by several orders of magnitude.

已通过PostgreSQL 8.4和9.1测试。

Tested with PostgreSQL 8.4 and 9.1.

这篇关于为PostgreSQL查询选择正确的索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆