What to index on queries with lots of columns in the WHERE clause


Question

I'm building a search engine for an apartment site and I'm not sure how to index the apartments table.

Example queries:

  • ...WHERE city_id = 1 AND size > 500 AND rooms = 2
  • ...WHERE area_id = 2 AND ad_type = 'agent' AND price BETWEEN 10000 AND 14000
  • ...WHERE area_id = 2 OR area_id = 4 AND published_at > '2016-01-01' AND ad_type = 1

As you can see, the columns can vary a lot, and the number of columns in the WHERE clause can be up to 10, or possibly even more.


  • Should I index all of them?
  • Only the most common ones?

Recommended answer

You have to figure out which WHERE clauses you are going to use with this query, how often each will occur, and how selective each condition will be.


  • Don't index for queries that occur seldom unless you have to.

Use multicolumn indexes, starting with those columns that will occur in an = comparison.

Concerning the order of columns in a multicolumn index, start with those columns that will be used in a query by themselves (an index can be used for a query with only some of its columns, provided they are at the beginning of the index).

You might omit columns with low selectivity, like gender.
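One way to gauge selectivity is to look at the planner statistics PostgreSQL already collects. The query below is a minimal sketch that assumes the table is named apartments and has the columns from the question; the column list is only an illustration.

```sql
-- Refresh planner statistics so that pg_stats is up to date.
ANALYZE apartments;

-- n_distinct > 0: estimated number of distinct values in the column.
-- n_distinct < 0: negated ratio of distinct values to the number of rows.
-- A column with only a handful of distinct values is usually not selective.
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'apartments'
  AND attname IN ('city_id', 'area_id', 'ad_type', 'rooms');
```

The most_common_freqs column also shows whether a few values dominate, in which case conditions on those values filter out little even if the column has many distinct values overall.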

For example, with your above queries, if they are all frequent and all columns are selective, these indexes would be good:

... ON apartments (city_id, rooms, size)

... ON apartments (area_id, ad_type, price)

... ON apartments (area_id, ad_type, published_at)

These indexes could also be used for WHERE clauses with only area_id or city_id in them.
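Spelled out as full statements, the three indexes might look like the sketch below. The index names are made up for illustration and the default B-tree index type is assumed. The two EXPLAIN statements contrast a condition on a leading column with a condition on a trailing column.

```sql
-- Hypothetical index names; default B-tree indexes assumed.
-- Columns compared with = come first, the range column comes last.
CREATE INDEX apartments_city_rooms_size_idx
    ON apartments (city_id, rooms, size);

CREATE INDEX apartments_area_type_price_idx
    ON apartments (area_id, ad_type, price);

CREATE INDEX apartments_area_type_published_idx
    ON apartments (area_id, ad_type, published_at);

-- A condition on the leading column alone can still use the first index.
EXPLAIN
SELECT * FROM apartments WHERE city_id = 1;

-- A condition on a trailing column alone generally cannot use it efficiently,
-- because the index is not ordered by size on its own.
EXPLAIN
SELECT * FROM apartments WHERE size > 500;
```

Whether the planner actually chooses an index depends on table size and statistics, so it is worth checking the EXPLAIN output against realistic data.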

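Having too many indexes is bad, though: each one takes up space and has to be updated on every data modification.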

If the above method would lead to too many indexes, e.g. because the user can pick arbitrary columns for the WHERE clause, it is better to index individual columns or occasionally pairs of columns that regularly go together.

That way PostgreSQL can pick a bitmap index scan to combine several indexes for one query. That is less efficient than a regular index scan, but usually better than a sequential scan.
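As a rough sketch of that approach, assuming the apartments table from the question and hypothetical index names, one index per frequently filtered column could look like this:

```sql
-- Hypothetical index names; one single-column index per common filter column.
CREATE INDEX apartments_city_id_idx ON apartments (city_id);
CREATE INDEX apartments_rooms_idx   ON apartments (rooms);
CREATE INDEX apartments_price_idx   ON apartments (price);

-- With suitable data, the plan may show a BitmapAnd node over several
-- Bitmap Index Scans, followed by a Bitmap Heap Scan on apartments.
EXPLAIN
SELECT * FROM apartments
WHERE city_id = 1
  AND rooms = 2
  AND price BETWEEN 10000 AND 14000;
```

The planner decides per query whether combining the indexes is worthwhile; on a small table or with unselective conditions it may still choose a single index or a sequential scan.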
