postgresql 中多列上的多个索引与单个索引 [英] Multiple indexes vs single index on multiple columns in postgresql

查看:172
本文介绍了postgresql 中多列上的多个索引与单个索引的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

阅读有关此主题的一些现有帖子时,我无法得出任何结论性的答案.

I could not reach any conclusive answers reading some of the existing posts on this topic.

我有过去 10 年中 100 个地点的某些数据.该表有大约 8 亿行.我需要主要为每个位置生成年度统计数据.有时我还需要生成每月变化统计数据和每小时变化统计数据.我想知道我是否应该生成两个索引 - 一个用于位置,另一个用于年份,或者在位置和年份上生成一个索引.我的主键目前是一个序列号(可能我可以使用位置和时间戳作为主键).

I have certain data at 100 locations the for past 10 years. The table has about 800 million rows. I need to primarily generate yearly statistics for each location. Some times I need to generate monthly variation statistics and hourly variation statistics as well. I'm wondering if I should generate two indexes - one for location and another for year or generate one index on both location and year. My primary key currently is a serial number (Probably I could use location and timestamp as the primary key).

谢谢.

推荐答案

无论您在关系上创建了多少个索引,在某个查询中只会使用其中一个索引(哪个取决于查询、统计信息等).因此,在您的情况下,您不会从创建两个单列索引中获得累积优势.为了从索引中获得最大性能,我建议对(位置、时间戳)使用复合索引.

Regardless of how many indices have you created on relation, only one of them will be used in a certain query (which one depends on query, statistics etc). So in your case you wouldn't get a cumulative advantage from creating two single column indices. To get most performance from index I would suggest to use composite index on (location, timestamp).

注意,像 ... WHERE timestamp BETWEEN smth AND smth 这样的查询不会使用上面的索引,而像 ... WHERE location = 'smth'... WHERE location = 'smth' AND timestamp BETWEEN smth AND smth 将.这是因为索引中的第一个属性对于搜索和排序至关重要.

Note, that queries like ... WHERE timestamp BETWEEN smth AND smth will not use the index above while queries like ... WHERE location = 'smth' or ... WHERE location = 'smth' AND timestamp BETWEEN smth AND smth will. It's because the first attribute in index is crucial for searching and sorting.

别忘了表演

ANALYZE;

在创建索引之后以收集统计信息.

after index creation in order to collect statistics.

更新:正如 @MondKin 在评论中提到的,某些查询实际上可以在同一关系上使用多个索引.例如,使用 OR 子句查询,如 a = 123 OR b = 456(假设两列都有索引).在这种情况下,postgres 将对两个索引执行位图索引扫描,构建结果位图的联合并将其用于位图堆扫描.在某些情况下,相同的方案可用于 AND 查询,但会出现交集而不是联合.

Update: As @MondKin mentioned in comments certain queries can actually use several indexes on the same relation. For example, query with OR clauses like a = 123 OR b = 456 (assuming that there are indexes for both columns). In this case postgres would perform bitmap index scans for both indexes, build a union of resulting bitmaps and use it for bitmap heap scan. In certain conditions the same scheme may be used for AND queries but instead of union there would be an intersection.

这篇关于postgresql 中多列上的多个索引与单个索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆