Why is Postgres scanning a huge table instead of using my index?

Views: 143 | Posted: 2018/8/2 15:25:10 | Tags: sql performance postgresql indexing sql-execution-plan

Problem description

I noticed one of my SQL queries is much slower than I expected it to be, and it turns out that the query planner is coming up with a plan that seems really bad to me. My query looks like this:

select A.style, count(B.x is null) as missing, count(*) as total
  from A left join B using (id, type)
  where A.country_code in ('US', 'DE', 'ES')
  group by A.country_code, A.style
  order by A.country_code, total
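
A side note on the missing column, offered as a sketch rather than as part of the planner question: B.x is null evaluates to true or false, never NULL, and count(expr) counts every non-NULL value, so count(B.x is null) returns the same number as count(*). Assuming the intent is to count the A rows that have no match in B, and assuming x is never NULL in a real B row, an aggregate FILTER clause (PostgreSQL 9.4 or later) would express that:

```sql
-- Sketch: count unmatched rows explicitly instead of counting a boolean.
-- Assumption: B.x is NOT NULL in every stored B row, so a NULL here
-- can only come from the left join finding no match.
select A.style,
       count(*) filter (where B.x is null) as missing,
       count(*) as total
  from A left join B using (id, type)
  where A.country_code in ('US', 'DE', 'ES')
  group by A.country_code, A.style
  order by A.country_code, total;
```

On older versions, sum(case when B.x is null then 1 else 0 end) gives the same result. The join itself is unchanged, so this is not expected to alter the join strategy discussed here.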

B has a (type, id) index, and A has a (country_code, style) index. A is much smaller than B: 250K rows in A vs 100M in B.

So, I expected the query plan to look something like:

1. Use the index on A to select just those rows with appropriate country_code
2. Left join with B, to find the matching row (if any) based on its (type, id) index
3. Group things according to country_code and style
4. Add up the counts

But the query planner decides the best way to do this is a sequential scan on B, and then a right join against A. I can't fathom why that is; does anyone have an idea? Here's the actual query plan it generated:

 Sort  (cost=14283513.27..14283513.70 rows=171 width=595)
   Sort Key: a.country_code, (count(*))
   ->  HashAggregate  (cost=14283505.22..14283506.93 rows=171 width=595)
         ->  Hash Right Join  (cost=8973.71..14282810.03 rows=55615 width=595)
               Hash Cond: ((b.type = a.type) AND (b.id = a.id))
               ->  Seq Scan on b (cost=0.00..9076222.44 rows=129937844 width=579)
               ->  Hash  (cost=8139.49..8139.49 rows=55615 width=28)
                     ->  Bitmap Heap Scan on a  (cost=1798.67..8139.49 rows=55615 width=28)
                           Recheck Cond: ((country_code = ANY ('{US,DE,ES}'::bpchar[])))
                           ->  Bitmap Index Scan on a_country_code_type_idx  (cost=0.00..1784.76 rows=55615 width=0)
                                 Index Cond: ((country_code = ANY ('{US,DE,ES}'::bpchar[])))

Edit: following a clue from the comments on another question, I tried it with SET ENABLE_SEQSCAN TO OFF; and the query runs ten times as fast. Obviously I don't want to permanently disable sequential scans, but this helps confirm my otherwise-baseless guess that the sequential scan is not the best plan available.
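
To keep that experiment from leaking into other queries, the setting can be scoped to a single transaction with SET LOCAL, which reverts at commit or rollback. A minimal sketch, reusing the query from the question; note that EXPLAIN (ANALYZE, BUFFERS) actually executes the query and reports real timings and buffer usage next to the planner's estimates:

```sql
begin;

-- Penalize sequential scans for this transaction only. The planner can
-- still pick one if no other plan exists; the setting raises its cost
-- rather than forbidding it.
set local enable_seqscan = off;

explain (analyze, buffers)
select A.style, count(B.x is null) as missing, count(*) as total
  from A left join B using (id, type)
  where A.country_code in ('US', 'DE', 'ES')
  group by A.country_code, A.style
  order by A.country_code, total;

-- Ends the transaction and restores the session's previous setting.
rollback;
```

Running the same EXPLAIN (ANALYZE, BUFFERS) statement without the SET LOCAL line gives the default plan for comparison, which makes it possible to see where the estimated and actual row counts diverge.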
