Postgresql - 被索引删除的行 [英] Postgresql - Rows removed by Index
问题描述
有问题的表在 time
testdb=> explain analyze select avg(gl) from cdstest where time between 1407700790 and 1407711590;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1434716.75..1434716.76 rows=1 width=2) (actual time=20106.951..20106.952 rows=1 loops=1)
-> Bitmap Heap Scan on cdstest (cost=231261.49..1411280.42 rows=9374529 width=2) (actual time=811.495..10871.963 rows=9438824 loops=1)
Recheck Cond: (("time" >= 1407700790) AND ("time" <= 1407711590))
Rows Removed by Index Recheck: 204734
-> Bitmap Index Scan on timeindex (cost=0.00..228917.86 rows=9374529 width=0) (actual time=810.108..810.108 rows=9438824 loops=1)
Index Cond: (("time" >= 1407700790) AND ("time" <= 1407711590))
Total runtime: 20107.001 ms
(7 rows)
索引重新检查删除的行数:204734
- 这是什么意思?这似乎是一个相当随意的数字.
Rows Removed by Index Recheck: 204734
- What does this mean? This seems like a fairly arbitrary number.
给定时间范围之间的行数:
Number of rows between the given time range:
testdb=> select count(*) from cdstest where time between 1407700790 and 1407711590;
count
---------
9438824
(1 row)
该表包含约 6000 万行.
The table contains ~60million rows.
推荐答案
内部 Bitmap Index Scan
节点正在生成一个位图,将 1
放在所有记录的地方找到与您的搜索键匹配的,否则 0
.由于您的表很大,位图的大小越来越大,然后可用于此类操作的可用内存,通过 work_mem
,变小以保留整个位图.
The inner Bitmap Index Scan
node is producing a bitmap, putting 1
to all the places where records that match your search key are found, and 0
otherwise. As your table is quite big, the size of the bitmap is getting bigger, then available memory for these kind of operations, configured via work_mem
, becomes small to keep the whole bitmap.
当内存不足时,内部节点将开始生成 1
不是为记录,而是为已知包含匹配记录的块.这意味着,外部节点Bitmap Heap Scan
必须从该块中读取所有 记录并重新检查它们.很明显,会有一些不匹配的,它们的数量就是你看到的被索引重新检查删除的行
.
When in lack of a memory, inner node will start producing 1
not for records, but rather for blocks that are known to contain matching records. This means, that outer node Bitmap Heap Scan
has to read all records from such block and re-check them. Obiously, there'll be some non-matching ones, and their number is what you see as Rows Removed by Index Recheck
.
在即将推出的 9.4 中添加了一项新功能,报告 位图索引扫描
exact 和/或 lossy
页数> 节点.lossy
是你想要避免的.您可以查看有关此内容的更多信息 此处.
In the soon coming 9.4 a new feature is added, reporting how many exact
and/or lossy
pages where returned by the Bitmap Index Scan
node. lossy
are the ones you'd like to avoid. You can check more about this here.
最后,查阅您的 work_mem
设置并尝试增加它,仅针对此特定会话.我想,增加大约 40% 应该就足够了.
Finally, consult your work_mem
setting and try increasing it, just for this particular session.
I assume, that increasing by some 40% should be enough.
编辑
我这里运行的是 9.4beta3,所以我准备了一个小案例:
EDIT
I have 9.4beta3 running here, so I prepared a small show case:
DROP TABLE IF EXISTS tab;
SELECT id, id%10 mod
INTO tab
FROM generate_series(1,(1e7)::int) id;
CREATE INDEX i_tab_mod ON tab(mod);
VACUUM ANALYZE tab;
现在,我将 work_mem
设置为最小可能值并检查它:
Now, I set work_mem
to the minimal possible value and check it:
SET work_mem TO '64kB';
EXPLAIN (analyze, buffers)
SELECT * FROM tab WHERE mod=5;
EXPLAIN
提供以下 2 行:
Rows Removed by Index Recheck: 8896308
Heap Blocks: exact=510 lossy=43738
...
Execution time: 1356.938 ms
这意味着 64kB
可以容纳 510 个精确块.所以我在这里计算了总内存需求:
Which means, that 64kB
can hold 510 exact blocks. So I calculate the total memory requirement here:
new_mem_in_bytes = (work_mem_in_bytes / exact) * lossy
= (( 64.0 * 1024 / 510 ) * 43738) / 1024
= 5488.7kB
事实上,这不是计算所需内存的精确方法,但我认为它足以满足我们的需求.所以我尝试了 SET work_mem TO '5MB'
:
This is not precise approach to calculate needed memory, in fact, but I think it is good enough for our needs. So I tried with SET work_mem TO '5MB'
:
Heap Blocks: exact=44248
...
Execution time: 283.466 ms
这篇关于Postgresql - 被索引删除的行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!