PostgreSQL Citext索引与较低的表达式索引性能 [英] PostgreSQL citext index vs lower expression index performance

查看:467
本文介绍了PostgreSQL Citext索引与较低的表达式索引性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想决定使用带有索引的 citext 列还是使用带有索引的 text lower()

I want to decide between using a citext column with an index or using a text column with an index on lower().

我执行了一些基准测试。令我惊讶的是,使用 lower()上的索引进行搜索会导致索引扫描,但是在 citext 的情况下,仅索引扫描。我期望 lower()上的索引也导致索引也只扫描。

I performed some benchmarks. To my surprise the search with the index on lower() causes an index scan, but in the citext case I get an index only scan. I was expecting the index on lower() to cause an index only scan too.

另外,总成本索引为 citext 的索引为4.44,而索引为 lower()的索引的总费用为8.44。

Also, the total cost with the citext index is 4.44, but with the index on lower(), the total cost is 8.44.

因此,我首先想到的是 citext 列索引比 text 列。

So the first thing coming to my mind is that the citext column index is better than the expression index on a text column.

CREATE TABLE test_citext(a citext NOT NULL);

INSERT INTO test_citext
   SELECT cast(x as text)
   FROM generate_series(1, 1000000) AS x;

VACUUM (FREEZE, ANALYZE) test_citext;

create index citextind on test_citext(a);

Select * from test_citext where a = 'test';
--Index Only Scan.Total cost 4.44

CREATE TABLE test_textlowerindex(a text NOT NULL);

INSERT INTO test_textlowerindex
   SELECT cast(x as text)
   FROM generate_series(1, 1000000) AS x;

VACUUM (FREEZE, ANALYZE) test_textlowerindex;

create index lowertextind on test_textlowerindex(lower(a));

Select * from test_textlowerindex where lower(a) = 'test';
--Index Scan.Total cost 8.44

我对吗?

Laurenz先生先生,谢谢您的回答。我如您所说更改了我的上述脚本。
结果:

Mister Laurenz Albe thanks for your answer.I changed my above script as you said. Result :

CREATE TABLE test_citext(a citext NOT NULL);

INSERT INTO test_citext
   SELECT cast(x as text)
   FROM generate_series(1, 1000000) AS x;

create index citextind on test_citext(a);

VACUUM (FREEZE, ANALYZE) test_citext;

Select count(*) from test_citext where a = 'test';
--Index Only Scan 4.44 + 4.46

CREATE TABLE test_textlowerindex(a text NOT NULL);

INSERT INTO test_textlowerindex
   SELECT cast(x as text)
   FROM generate_series(1, 1000000) AS x;

create index lowertextind on test_textlowerindex(lower(a));

VACUUM (FREEZE, ANALYZE) test_textlowerindex;

Select count(*) from test_textlowerindex where lower(a) = 'test';
--Index Scan 8.44 + 8.46



但是并没有改变,即使我在创建索引并在select中使用count(*)之后运行分析。IndexScan仍会继续在Lower()上进行索引。


But anything did not changed Even If I run analyze after creating index and using count(*) in select.Index Scan still continues with an index on lower().

推荐答案

您的测试具有误导性。这里有两个问题:

Your test is misleading. There are two problems here:


  1. 您没有运行 ANALYZE 在创建索引 lowertextind 之后。

否则,PostgreSQL不知道如何降低 lower( a)是分布式的,可能会产生错误的成本估算。

Without that, PostgreSQL doesn't know how lower(a) is distributed and will probably produce a wrong cost estimate.

通过使用 SELECT * 您无意中允许将仅索引扫描用于第一个查询,但不能用于第二个查询。这是因为第一个索引包含所有表列,但第二个索引不包含。

By using SELECT * you inadvertently allowed an index only scan to be used for the first query, but not for the second. This is because the first index contains all table columns, but the second doesn't.

因为第二个索引不包含 a ,必须从表中获取该值,这会导致其他工作。

Since the second index doesn't contain a, the value has to be fetched from the table, causing additional work.

您可以使用 SELECT count(*)FROM ... 以获得更公平的基准。

You could use SELECT count(*) FROM ... for a fairer benchmark.

这篇关于PostgreSQL Citext索引与较低的表达式索引性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆