未使用smallint []列的GIN索引或错误“运算符不唯一” [英] GIN index on smallint[] column not used or error "operator is not unique"
问题描述
create table test(
id serial primary key,
tagged smallint[]
);
标记
列上有杜松子酒索引,使用 _int2_ops
运算符类:
There is gin index on tagged
column, with _int2_ops
operator class:
CREATE INDEX ix ON测试使用GIN(col _int2_ops) ;
当我运行此查询时:
select * from test
where tagged @> ARRAY[11]
order by id limit 100;
EXPLAIN ANALYZE
显示:
Limit (cost=0.43..19524.39 rows=100 width=36) (actual time=25024.124..25027.263 rows=100 loops=1)
-> Index Scan using test_pkey on test (cost=0.43..508404.37 rows=2604 width=36) (actual time=25024.121..25027.251 rows=100 loops=1)
Filter: ((tagged)::integer[] @> '{11}'::integer[])
Rows Removed by Filter: 2399999
Planning time: 6.912 ms
Execution time: 25027.307 ms
大胆强调我的。为什么标记的
列转换为整数[]
类型?我认为这就是为什么不使用GIN索引并且查询运行缓慢的原因。
Bold emphasis mine. Why is the tagged
column converted to integer[]
type? I think this is the reason why GIN the index isn't used and the query runs slow.
我试过 WHERE标记@> ARRAY [11] :: smallint []
但出现此错误:
operator is not unique: smallint[] @> smallint[]
如果我这样做但使用标记为int []
并创建索引为
If I do the same but use tagged int[]
and create index as
CREATE INDEX ix ON test USING GIN(tagged gin__int_ops);
然后上面的查询使用GIN索引:
then the above query uses the GIN index:
"-> Bitmap Index Scan on ix (cost=0.00..1575.53 rows=2604 width=0) (actual time=382.840..382.840 rows=2604480 loops=1)"
" Index Cond: (tagged @> '{11}'::integer[])"
这比以前快一点,但平均需要10秒 - 仍然太慢。我想尝试 smallint []
而不是 int []
,也许这会更快......
This is a bit faster than previous, but It takes on average 10 second - still too slow. I want to try smallint[]
instead of int[]
, maybe that will be faster ...
推荐答案
解决方案
最有可能的是,解决方案是对运算符进行模式限定:
Solution
Most probably, the solution is to schema-qualify the operator:
SELECT *
FROM test
WHERE tagged OPERATOR(pg_catalog.@>) '{11}'::int2[]
ORDER BY id
LIMIT 100;
为什么?
这是运算符解析的问题(与类型解析和强制转换上下文相结合)。
Why?
It's a problem of operator resolution (in combination with type resolution and cast context).
在标准的Postgres中,只有一个候选运算符 anyarray @> anyarray
,这就是你想要的那个。
In standard Postgres, there is only a single candidate operator anyarray @> anyarray
, that's the one you want.
如果你没有安装附加模块 intarray (我的假设),它提供另一个运营商整数[] @>整数[]
。
因此,另一个解决方案是使用 integer [] $ c相反,并使用
gin__int_ops
运算符类。或者尝试(默认为intarray) gist__int_ops
索引。要么可能更快,但两者都不允许NULL值。
或者您可以重命名 intarray
运算符 @>
消除歧义。 (我不会这样做。随后会出现升级和可移植性问题。)
Hence, another solution would be to use integer[]
instead and have a GIN index with the gin__int_ops
operator class. Or try the (default for intarray) gist__int_ops
index. Either might be faster, but both don't allow NULL values.
Or you could rename the intarray
operator @>
to disambiguate. (I would not do that. Upgrade and portability issues ensue.)
对于涉及至少一个类型整数的操作数的表达式[ ]
,Postgres知道要选择哪个运算符:intarray运算符。但是 索引不适用 ,因为intarray运算符只能在整数
上运行( int4
)不是 int2
。索引严格绑定到运营商:
For expressions involving at least one operand of type integer[]
, Postgres knows which operator to pick: the intarray operator. But then the index is not applicable, because the intarray operator only operates on integer
(int4
) not int2
. And indexes are strictly bound to operators:
- Can PostgreSQL index array columns?
- PostgreSQL behavior in presence of two different type of indexes on the same column
但对于 int2 [] @> int2 []
,Postgres无法决定最佳运营商。两者似乎同样适用。由于默认运算符在 pg_catalog
架构中提供,而intarray运算符在 public
架构中提供(默认情况下 - 或者无论您在何处安装扩展程序),您都可以通过使用 OPERATOR()
构造。相关:
But for int2[] @> int2[]
, Postgres is unable to decide the best operator. Both seem equally applicable. Since the default operator is provided in the pg_catalog
schema and the intarray operator is provided in the public
schema (by default - or wherever you installed the extension), you can help solve the conundrum by schema-qualifying the operator with the OPERATOR()
construct. Related:
- Compare arrays for equality, ignoring order of elements
错误你得到的消息有点误导。但如果仔细观察,会添加一个 HINT
行,提示(tada!)方向正确:
The error message you get is a bit misleading. But if you look closely, there is a HINT
line added which hints (tada!) in the right direction:
ERROR: operator is not unique: smallint[] @> smallint[]
LINE 1: SELECT NULL::int2[] @> NULL::int2[]
^
HINT: Could not choose a best candidate operator. You might need to add explicit type casts.
您可以调查的现有运营商候选人@>
with:
You can investigate existing operator candidates for @>
with:
SELECT o.oid, *, oprleft::regtype, oprright::regtype, n.nspname
FROM pg_operator o
JOIN pg_namespace n ON n.oid = o.oprnamespace
WHERE oprname = '@>';
另一个替代解决方案是临时(!)设置不同的search_path,因此只有所需的运算符是找到。在同一笔交易中:
Another alternative solution would be to temporarily(!) set a different search_path, so only the desired operator is found. In the same transaction:
SET LOCAL search_path = pg_catalog;
SELECT ...
但是你必须对查询中的所有表进行模式限定。
But then you have to schema-qualify all tables in the query.
关于演员表:
- Generate series of dates - using date type as input
你可以更改的
- > castcontext
> int2 int4
。但我强烈建议不要这样做。太多可能的副作用:
You could change the castcontext
of int2
-> int4
. But I strongly advise against it. Too many possible side effects:
- Is there any way to cast postgresql 9.3 data type so that it can affect only one side
这篇关于未使用smallint []列的GIN索引或错误“运算符不唯一”的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!