未使用smallint []列的GIN索引或错误“运算符不唯一” [英] GIN index on smallint[] column not used or error "operator is not unique"

查看:166
本文介绍了未使用smallint []列的GIN索引或错误“运算符不唯一”的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

create table test(
    id serial primary key,
    tagged smallint[]
);

标记列上有杜松子酒索引,使用 _int2_ops 运算符类:

There is gin index on tagged column, with _int2_ops operator class:

CREATE INDEX ix ON测试使用GIN(col _int2_ops) ;

当我运行此查询时:

select * from test
where tagged @> ARRAY[11]
order by id limit 100;

EXPLAIN ANALYZE 显示:

Limit  (cost=0.43..19524.39 rows=100 width=36) (actual time=25024.124..25027.263 rows=100 loops=1)
  ->  Index Scan using test_pkey on test  (cost=0.43..508404.37 rows=2604 width=36) (actual time=25024.121..25027.251 rows=100 loops=1)
        Filter: ((tagged)::integer[] @> '{11}'::integer[])
        Rows Removed by Filter: 2399999
Planning time: 6.912 ms
Execution time: 25027.307 ms

大胆强调我的。为什么标记的列转换为整数[] 类型?我认为这就是为什么不使用GIN索引并且查询运行缓慢的原因。

Bold emphasis mine. Why is the tagged column converted to integer[] type? I think this is the reason why GIN the index isn't used and the query runs slow.

我试过 WHERE标记@> ARRAY [11] :: smallint [] 但出现此错误:


operator is not unique: smallint[] @> smallint[]


如果我这样做但使用标记为int [] 并创建索引为

If I do the same but use tagged int[] and create index as

CREATE INDEX ix ON test USING GIN(tagged gin__int_ops);

然后上面的查询使用GIN索引:

then the above query uses the GIN index:


"->  Bitmap Index Scan on ix  (cost=0.00..1575.53 rows=2604 width=0) (actual time=382.840..382.840 rows=2604480 loops=1)"
"   Index Cond: (tagged @> '{11}'::integer[])"


这比以前快一点,但平均需要10秒 - 仍然太慢。我想尝试 smallint [] 而不是 int [] ,也许这会更快......

This is a bit faster than previous, but It takes on average 10 second - still too slow. I want to try smallint[] instead of int[], maybe that will be faster ...

推荐答案

解决方案



最有可能的是,解决方案是对运算符进行模式限定:

Solution

Most probably, the solution is to schema-qualify the operator:

SELECT *
FROM   test
WHERE  tagged OPERATOR(pg_catalog.@>) '{11}'::int2[]
ORDER  BY id
LIMIT  100;



为什么?



这是运算符解析的问题(与类型解析和强制转换上下文相结合)。

Why?

It's a problem of operator resolution (in combination with type resolution and cast context).

在标准的Postgres中,只有一个候选运算符 anyarray @> anyarray ,这就是你想要的那个。

In standard Postgres, there is only a single candidate operator anyarray @> anyarray, that's the one you want.

如果你没有安装附加模块 intarray (我的假设),它提供另一个运营商整数[] @>整数[]

因此,另一个解决方案是使用 integer [] gin__int_ops 运算符类。或者尝试(默认为intarray) gist__int_ops 索引。要么可能更快,但两者都不允许NULL值。

或者您可以重命名 intarray 运算符 @> 消除歧义。 (我不会这样做。随后会出现升级和可移植性问题。)

Hence, another solution would be to use integer[] instead and have a GIN index with the gin__int_ops operator class. Or try the (default for intarray) gist__int_ops index. Either might be faster, but both don't allow NULL values.
Or you could rename the intarray operator @> to disambiguate. (I would not do that. Upgrade and portability issues ensue.)

对于涉及至少一个类型整数的操作数的表达式[ ] ,Postgres知道要选择哪个运算符:intarray运算符。但是 索引不适用 ,因为intarray运算符只能在整数上运行( int4 )不是 int2 。索引严格绑定到运营商:

For expressions involving at least one operand of type integer[], Postgres knows which operator to pick: the intarray operator. But then the index is not applicable, because the intarray operator only operates on integer (int4) not int2. And indexes are strictly bound to operators:

  • Can PostgreSQL index array columns?
  • PostgreSQL behavior in presence of two different type of indexes on the same column

但对于 int2 [] @> int2 [] ,Postgres无法决定最佳运营商。两者似乎同样适用。由于默认运算符在 pg_catalog 架构中提供,而intarray运算符在 public 架构中提供(默认情况下 - 或者无论您在何处安装扩展程序),您都可以通过使用 OPERATOR()构造。相关:

But for int2[] @> int2[], Postgres is unable to decide the best operator. Both seem equally applicable. Since the default operator is provided in the pg_catalog schema and the intarray operator is provided in the public schema (by default - or wherever you installed the extension), you can help solve the conundrum by schema-qualifying the operator with the OPERATOR() construct. Related:

  • Compare arrays for equality, ignoring order of elements

错误你得到的消息有点误导。但如果仔细观察,会添加一个 HINT 行,提示(tada!)方向正确:

The error message you get is a bit misleading. But if you look closely, there is a HINT line added which hints (tada!) in the right direction:


ERROR:  operator is not unique: smallint[] @> smallint[]
LINE 1: SELECT NULL::int2[] @> NULL::int2[]
                            ^
HINT:  Could not choose a best candidate operator. You might need to add explicit type casts.


您可以调查的现有运营商候选人@> with:

You can investigate existing operator candidates for @> with:

SELECT o.oid, *, oprleft::regtype, oprright::regtype, n.nspname
FROM   pg_operator o
JOIN   pg_namespace n ON n.oid = o.oprnamespace
WHERE  oprname = '@>';

另一个替代解决方案是临时(!)设置不同的search_path,因此只有所需的运算符是找到。在同一笔交易中:

Another alternative solution would be to temporarily(!) set a different search_path, so only the desired operator is found. In the same transaction:

SET LOCAL search_path = pg_catalog;
SELECT ...

但是你必须对查询中的所有表进行模式限定。

But then you have to schema-qualify all tables in the query.

关于演员表:

  • Generate series of dates - using date type as input

可以更改 castcontext > int2 - > int4 。但我强烈建议不要这样做。太多可能的副作用:

You could change the castcontext of int2 -> int4. But I strongly advise against it. Too many possible side effects:

  • Is there any way to cast postgresql 9.3 data type so that it can affect only one side

这篇关于未使用smallint []列的GIN索引或错误“运算符不唯一”的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆