PostgreSQL Index on JSON


Question

Using Postgres 9.4, I want to create an index on a json column that will be used when searching on specific keys within the column.

For example, I have a 'farm' table with a json column 'animals'.

The animals column has json objects of the general format:

'{"cow": 2, "chicken": 11, "horse": 3}'

I have tried a number of indexes (separately):

  1. CREATE INDEX animal_index ON farm ((animal ->> 'cow'));
  2. CREATE INDEX animal_index ON farm USING gin ((animal ->> 'cow'));
  3. CREATE INDEX animal_index ON farm USING gist ((animal ->> 'cow'));

I want to run queries like:

SELECT * FROM farm WHERE (animal ->> 'cow') > 3;

and have that query use the index.

When I run this query:

SELECT * FROM farm WHERE (animal ->> 'cow') is null;

then index (1) works, but I can't get any of the indexes to work for the inequality.

Is an index like this possible?

The farm table contains only ~5000 farms, but some of them contain hundreds of animals, and the queries simply take too long for my use case. An index like this is the only method I can think of for speeding this query up, but perhaps there is another option.

Recommended answer

Your other two indexes won't work simply because the ->> operator returns text, while you obviously have the jsonb gin operator classes in mind. Note that you only mention json, but you actually need jsonb for advanced indexing capabilities.
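To illustrate what the jsonb gin operator classes are meant for, here is a minimal sketch. It assumes the column can be converted to jsonb, and the index name farm_animal_gin is made up for the example. A GIN index on the whole document helps with key-existence and containment operators, not with range comparisons on a single extracted value.

```sql
-- Assumption: the column is currently json and can be converted to jsonb.
ALTER TABLE farm ALTER COLUMN animal TYPE jsonb USING animal::jsonb;

-- GIN index on the whole document (default operator class jsonb_ops).
-- The index name is hypothetical.
CREATE INDEX farm_animal_gin ON farm USING gin (animal);

-- Queries this kind of index can support:
SELECT * FROM farm WHERE animal ? 'cow';           -- top-level key exists
SELECT * FROM farm WHERE animal @> '{"cow": 2}';   -- document contains this key/value pair

-- Not covered by the GIN index: a range comparison on an extracted value.
-- SELECT * FROM farm WHERE (animal ->> 'cow')::int > 3;
```

For inequality searches on one key, a btree index on the extracted value is therefore the more natural fit.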

To work out the best indexing strategy, you'd have to define more closely which queries to cover. Are you only interested in cows? Or all animals / all tags? Which operators are possible? Does your JSON document also include non-animal keys? What to do with those? Do you want to include rows in the index where cows (or whatever) don't show up in the JSON document at all?

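Assuming:

  • We are only interested in cows at the first level of nesting.
  • The value is always a valid integer.
  • We are not interested in rows without cows.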

I suggest a functional btree index, much like you already have, but cast the value to integer. I don't suppose you'd want the comparison evaluated as text (where '2' is greater than '1111').

CREATE INDEX animal_index ON farm (((animal ->> 'cow')::int));  -- !

The extra set of parentheses is required for the cast shorthand to make the syntax for the index expression unambiguous.
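A short sketch of the parentheses rule, for comparison. The second index name, animal_index_cast, is hypothetical, and the CAST variant is an alternative spelling added here for illustration rather than something from the answer.

```sql
-- Rejected with a syntax error: the index expression ends at the closing
-- parenthesis, so the trailing ::int is unexpected.
-- CREATE INDEX animal_index ON farm ((animal ->> 'cow')::int);

-- Accepted: the whole cast expression sits inside its own parentheses.
CREATE INDEX animal_index ON farm (((animal ->> 'cow')::int));

-- Alternative spelling with standard CAST syntax, which is written like a
-- function call and should not need the extra parentheses (hypothetical name,
-- shown commented out because it would duplicate the index above):
-- CREATE INDEX animal_index_cast ON farm (CAST(animal ->> 'cow' AS int));
```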

Use the same expression in your queries to make Postgres realize the index is applicable:

SELECT * FROM farm WHERE (animal ->> 'cow')::int > 3;
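One way to check whether the planner picks up the index is to try it on a throwaway table. The sketch below assumes a jsonb column and uses made-up names (farm_demo, farm_demo_cow_idx) and generated sample data. The plan you get depends on data distribution and Postgres version, so treat it as an experiment rather than a guaranteed result.

```sql
-- Hypothetical demo table mirroring the question (jsonb assumed).
CREATE TABLE farm_demo (
    id     serial PRIMARY KEY,
    animal jsonb
);

-- 5000 rows of invented sample data; cow counts cycle through 0..99.
INSERT INTO farm_demo (animal)
SELECT format('{"cow": %s, "chicken": %s}', g % 100, g % 7)::jsonb
FROM   generate_series(1, 5000) AS g;

-- Same expression as in the answer, on the demo table.
CREATE INDEX farm_demo_cow_idx ON farm_demo (((animal ->> 'cow')::int));

-- Refresh planner statistics, including those for the index expression.
ANALYZE farm_demo;

-- A selective predicate that repeats the indexed expression exactly.
EXPLAIN ANALYZE
SELECT * FROM farm_demo WHERE (animal ->> 'cow')::int > 97;
```

If the plan still shows a sequential scan, the predicate may simply not be selective enough for the index to pay off on a table this small.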

If you need a more generic jsonb index, consider:

For a known, static, trivial number of animals (like you commented), I suggest partial indexes like:

-- Index names must be unique within a schema, so each index gets its own name.
CREATE INDEX animal_cow_index ON farm (((animal ->> 'cow')::int))
WHERE (animal ->> 'cow') IS NOT NULL;

CREATE INDEX animal_chicken_index ON farm (((animal ->> 'chicken')::int))
WHERE (animal ->> 'chicken') IS NOT NULL;

You may have to add the index condition to the query:

SELECT * FROM farm
WHERE (animal ->> 'cow')::int > 3
AND   (animal ->> 'cow') IS NOT NULL; 

May seem redundant, but may be necessary. Test with EXPLAIN ANALYZE!
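A simple way to run that test is to compare the plans of both query forms side by side. This sketch assumes the partial index on 'cow' from above exists. Whether the first form can use it depends on whether your Postgres version can infer the IS NOT NULL condition from the comparison, which is exactly what the output should tell you.

```sql
-- Without repeating the index predicate:
EXPLAIN ANALYZE
SELECT * FROM farm
WHERE  (animal ->> 'cow')::int > 3;

-- With the index predicate repeated verbatim:
EXPLAIN ANALYZE
SELECT * FROM farm
WHERE  (animal ->> 'cow')::int > 3
AND    (animal ->> 'cow') IS NOT NULL;
```

If only the second plan mentions the partial index, keep the extra condition in your queries.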
