Index for finding an element in a JSON array

Question

I have a table that looks like this:

CREATE TABLE tracks (id SERIAL, artists JSON);

INSERT INTO tracks (id, artists) 
  VALUES (1, '[{"name": "blink-182"}]');

INSERT INTO tracks (id, artists) 
  VALUES (2, '[{"name": "The Dirty Heads"}, {"name": "Louis Richards"}]');

There are several other columns that aren't relevant to this question. There's a reason to have them stored as JSON.

What I'm trying to do is look up a track that has a specific artist name (exact match).

I'm using this query:

SELECT * FROM tracks 
  WHERE 'ARTIST NAME' IN
    (SELECT value->>'name' FROM json_array_elements(artists))

For example:

SELECT * FROM tracks
  WHERE 'The Dirty Heads' IN 
    (SELECT value->>'name' FROM json_array_elements(artists))

However, this does a full table scan, and it isn't very fast. I tried creating a GIN index using a function names_as_array(artists) and querying with 'ARTIST NAME' = ANY (names_as_array(artists)), but the index isn't used and the query is actually significantly slower.

Recommended answer

jsonb in Postgres 9.4+

The binary JSON data type jsonb greatly improves indexing options. You can now have a GIN index on a jsonb array directly:

CREATE TABLE tracks (id serial, artists jsonb);  -- !
CREATE INDEX tracks_artists_gin_idx ON tracks USING gin (artists);

No need for a function to convert the array. This would support a query:

SELECT * FROM tracks WHERE artists @> '[{"name": "The Dirty Heads"}]';

@> being the jsonb "contains" operator, which can use the GIN index. (Not for json, only jsonb!)
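
To check whether the planner actually picks up the GIN index, you can inspect the query plan. This is only a minimal sketch, assuming the tracks table and the tracks_artists_gin_idx index defined above. On a very small table Postgres may still prefer a sequential scan, so the enable_seqscan setting below is a session-local testing aid, not something to leave switched off.

```sql
-- Testing aid only: discourage sequential scans for this session
SET enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM tracks WHERE artists @> '[{"name": "The Dirty Heads"}]';

-- Restore the default planner behavior
RESET enable_seqscan;
```

If the index is used, the plan should contain a Bitmap Index Scan on tracks_artists_gin_idx rather than a Seq Scan on tracks.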

Or use the more specialized, non-default GIN operator class jsonb_path_ops for the index:

CREATE INDEX tracks_artists_gin_idx ON tracks
USING  gin (artists jsonb_path_ops);  -- !

Same query.

When this was written, jsonb_path_ops only supported the @> operator (Postgres 12 and later also support @? and @@ with it). But the resulting index is typically much smaller and faster. There are more index options; see the manual for details.
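
To see the size difference on your own data, you could build both variants side by side and compare them. A rough sketch, assuming the tracks table with the artists jsonb column from above; the two index names are made up for this comparison and are not part of the original answer.

```sql
-- Hypothetical index names, chosen only so both variants can coexist
CREATE INDEX tracks_artists_default_idx ON tracks USING gin (artists);
CREATE INDEX tracks_artists_pathops_idx ON tracks USING gin (artists jsonb_path_ops);

-- Compare the on-disk size of the two indexes
SELECT c.relname, pg_size_pretty(pg_relation_size(c.oid)) AS index_size
FROM   pg_class c
WHERE  c.relname IN ('tracks_artists_default_idx', 'tracks_artists_pathops_idx');
```

Drop whichever index you don't need afterwards, since each extra index adds write overhead.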

If the column artists only holds names as displayed in the example, it would be more efficient to store just the values as JSON text primitives and the redundant key can be the column name.

Note the difference between JSON objects and primitive types:

CREATE TABLE tracks (id serial, artistnames jsonb);
INSERT INTO tracks  VALUES (2, '["The Dirty Heads", "Louis Richards"]');

CREATE INDEX tracks_artistnames_gin_idx ON tracks USING gin (artistnames);

Query:

SELECT * FROM tracks WHERE artistnames ? 'The Dirty Heads';

? does not work for object values, just keys and array elements.
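
A few standalone expressions may make that distinction concrete. They use literal values only, so they can be run without any table:

```sql
SELECT '["a", "b"]'::jsonb      ? 'a';     -- true:  string element of a top-level array
SELECT '{"name": "a"}'::jsonb   ? 'name';  -- true:  top-level object key
SELECT '{"name": "a"}'::jsonb   ? 'a';     -- false: 'a' is an object value, not a key
SELECT '[{"name": "a"}]'::jsonb ? 'a';     -- false: ? only looks at the top level
```

The last line is why ? does not help with the original array-of-objects layout, while @> does.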

Or:

CREATE INDEX tracks_artistnames_gin_idx ON tracks
USING  gin (artistnames jsonb_path_ops);

Query:

SELECT * FROM tracks WHERE artistnames @> '"The Dirty Heads"'::jsonb;

More efficient if names are highly duplicative.

For the plain json type (as in the question), this should work with an IMMUTABLE function:

CREATE OR REPLACE FUNCTION json2arr(_j json, _key text)
  RETURNS text[] LANGUAGE sql IMMUTABLE AS
'SELECT ARRAY(SELECT elem->>_key FROM json_array_elements(_j) elem)';

Create this functional index:

CREATE INDEX tracks_artists_gin_idx ON tracks
USING  gin (json2arr(artists, 'name'));

And use a query like this. The expression in the WHERE clause has to match the one in the index:

SELECT * FROM tracks
WHERE  '{"The Dirty Heads"}'::text[] <@ (json2arr(artists, 'name'));

Updated with feedback in comments. We need to use array operators to support the GIN index.
The "is contained by" operator <@ in this case.
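
The default GIN operator class for arrays also supports the "contains" (@>) and "overlap" (&&) operators, so the same functional index can serve related lookups. A short sketch, assuming the json2arr() function and the functional index defined above:

```sql
-- Same test as above, written with "contains" and the operands swapped
SELECT * FROM tracks
WHERE  json2arr(artists, 'name') @> '{"The Dirty Heads"}'::text[];

-- Tracks featuring any of several artists ("overlap")
SELECT * FROM tracks
WHERE  json2arr(artists, 'name') && '{"The Dirty Heads", "blink-182"}'::text[];
```

In both cases the indexed expression json2arr(artists, 'name') has to appear exactly as it does in the index definition.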

You can declare your function IMMUTABLE even if json_array_elements() wasn't.
Most JSON functions used to be only STABLE, not IMMUTABLE. There was a discussion on the hackers list to change that. Most are IMMUTABLE now. Check with:

-- provolatile: i = IMMUTABLE, s = STABLE, v = VOLATILE
SELECT p.proname, p.provolatile
FROM   pg_proc p
JOIN   pg_namespace n ON n.oid = p.pronamespace
WHERE  n.nspname = 'pg_catalog'
AND    p.proname ~~* '%json%';  -- ~~* is the operator form of ILIKE

Functional indexes only work with IMMUTABLE functions.
