Effectively searching through entire 1 level nested JSONB in Postgres

Problem description

Let's say we need to check whether a jsonb column contains a particular value, matching by substring against any of its values (non-nested, first level only).

How can one effectively optimize a query that searches an entire JSONB column (that is, every key) for a value?

Is there a good alternative to running ILIKE '%val%' on the jsonb data cast to text?

jsonb_each_text(jsonb_column) ILIKE '%val%'
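
The expression above is shorthand: jsonb_each_text is a set-returning function, so it cannot be compared with ILIKE directly. A minimal sketch of what the query might look like in runnable form, assuming a hypothetical table named some_table with a jsonb column named jsonb_column:

```sql
-- some_table and jsonb_column are placeholder names, not from the original post.
-- jsonb_each_text expands the top-level keys into (key, value) rows of text,
-- so only first-level values are inspected.
SELECT t.*
FROM some_table AS t
WHERE EXISTS (
    SELECT 1
    FROM jsonb_each_text(t.jsonb_column) AS kv
    WHERE kv.value ILIKE '%val%'
);
```

This form cannot use an ordinary index, which is the inefficiency the question is about.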

As an example, consider this data:

SELECT
  '{
   "col1": "somevalue",
   "col2": 5.5,
   "col3": "2016-01-01",
   "col4": "othervalue",
   "col5": "yet_another_value"
  }'::JSONB;

How would you optimize such a query when you need to search for the pattern %val% in a jsonb column whose rows contain different sets of keys?

I'm aware that searching with a leading and a trailing % is inefficient, so I'm looking for a better way but am having a hard time finding one. Also, explicitly indexing every field within the json column is not an option, since the fields vary by record type and this would create a huge set of indexes (not every row has the same set of keys).

Question

Is there a better alternative to extracting each key-value pair as text and performing an ILIKE/POSIX search?

Recommended answer

If you know you will only need to query a few known keys, you can simply index those expressions.

Here is an overly simple but self-explanatory example:

create table foo as SELECT '{"col1": "somevalue", "col2": 5.5, "col3": "2016-01-01", "col4": "othervalue", "col5": "yet_another_value"}'::JSONB as bar;

create index pickfoo1 on foo ((bar #>> '{col1}'));
create index pickfoo2 on foo ((bar #>> '{col2}'));

This is the basic idea. These indexes are not useful for ilike queries, but you can do more depending on your needs.

For example, if you only need case-insensitive matching, it is sufficient to do:

-- Create indexes over the lowercased values.
-- Distinct names are used because pickfoo1 and pickfoo2 already exist from the previous example.
create index pickfoo1_lower on foo (lower(bar #>> '{col1}'));
create index pickfoo2_lower on foo (lower(bar #>> '{col2}'));

-- Check that it matches:
select * from foo where lower(bar #>> '{col1}') = lower('soMEvaLUe');

NOTE: This is only an example. If you run EXPLAIN on the previous select, you will see that Postgres actually performs a sequential scan instead of using the index. That is because we are testing on a table with a single row, which is not the usual case. I'm sure you can test it with a bigger table ;-)
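
One way to run that test might look like the sketch below. It assumes the foo table and the lower(...) expression index from the example already exist; the row count and the generated values are arbitrary choices made for illustration.

```sql
-- Fill foo with synthetic rows (100000 is an arbitrary size chosen for this sketch).
INSERT INTO foo (bar)
SELECT jsonb_build_object('col1', 'value_' || i, 'col2', i)
FROM generate_series(1, 100000) AS i;

-- Refresh planner statistics so the new table size is taken into account.
ANALYZE foo;

-- Inspect the plan for the case-insensitive equality lookup.
EXPLAIN
SELECT * FROM foo WHERE lower(bar #>> '{col1}') = lower('VALUE_42');
```

With this many rows the planner will usually prefer an index or bitmap scan over a sequential scan, although the exact plan depends on the Postgres version and configuration.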

With huge tables, even like queries should benefit from the index, as long as the first wildcard doesn't appear at the beginning of the string (this is not specific to jsonb; it is how btree indexes work in general).
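
One caveat worth checking: a plain btree index generally serves prefix LIKE patterns only when the database uses the C locale. With other locales, the text_pattern_ops operator class is the usual way to make the index usable for them. A minimal sketch, where the index name pickfoo1_prefix is hypothetical:

```sql
-- text_pattern_ops compares character by character, which is what lets a
-- btree index serve anchored patterns such as 'some%' under non-C locales.
CREATE INDEX pickfoo1_prefix ON foo ((bar #>> '{col1}') text_pattern_ops);

-- A left-anchored pattern can use the index; '%some%' could not.
SELECT * FROM foo WHERE bar #>> '{col1}' LIKE 'some%';
```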

If you need to optimize queries like:

select * from foo where bar #>> '{col1}' ilike '%MEvaL%';

...then you should consider using GIN or GiST indexes instead.
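
The answer does not say which operator class to use. One common option is the trigram support in the pg_trgm extension, sketched below under the assumption that the extension can be installed. The index names are hypothetical.

```sql
-- pg_trgm ships with the standard contrib modules.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Trigram GIN index on a single known key; supports LIKE and ILIKE
-- with wildcards on both sides.
CREATE INDEX pickfoo1_trgm ON foo USING gin ((bar #>> '{col1}') gin_trgm_ops);

SELECT * FROM foo WHERE bar #>> '{col1}' ILIKE '%MEvaL%';

-- When the set of keys varies per row, indexing the whole document as text
-- is one possibility. Note that this also matches key names and JSON
-- punctuation, not just values.
CREATE INDEX foo_bar_trgm ON foo USING gin ((bar::text) gin_trgm_ops);

SELECT * FROM foo WHERE bar::text ILIKE '%MEvaL%';
```

Trigram indexes can be large and help little with very short patterns, so it is worth measuring on real data before adopting this approach.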
