使用更大的运算符在数组中进行Postgres jsonb搜索(使用jsonb_array_elements) [英] Postgres jsonb search in array with greater operator (with jsonb_array_elements)

查看:67
本文介绍了使用更大的运算符在数组中进行Postgres jsonb搜索(使用jsonb_array_elements)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我尝试搜索解决方案,但没有找到适合我的情况...

I try to search a solution but I didn't find anything for my case...

这是数据库声明(简体):

Here is the database declaration (simplified):

CREATE TABLE documents (
    document_id int4 NOT NULL GENERATED BY DEFAULT AS IDENTITY,
    data_block jsonb NULL
);

这是插入的示例.

INSERT INTO documents (document_id, data_block)
VALUES(878979, 
    {"COMMONS": {"DATE": {"value": "2017-03-11"}},
     "PAYABLE_INVOICE_LINES": [
         {"AMOUNT": {"value": 52408.53}}, 
         {"AMOUNT": {"value": 654.23}}
     ]});
INSERT INTO documents (document_id, data_block)
VALUES(977656, 
    {"COMMONS": {"DATE": {"value": "2018-03-11"}},
     "PAYABLE_INVOICE_LINES": [
         {"AMOUNT": {"value": 555.10}}
     ]});

我要搜索PAYABLE_INVOICE_LINES之一所在行的值大于1000.00的所有文档

I want to search all documents where one of the PAYABLE_INVOICE_LINES has a line with a value greater than 1000.00

我的查询是

select *
from documents d
cross join lateral jsonb_array_elements(d.data_block -> 'PAYABLE_INVOICE_LINES') as pil 
where (pil->'AMOUNT'->>'value')::decimal >= 1000

但是,由于要限制为50个文档,我必须对document_id进行分组,并将结果限制为50个.

But, as I want to limit to 50 documents, I have to group on the document_id and limit the result to 50.

具有数百万个文档,此查询非常昂贵...一百万个内容需要10秒.

With millions of documents, this query is very expensive... 10 seconds with 1 million.

您有一些想法可以提高性能吗?

Do you have some ideas to have better performance ?

谢谢

推荐答案

代替cross join lateral使用where exists:

select *
from documents d
where exists (
  select 1
  from jsonb_array_elements(d.data_block -> 'PAYABLE_INVOICE_LINES') as pil
  where (pil->'AMOUNT'->>'value')::decimal >= 1000)
limit 50;


更新

还有另一种方法,不仅更复杂,而且效率更高.

And yet another method, more complex but also much more efficient.

创建从您的JSONB数据返回最大值的函数,如下所示:

Create function that returns max value from your JSONB data, like this:

create function fn_get_max_PAYABLE_INVOICE_LINES_value(JSONB) returns decimal language sql as $$
  select max((pil->'AMOUNT'->>'value')::decimal)
  from jsonb_array_elements($1 -> 'PAYABLE_INVOICE_LINES') as pil $$

在此函数上创建索引:

create index idx_max_PAYABLE_INVOICE_LINES_value
  on documents(fn_get_max_PAYABLE_INVOICE_LINES_value(data_block));

在查询中使用功能:

select *
from documents d
where fn_get_max_PAYABLE_INVOICE_LINES_value(data_block) > 1000
limit 50;

在这种情况下,将使用索引,并且对大量数据的查询将更快.

In this case the index will be used and query will be much faster on large amount of data.

PS:通常limitorder by配对是有意义的.

PS: Usually limit have sense in pair with order by.

这篇关于使用更大的运算符在数组中进行Postgres jsonb搜索(使用jsonb_array_elements)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆