使用JSONB列内的值联接表 [英] Join tables using a value inside a JSONB column

查看:85
本文介绍了使用JSONB列内的值联接表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有两个表:

授权联系人(auth_contacts):

(
userid varchar
contacts jsonb
)

contacts包含属性为{contact_id, type}

discussion:

(
contact_id varchar
discussion_id varchar
discussion_details jsonb
)

auth_contacts至少具有10万条记录,使其不适合使用JSONB类型,因为它会使记录数量增加一倍或三倍.

The table auth_contacts has at least 100k records making it non JSONB type is not appropriate according as it would double or triple the amount of records.

auth_contacts的样本数据:

userid  | contacts
'11111' | '{"contact": [{"type": "type_a", "contact_id": "1-A-12"}
                      , {"type": "type_b", "contact_id": "1-A-13"}]}'

discussion表具有500万个奇数记录.

discussion table has 5 million odd records.

我想在discussion.contact_id(关系列)上加入联系人ID,该联系人ID是auth_contacts.contacts中json对象数组内的json对象.

I want to join on discussion.contact_id (relational column) with contact id which a json object inside array of json objects in auth_contacts.contacts.

一种非常粗略的方法是:

One very crude way is:

SELECT *
FROM discussion d 
JOIN (SELECT userid, JSONB_OBJECT_KEYS(a.contacts) AS auth_contact
      FROM auth_contacts a) AS contacts
      ON (d.contact_id = contacts.auth_contact::text)

这实际上是在运行时创建(内部sql)userid vs contact id表(这是我要避免的,因此使用JSONB数据类型 对于具有大量记录的用户的此查询需要花费26 +秒的时间,这并不是一件好事. 尝试了其他几种方法: PostgreSQL 9.4 :在数组内的JSON字段ID上聚集/联接表

What this does is actually at runtime create (inner sql) userid vs contact id table (Which is what I was avoiding and hence went for JSONB data type This query for a user with large records takes 26 + seconds which is not all good. Tried a few other ways: PostgreSQL 9.4: Aggregate / Join table on JSON field id inside array

但是应该有一种更清洁,更好的方法,就像 联接d.contact_id = contacts -> contact -> contact_id? 当我尝试此操作时,不会产生任何结果.

But there should be a cleaner and better way which would be as simple as JOIN d.contact_id = contacts -> contact -> contact_id? When I try this, it doesn't yield any results.

在网上搜索时,这似乎是一项繁琐的工作?

When searching the net this seems to be a pretty cumbersome task?

推荐答案

概念证明

您的粗略方法"实际上是行不通的.这是另一种粗略的方法:

Proof of concept

Your "crude way" doesn't actually work. Here is another crude way that does:

SELECT *
FROM  auth_contacts a
    , jsonb_to_recordset(a.contacts->'contact') AS c(contact_id text)
JOIN  discussion d USING (contact_id);

如前所述,您还可以使用

As has been commented, you can also formulate a join condition with the contains operator @>:

SELECT *
FROM   auth_contacts a
JOIN   discussion d ON a.contacts->'contact'
                    @> json_build_array(json_build_object('contact_id', d.contact_id))::jsonb

但是要使用JSON创建功能,而不要使用字符串连接.看起来很麻烦,但如果受功能性jsonb_path_ops GIN 索引支持,实际上会非常快:

But rather use JSON creation functions than string concatenation. Looks cumbersome but will actually be very fast if supported with a functional jsonb_path_ops GIN index:

CREATE INDEX auth_contacts_contacts_gin_idx ON auth_contacts
USING  gin ((contacts->'contact') jsonb_path_ops);

详细信息:

  • Index for finding an element in a JSON array
  • Postgres 9.4 jsonb array as table

这一切都很有趣,但是这里的问题是关系模型.您的索赔:

This is all fascinating to play with, but the problem here is the relational model. Your claim:

因此使其不为JSONB类型是不合适的,因为它 会使记录数量翻倍或翻三倍.

hence making it non JSONB type is not appropriate according as it would double or triple the amount of records.

与正确的相反.包装将表联接到JSON文档类型所需的ID是废话.通过多对多关系规范化表,并将在数据库内部使用的所有ID分别实现为具有适当数据类型的单独列.基础:

is the opposite of what's right. It's nonsense to wrap IDs you need for joining tables into a JSON document type. Normalize your table with a many-to-many relationship and implement all IDs you are working with inside the DB as separate columns with appropriate data type. Basics:

  • How to perform update operations on columns of type JSONB in Postgres 9.4
  • How to implement a many-to-many relationship in PostgreSQL?

这篇关于使用JSONB列内的值联接表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆