PostgreSQL 全部在数组中 [英] PostgreSQL where all in array

查看:22
本文介绍了PostgreSQL 全部在数组中的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

What is the easiest and fastest way to achieve a clause where all elements in an array must be matched - not only one when using IN? After all it should behave like mongodb's $all.

Thinking about group conversations where conversation_users is a join table between conversation_id and user_id I have something like this in mind:

WHERE (conversations_users.user_id ALL IN (1,2))


UPDATE 16.07.12

Adding more info about schema and case:

  1. The join-table is rather simple:

                  Table "public.conversations_users"
         Column      |  Type   | Modifiers | Storage | Description 
    -----------------+---------+-----------+---------+-------------
     conversation_id | integer |           | plain   | 
     user_id         | integer |           | plain   | 
    

  2. A conversation has many users and a user belongs to many conversations. In order to find all users in a conversation I am using this join table.

  3. In the end I am trying to figure out a ruby on rails scope that find's me a conversation depending on it's participants - e.g.:

    scope :between, ->(*users) {
      joins(:users).where('conversations_users.user_id all in (?)', users.map(&:id))
    }
    

UPDATE 23.07.12

My question is about finding an exact match of people. Therefore:

Conversation between (1,2,3) won't match if querying for (1,2)

解决方案

Assuming the join table follows good practice and has a unique compound key defined, i.e. a constraint to prevent duplicate rows, then something like the following simple query should do.

select conversation_id from conversations_users where user_id in (1, 2)
group by conversation_id having count(*) = 2

It's important to note that the number 2 at the end is the length of the list of user_ids. That obviously needs to change if the user_id list changes length. If you can't assume your join table doesn't contain duplicates, change "count(*)" to "count(distinct user_id)" at some possible cost in performance.

This query finds all conversations that include all the specified users even if the conversation also includes additional users.

If you want only conversations with exactly the specified set of users, one approach is to use a nested subquery in the where clause as below. Note, first and last lines are the same as the original query, only the middle two lines are new.

select conversation_id from conversations_users where user_id in (1, 2)
   and conversation_id not in
   (select conversation_id from conversations_users where user_id not in (1,2))
group by conversation_id having count(*) = 2

Equivalently, you can use a set difference operator if your database supports it. Here is an example in Oracle syntax. (For Postgres or DB2, change the keyword "minus" to "except.)

select conversation_id from conversations_users where user_id in (1, 2)
  group by conversation_id having count(*) = 2
minus
  select conversation_id from conversations_users where user_id not in (1,2)

A good query optimizer should treat the last two variations identically, but check with your particular database to be sure. For example, the Oracle 11GR2 query plan sorts the two sets of conversation ids before applying the minus operator, but skips the sort step for the last query. So either query plan could be faster depending on multiple factors such as the number of rows, cores, cache, indices etc.

这篇关于PostgreSQL 全部在数组中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆