Rails - Active Record: Find all records which have a count on has_many association with certain attributes

Question

A user has many identities.

class User < ActiveRecord::Base
    has_many :identities
end

class Identity < ActiveRecord::Base
    belongs_to :user
end

An identity has a confirmed:boolean column. I'd like to query all users that have exactly ONE identity. That identity must also have confirmed set to false.

I have tried this:

User.joins(:identities).group("users.id").having( 'count(user_id) = 1').where(identities: { confirmed: false })

But this returns users with one identity that has confirmed:false, even if they also have additional identities that are confirmed. I only want users whose single identity has confirmed:false, with no additional identities whose confirmed attribute is true.
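
To illustrate why the attempted query over-matches, here is a minimal plain-Ruby sketch that needs no database. The rows and both method names are hypothetical, made up for illustration. The point is that the WHERE filter runs before the grouping, so the count only ever sees unconfirmed rows.

```ruby
# Hypothetical in-memory rows standing in for the identities table.
IDENTITIES = [
  { user_id: 1, confirmed: false }, # user 1: a single unconfirmed identity
  { user_id: 2, confirmed: false }, # user 2: one unconfirmed identity ...
  { user_id: 2, confirmed: true },  #         ... plus one confirmed identity
  { user_id: 3, confirmed: true }   # user 3: a single confirmed identity
].freeze

# Mirrors the attempted query: the WHERE filter is applied first, then
# GROUP BY / HAVING count = 1 only sees the rows that survived it.
def filter_then_count(rows)
  rows.reject { |row| row[:confirmed] }
      .group_by { |row| row[:user_id] }
      .select { |_, group| group.size == 1 }
      .keys
end

# Mirrors the intended result: count every identity per user, then
# require that the single identity is unconfirmed.
def count_then_filter(rows)
  rows.group_by { |row| row[:user_id] }
      .select { |_, group| group.size == 1 && group.none? { |row| row[:confirmed] } }
      .keys
end

p filter_then_count(IDENTITIES) # includes user 2 despite its confirmed identity
p count_then_filter(IDENTITIES) # only user 1
```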

I've also tried this, but it's slow, and I'm looking for the right SQL to do this in a single query.

  def self.new_users
    users = User.joins(:identities).where(identities: { confirmed: false })
    users.select { |user| user.identities.count == 1 }
  end

Apologies upfront if this was already answered but I could not find a similar post.

Recommended answer

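One solution is to use a Rails nested query (a subquery). The unconfirmed scope on Identity is not shown here; judging by the generated SQL below, it is presumably equivalent to where(confirmed: false).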

User.joins(:identities).where(id: Identity.select(:user_id).unconfirmed).group("users.id").having( 'count(user_id) = 1')

And here's the SQL generated by the query:

SELECT "users".* FROM "users"
INNER JOIN "identities" ON "identities"."user_id" = "users"."id"
WHERE "users"."id" IN (SELECT "identities"."user_id" FROM "identities"  WHERE "identities"."confirmed" = 'f')
GROUP BY users.id HAVING count(user_id) = 1

I still don't think this is the most efficient way. While I'm able to generate only one SQL query (meaning only one network call to the db), I still have to do two scans: one scan on the users table and one scan on the identities table. This can be optimized by indexing the identities.confirmed column, but that still doesn't solve the problem of the two full scans.

For those who understand query plans, here it is (note the sequential scans on users and on identities, the latter read once for the join and once for the subquery):

     QUERY PLAN
-------------------------------------------------------------------------------------------
 HashAggregate  (cost=32.96..33.09 rows=10 width=3149)
   Filter: (count(identities.user_id) = 1)
   ->  Hash Semi Join  (cost=21.59..32.91 rows=10 width=3149)
         Hash Cond: (identities.user_id = identities_1.user_id)
         ->  Hash Join  (cost=10.45..21.61 rows=20 width=3149)
               Hash Cond: (identities.user_id = users.id)
               ->  Seq Scan on identities  (cost=0.00..10.70 rows=70 width=4)
               ->  Hash  (cost=10.20..10.20 rows=20 width=3145)
                     ->  Seq Scan on users  (cost=0.00..10.20 rows=20 width=3145)
         ->  Hash  (cost=10.70..10.70 rows=35 width=4)
               ->  Seq Scan on identities identities_1  (cost=0.00..10.70 rows=35 width=4)
                     Filter: (NOT confirmed)
(12 rows)
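
As a possible alternative to the subquery, the check could be folded into the HAVING clause with a conditional aggregate, so that identities is joined only once. This is a sketch and not part of the original answer: the module name and constant are hypothetical, the SQL fragment assumes PostgreSQL boolean syntax, and it has not been benchmarked against the query above.

```ruby
# Sketch only: SingleUnconfirmedIdentity is a hypothetical helper name.
module SingleUnconfirmedIdentity
  # Exactly one identity row per user, and that row has confirmed = false.
  # Assumes PostgreSQL, where a boolean can be compared with FALSE.
  HAVING_SQL = "COUNT(*) = 1 AND " \
               "SUM(CASE WHEN identities.confirmed = FALSE THEN 1 ELSE 0 END) = 1"

  # `relation` is expected to be User or any relation built on User.
  def self.apply(relation)
    relation.joins(:identities).group("users.id").having(HAVING_SQL)
  end
end

# In a Rails app this would be called as:
#   SingleUnconfirmedIdentity.apply(User)
```

Whether this actually avoids the second pass over identities depends on the planner and the data, so it is worth comparing the EXPLAIN output of both versions.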
