Find ID of parent where all children exactly match


Question

Let's suppose we have a set of database tables that represent four key concepts:

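  1. Entity Types (e.g. account, client, etc.)
  2. Entities (i.e. instances of the above Entity Types)
  3. Cohorts (named groups)
  4. Cohort Members (the Entities that make up the membership of a Cohort)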

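The rules around Cohorts are:

  1. A Cohort always has at least one Cohort Member.
  2. Cohort Members must be unique within a Cohort (i.e. Entity 5 cannot be a member of Cohort 3 twice, though it could be a member of both Cohort 3 and Cohort 4).
  3. No two Cohorts will ever have exactly the same membership, though one Cohort may legitimately be a subset of another.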

The rules around Entities are:

  1. No two Entities may have the same value pair (business_key, entity_type_id).
  2. Two Entities with different entity_type_id values may share a business_key.

Because pictures tell a thousand lines of code, here is the ERD:

I want a SQL query that, when provided a collection of (business_key, entity_type_id) pairs, will search for a Cohort that matches exactly, returning one row with just the cohort_id if that Cohort exists, and zero rows otherwise.

That is, if the set of Entities matches entity_ids 1 and 2, the query should only return a cohort_id whose cohort_members are exactly 1 and 2: not just 1, not just 2, and not a cohort with entity_ids 1, 2 and 3. If no cohort satisfies this, zero rows are returned.
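The requirement amounts to set equality between the requested entities and a cohort's members. A minimal Python sketch of that semantics follows; the `cohorts` dictionary and the `find_exact_cohort` helper are hypothetical stand-ins for the tables, not part of the fiddle.

```python
# Hypothetical in-memory model: cohort_id -> set of member entity_ids.
cohorts = {
    10: {1},
    11: {2},
    12: {1, 2},
    13: {1, 2, 3},
}


def find_exact_cohort(cohorts, wanted_entity_ids):
    """Return a list holding the matching cohort_id, or an empty list."""
    wanted = set(wanted_entity_ids)
    # Set equality rules out both subsets ({1}, {2}) and supersets ({1, 2, 3}).
    return [cid for cid, members in cohorts.items() if members == wanted]
```

Because no two cohorts may share the same membership, the returned list holds at most one cohort_id.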

To help people address the question, I have created a fiddle of the tables along with some data that defines various Entity Types, Entities, and Cohorts. There is also a table with test data for matching, named test_cohort. It contains 6 test cohorts that cover various scenarios. The first 5 tests should each match exactly one cohort. The 6th test is a bogus one that exercises the zero-row case. When using the test table, only one line of the associated INSERT statement should be uncommented (see the fiddle, which is initially set up that way):

http://sqlfiddle.com/#!18/2d022

My attempt in SQL is the following, though it fails tests #2 and #4 (which can be found in the fiddle):

SELECT actual_cohort_member.cohort_id
FROM test_cohort
INNER JOIN entity
    ON entity.business_key = test_cohort.business_key
    AND entity.entity_type_id = test_cohort.entity_type_id
INNER JOIN cohort_member AS existing_potential_member
    ON existing_potential_member.entity_id = entity.entity_id
INNER JOIN cohort
    ON cohort.cohort_id = existing_potential_member.cohort_id
RIGHT OUTER JOIN cohort_member AS actual_cohort_member
    ON actual_cohort_member.cohort_id = cohort.cohort_id
    AND actual_cohort_member.cohort_id = existing_potential_member.cohort_id
    AND actual_cohort_member.entity_id = existing_potential_member.entity_id
GROUP BY actual_cohort_member.cohort_id
HAVING
    SUM(CASE WHEN
        actual_cohort_member.cohort_id = existing_potential_member.cohort_id AND
        actual_cohort_member.entity_id = existing_potential_member.entity_id THEN 1 ELSE 0
    END) = COUNT(*)
;

Recommended answer

This can be achieved by adding a compound condition to the WHERE clause, since you are comparing against pairs of values. You then have to check two counts per cohort_id: the number of rows that satisfy the WHERE conditions, and the cohort's total number of member rows. Both must equal the number of pairs supplied.

SELECT  c.cohort_id
FROM    cohort c
        INNER JOIN cohort_member cm
            ON c.cohort_id = cm.cohort_id
        INNER JOIN entity e
            ON cm.entity_id = e.entity_id
WHERE   (e.entity_type_id = 1 AND e.business_key = 'acc1')      -- one condition per (business_key, entity_type_id) pair
         OR (e.entity_type_id = 1 AND e.business_key = 'acc2')
GROUP   BY c.cohort_id
HAVING  COUNT(*) = 2                                            -- matched members must equal the number of conditions
        AND (SELECT COUNT(*) 
             FROM cohort_member cm2 
             WHERE cm2.cohort_id = c.cohort_id) = 2             -- total members must also equal the number of conditions

  • Test Case #1
  • Test Case #2
  • Test Case #3
  • Test Case #4
  • Test Case #5
  • Test Case #6

As you can see in the test cases above, the value in the filter depends on the number of conditions in the WHERE clause. It would be advisable to build a dynamic query for this.
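One way such a dynamic query could be assembled is sketched below in Python with the standard sqlite3 module. This is an illustration under assumptions, not the answer's own code: the table layout and sample rows are guessed from the queries in this thread, the fiddle itself targets a different database engine, and `build_exact_match_query` and `find_cohort` are hypothetical helper names.

```python
import sqlite3


def build_exact_match_query(pairs):
    """Return (sql, params) for a collection of (business_key, entity_type_id) pairs."""
    pairs = sorted(set(pairs))  # duplicates would inflate the expected count
    if not pairs:
        raise ValueError("at least one pair is required")
    # One parenthesised condition per pair, combined with OR.
    conditions = " OR ".join(
        "(e.entity_type_id = ? AND e.business_key = ?)" for _ in pairs
    )
    sql = f"""
        SELECT c.cohort_id
        FROM cohort c
        INNER JOIN cohort_member cm ON c.cohort_id = cm.cohort_id
        INNER JOIN entity e ON cm.entity_id = e.entity_id
        WHERE {conditions}
        GROUP BY c.cohort_id
        HAVING COUNT(*) = ?
           AND (SELECT COUNT(*) FROM cohort_member cm2
                WHERE cm2.cohort_id = c.cohort_id) = ?
    """
    params = []
    for business_key, entity_type_id in pairs:
        params += [entity_type_id, business_key]
    params += [len(pairs), len(pairs)]  # both counts must equal the pair count
    return sql, params


# Assumed schema and sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (entity_id INTEGER PRIMARY KEY,
                         entity_type_id INTEGER, business_key TEXT);
    CREATE TABLE cohort (cohort_id INTEGER PRIMARY KEY);
    CREATE TABLE cohort_member (cohort_id INTEGER, entity_id INTEGER);
    INSERT INTO entity VALUES (1, 1, 'acc1'), (2, 1, 'acc2'), (3, 1, 'acc3');
    INSERT INTO cohort VALUES (1), (2), (3);
    INSERT INTO cohort_member VALUES (1, 1), (2, 1), (2, 2),
                                     (3, 1), (3, 2), (3, 3);
""")


def find_cohort(conn, pairs):
    """Run the generated query and return the matching cohort_ids as a list."""
    sql, params = build_exact_match_query(pairs)
    return [row[0] for row in conn.execute(sql, params)]
```

The values are passed as bound parameters rather than concatenated into the SQL text, so only the number of placeholders varies with the input.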

UPDATE

If the table test_cohort contains only one scenario, then the query below will satisfy your requirement. However, if test_cohort contains a list of scenarios, you might want to look at the other answer, since this solution does not alter any table schema.

    SELECT  c.cohort_id
    FROM    cohort c
            INNER JOIN cohort_member cm
                ON c.cohort_id = cm.cohort_id
            INNER JOIN entity e
                ON cm.entity_id = e.entity_id
            INNER JOIN test_cohort tc
                ON tc.business_key = e.business_key
                    AND tc.entity_type_id = e.entity_type_id
    GROUP   BY c.cohort_id
    HAVING  COUNT(*) = (SELECT COUNT(*) FROM test_cohort)
            AND (SELECT COUNT(*) 
                 FROM cohort_member cm2 
                 WHERE cm2.cohort_id = c.cohort_id) = (SELECT COUNT(*) FROM test_cohort)
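
The test_cohort-driven query can be tried end to end with a small harness. The sketch below uses Python's sqlite3 module with an assumed schema and made-up sample rows (the fiddle's actual DDL is not reproduced here), and `match_test_cohort` is a hypothetical helper. It also assumes test_cohort holds a single scenario with no duplicate rows, since duplicates would throw off both counts.

```python
import sqlite3

# The query from the update above, unchanged apart from layout.
QUERY = """
    SELECT c.cohort_id
    FROM cohort c
    INNER JOIN cohort_member cm ON c.cohort_id = cm.cohort_id
    INNER JOIN entity e ON cm.entity_id = e.entity_id
    INNER JOIN test_cohort tc
        ON tc.business_key = e.business_key
       AND tc.entity_type_id = e.entity_type_id
    GROUP BY c.cohort_id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM test_cohort)
       AND (SELECT COUNT(*) FROM cohort_member cm2
            WHERE cm2.cohort_id = c.cohort_id) = (SELECT COUNT(*) FROM test_cohort)
"""

# Assumed schema and sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (entity_id INTEGER PRIMARY KEY,
                         entity_type_id INTEGER, business_key TEXT);
    CREATE TABLE cohort (cohort_id INTEGER PRIMARY KEY);
    CREATE TABLE cohort_member (cohort_id INTEGER, entity_id INTEGER);
    CREATE TABLE test_cohort (business_key TEXT, entity_type_id INTEGER);
    INSERT INTO entity VALUES (1, 1, 'acc1'), (2, 1, 'acc2'), (3, 1, 'acc3');
    INSERT INTO cohort VALUES (1), (2), (3);
    INSERT INTO cohort_member VALUES (1, 1), (2, 1), (2, 2),
                                     (3, 1), (3, 2), (3, 3);
""")


def match_test_cohort(conn, pairs):
    """Load one scenario into test_cohort and return the matching cohort_ids."""
    conn.execute("DELETE FROM test_cohort")
    conn.executemany(
        "INSERT INTO test_cohort (business_key, entity_type_id) VALUES (?, ?)",
        pairs,
    )
    return [row[0] for row in conn.execute(QUERY)]
```

A scenario containing a pair that matches no entity returns an empty list, because the joined row count can then never reach the size of test_cohort.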
    
