How to prune duplicate associations to yield a unique most-complete set

Views: 61 | Posted: 2015/11/30 21:16:15 | Tags: sql algorithm sql-server-2005 graph associations

Problem description

I hardly know how to state this question, let alone search for answers. But here's my best shot. Assume I have a table

Col1   Col2
-----+-----
 A   | 1
 A   | 2
 A   | 3
 A   | 4
 B   | 1
 B   | 2
 B   | 3
 C   | 1
 C   | 2
 C   | 3
 D   | 1

I want to find the subset of associations (rows) where:

There are no duplicates in Col1
There are no duplicates in Col2
Every value in Col1 is associated with a value in Col2

So the above example could yield this result

Col1   Col2
-----+-----
 A   | 4
 B   | 2
 C   | 3
 D   | 1
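
The three requirements can be checked mechanically. The sketch below is not SQL and is not part of my attempts; it is a small Python helper, with hypothetical names (`is_valid_subset`, `ROWS`, `RESULT`), that tests a candidate result against the table above.

```python
def is_valid_subset(rows, subset):
    """Return True if `subset` meets all three requirements against `rows`."""
    col1 = [a for a, _ in subset]
    col2 = [b for _, b in subset]
    return (
        set(subset) <= set(rows)                 # only existing associations are used
        and len(col1) == len(set(col1))          # no duplicates in Col1
        and len(col2) == len(set(col2))          # no duplicates in Col2
        and set(col1) == {a for a, _ in rows}    # every Col1 value is mapped
    )


# The example table from the question.
ROWS = [("A", 1), ("A", 2), ("A", 3), ("A", 4),
        ("B", 1), ("B", 2), ("B", 3),
        ("C", 1), ("C", 2), ("C", 3),
        ("D", 1)]

# The example result from the question.
RESULT = [("A", 4), ("B", 2), ("C", 3), ("D", 1)]
```

Calling `is_valid_subset(ROWS, RESULT)` accepts the result shown above, while a candidate that reuses a Col2 value (for example A-1 together with D-1) is rejected.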

Notice that A-4 must be in the result because there are 4 unique letters and 4 unique numbers, so if you don't associate A with 4, no remaining subset maps every value in Col1 while retaining the uniqueness of Col2.

Also, notice that it would be equally valid to replace B-2 and C-3 with B-3 and C-2. I don't care which subset is selected, but I want one that fulfills all the requirements.

Not every set of data will have a sub-set that fulfills all the requirements, but I want to get as close as possible.
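
If "as close as possible" is read as "keep as many pairs as possible" (an assumption on my part), the problem looks like maximum bipartite matching between the Col1 values and the Col2 values. The sketch below is plain Python rather than SQL, uses the standard augmenting-path approach (often called Kuhn's algorithm), and is only an illustration of the goal; the function name `max_matching` is hypothetical.

```python
from collections import defaultdict


def max_matching(rows):
    """Return a largest list of (col1, col2) pairs with no value repeated in either column."""
    neighbors = defaultdict(list)
    for left, right in rows:
        neighbors[left].append(right)

    owner = {}  # Col2 value -> Col1 value currently assigned to it

    def try_assign(left, visited):
        # Look for a Col2 value for `left`, reassigning earlier choices if needed.
        for right in neighbors[left]:
            if right in visited:
                continue
            visited.add(right)
            # Take a free Col2 value, or move its current owner to another one.
            if right not in owner or try_assign(owner[right], visited):
                owner[right] = left
                return True
        return False

    for left in list(neighbors):
        try_assign(left, set())

    return sorted((left, right) for right, left in owner.items())


rows = [("A", 1), ("A", 2), ("A", 3), ("A", 4),
        ("B", 1), ("B", 2), ("B", 3),
        ("C", 1), ("C", 2), ("C", 3),
        ("D", 1)]
pairs = max_matching(rows)
```

For the example table this yields four pairs, including the forced A-4 and D-1; which of the two B/C arrangements comes out depends on input order. When no complete subset exists (say, two letters that can only map to the same number), the result is simply smaller.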

I'm trying to do this with a SQL query. I had a query that seemed to accomplish this for one set of data, but then I had to rewrite it for a slightly different set (where Col2 is actually a pair of columns) and could not reproduce my earlier success. My first solution used Min() and Group By and a couple of Joins on aggregated results to mark duplicates for elimination in a loop until there was nothing left to safely eliminate. My more recent solution replaces the Group By queries with ROW_NUMBER() expressions that use PARTITION BY. But I can't figure out how to handle the cases where there are multiple valid result sets from multiply-cross-linked pairs like B and C in the above example. My earlier query might have handled it, but I can't quite comprehend what I did (must have had a good day when I wrote that one). Perhaps I need to do a JOIN on the ROW_NUMBER expressions in my sub-queries? My brain gave out for today. I hope someone can help me find an ingeniously simple solution.
