Comparing two lists for unique items in each
I have two collections (they happen to be arrays, but it doesn't really matter, I think): L and R. They are both sorted and now I want to compare them. I want to end up with two collections: one for each input array, containing the items which were not in the other.
I could just take the first item from L and then search R and, if there isn't a match, add it to my "unique" collection (Lu). But that's extremely inefficient, and I am expecting to have some very large collections to process in the near future.
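
For reference, a minimal sketch of that naive approach might look like the following. The names left, right, naiveLu and naiveRu are made up for illustration. Every item triggers a fresh linear scan of the other array, so the work grows with the product of the two lengths.

```fsharp
// Hypothetical sample data for illustration.
let left = [| 1; 2; 3 |]
let right = [| 2; 3; 4 |]

// For each item, scan the other array from the start looking for a match.
// Array.contains is a linear search, so this is quadratic overall.
let naiveLu = left |> Array.filter (fun item -> not (Array.contains item right))
let naiveRu = right |> Array.filter (fun item -> not (Array.contains item left))
```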
I thought about possibly "playing hopscotch":
Step 1: Take two lists, L and R, and compare the head of each list (l :: L and r :: R):
  Branch 1: if l < r, then add l to Lu and recurse, passing in L and r :: R
  Branch 2: if l > r, then add r to Ru and recurse, passing in l :: L and R
  Branch 3: if l = r, then recurse, passing in L and R
Step 2: Return Lu and Ru
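
The hopscotch steps above could be sketched roughly as follows. This is only an illustration: the function name uniqueInEach is made up, both inputs are assumed to be sorted in ascending order, and the two cases for an exhausted list are an addition, since the steps do not say what happens when one list runs out.

```fsharp
// Hypothetical helper: walks two ascending-sorted lists in step and
// returns (items only in left, items only in right).
let uniqueInEach left right =
    // lu and ru are accumulators, built in reverse order.
    let rec loop l r lu ru =
        match l, r with
        // Assumed base cases: whatever remains in the other list is unique to it.
        | [], rest -> List.rev lu, List.rev ru @ rest
        | rest, [] -> List.rev lu @ rest, List.rev ru
        | lh :: lt, rh :: rt ->
            if lh < rh then loop lt r (lh :: lu) ru    // Branch 1
            elif lh > rh then loop l rt lu (rh :: ru)  // Branch 2
            else loop lt rt lu ru                      // Branch 3
    loop left right [] []

let lu, ru = uniqueInEach [ 1; 2; 3 ] [ 2; 3; 4 ]
```

Each step consumes at least one item, so the walk is linear in the combined length of the two lists. Repeated values are matched one for one here, so an item that appears twice in one list and once in the other is reported once as unique.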
I can write this function, but before I put in the effort I was wondering if a function already exists which can do this for me. It seems like a not-too-uncommon scenario, and I'd always rather use an existing solution than roll my own.
(Also, if there's a more recognizable name for this algorithm, I'd like to know what it's called.)
(I wrote the question above about 2 hours ago. Since then, I found the answer on my own. The following is what I discovered.)
In set theory, the "list" of items in L but not in R is known as "the relative complement of R in L", also known as the "set-theoretic difference of L and R". (See Wikipedia's Complement (set theory) article.)
F#, being a mathematical language, has this concept baked right into its core library. First, you need to build your collections as sets:
// example arrays:
let arr1 = [| 1; 2; 3 |]
let arr2 = [| 2; 3; 4 |]
// build the L and R sets
let L = set arr1
let R = set arr2
Now you can call the "difference" function and quickly get the relative complement for each array:
let Lu = Set.difference L R |> Set.toArray
let Ru = Set.difference R L |> Set.toArray
> val Lu : int [] = [|1|]
> val Ru : int [] = [|4|]
There's also a shorter syntax. The Set type has overloaded the minus operator. Set.difference just subtracts the second parameter from the first, so you can actually just use the following:
let Lu = L - R |> Set.toArray
let Ru = R - L |> Set.toArray
> val Lu : int [] = [|1|]
> val Ru : int [] = [|4|]
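
If this comes up often, the two subtractions could be wrapped in a small helper. The sketch below is one possible shape, not something provided by the core library, and the name uniqueToEach is made up. Note that building a set discards repeated values and orders the items, so the results never contain duplicates even when the inputs do.

```fsharp
// Hypothetical helper (not part of FSharp.Core): returns the items
// unique to each input array as (only in left, only in right).
let uniqueToEach (left: 'a[]) (right: 'a[]) =
    let l = set left
    let r = set right
    Set.toArray (l - r), Set.toArray (r - l)

// The inputs contain repeated values; the sets collapse them.
let onlyLeft, onlyRight = uniqueToEach [| 1; 2; 2; 3 |] [| 2; 3; 4; 4 |]
// onlyLeft = [|1|], onlyRight = [|4|]
```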