以最优化的方式交叉两个集合 [英] Intersection of two sets in most optimized way

查看:29
本文介绍了以最优化的方式交叉两个集合的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

给定两组值,我必须找出它们之间是否有任何共同元素,即它们的交集是否为空.

Given two sets of values, I have to find whether there is any common element among them or not i.e. whether their intersection is null or not.

哪个标准 C# 集合最适合此目的(就性能而言)?我知道 linq 有一个 Intersect 扩展方法来找出两个列表/数组的交集,但我的重点是 Big-O notation.

Which of the standard C# collection will suit best (in terms of performance) for this purpose? I know that linq has a Intersect extension method to find out the intersection of two list/arrays but my focus is on performance in terms of Big-O notation.

如果我还必须找出两组的交集呢?

And what if I have to find out the intersection of two sets as well?

推荐答案

好吧,如果你使用 LINQ 的 Intersect 方法,它会构建第二个序列的 HashSet,然后对照它检查第一个序列的每个元素.所以它是 O(M+N)... 你可以使用 foo.Intersect(bar).Any() 提前退出.

Well, if you use LINQ's Intersect method it will build up a HashSet of the second sequence, and then check each element of the first sequence against it. So it's O(M+N)... and you can use foo.Intersect(bar).Any() to get an early-out.

当然,如果您将一个(任一)集合存储在 HashSet<T> 中,您可以在每个步骤中迭代另一个以检查是否包含.不过,您仍然需要先构建集合.

Of course, if you store one (either) set in a HashSet<T> to start with, you can just iterate over the other one checking for containment on each step. You'd still need to build the set to start with though.

从根本上说,无论你做什么,你都会遇到 O(M+N) 问题 - 你不会得到比这更便宜的(总是你必须看看的可能性每个元素),如果您的哈希码是合理的,您应该能够轻松实现这种复杂性.当然,某些解决方案可能会比其他解决方案提供更好的常数因子……但这是性能而不是复杂性;)

Fundamentally you've got an O(M+N) problem whatever you do - you're not going to get cheaper than that (there's always the possibility that you'll have to look at every element) and if your hash codes are reasonable, you should be able to achieve that complexity easily. Of course, some solutions may give better constant factors than others... but that's performance rather than complexity ;)

如评论中所述,还有 ISet<T>.Overlaps - 如果您已经设置了静态类型的 ISet<T> 或具体实现,请调用Overlaps 让你在做什么更清楚.如果 both 您的集合都静态类型为 ISet<T>,请使用 larger.Overlaps(smaller)(其中更大和更小是集合的大小),因为我希望 Overlaps 的实现迭代 argument 并根据您调用它的集合的内容检查每个元素.

As noted in the comments, there's also ISet<T>.Overlaps - if you've already got either set with a static type of ISet<T> or a concrete implementation, calling Overlaps makes it clearer what you're doing. If both of your sets are statically typed as ISet<T>, use larger.Overlaps(smaller) (where larger and smaller are in terms of the size of the set) as I'd expect an implementation of Overlaps to iterate over the argument and check each element against contents of the set you call it on.

这篇关于以最优化的方式交叉两个集合的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆