在最优化的方式的两组相交 [英] Intersection of two sets in most optimized way

查看:93
本文介绍了在最优化的方式的两组相交的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

由于两套价值观,我得找我们是否有它们之间或任何共同的元素不就是它们的交集是否为空或不是。<​​/ p>

哪标准的C#系列将满足最(在性能项)用于这一目的?我知道, LINQ 有一个相交扩展方法,找出两个列表/数组的交集,但我的重点是性能大O符号的条款。



如果我要找出两个集合的交集是什么呢?


解决方案

好吧,如果你使用LINQ的相交方法,将建立一个 HashSet的第二序列,然后检查针对它的第一个序列中的每个元素。所以这是O(M + N)...你可以使用 foo.Intersect(巴)。任何()来得到早了。



当然,如果你存储一个(或)的设置的HashSet< T> ,开始时,你可以遍历另一种检查每一步遏制。 。你仍然需要建立集下手,虽然



从根本上说你有任何你做一个O(M + N)的问题 - 你不是要得到比便宜(有总是的,你必须看看每一个元素的可能性),如果你的哈希码是合理的,你应该能够轻松实现的复杂性。当然,一些解决方案可以提供更好的持续性因素比别人......但这是性能,而不是复杂;)



编辑:正如在评论中指出,这里还有<一个HREF =htt​​p://msdn.microsoft.com/en-us/library/dd412095%28v=vs.110%29.aspx> 的ISet< T> .Overlaps - 如果你已经有了可以是一个静态类型的ISet<的设置; T> 或一个具体的实现,要求重叠使得它更清晰的你在做什么。如果的您的套均的静态类型为的ISet< T> ,使用 larger.Overlaps(小)(其中较大和较小的是集合的大小)作为我期望重叠的实现遍历的参数,并检查每个元素对你叫它上的一组内容。


Given two sets of values, I have to find our whether there is any common element among them or not i.e whether their intersection is null or not.

Which of the standard C# collection will suit best(in term of performance) for this purpose? I know that linq has a Intersect extension method to find out the intersection of two list/arrays but my focus is on performance in terms of Big-O notation.

And what if I have to find out the intersection of two sets as well?

解决方案

Well, if you use LINQ's Intersect method it will build up a HashSet of the second sequence, and then check each element of the first sequence against it. So it's O(M+N)... and you can use foo.Intersect(bar).Any() to get an early-out.

Of course, if you store one (either) set in a HashSet<T> to start with, you can just iterate over the other one checking for containment on each step. You'd still need to build the set to start with though.

Fundamentally you've got an O(M+N) problem whatever you do - you're not going to get cheaper than that (there's always the possibility that you'll have to look at every element) and if your hash codes are reasonable, you should be able to achieve that complexity easily. Of course, some solutions may give better constant factors than others... but that's performance rather than complexity ;)

EDIT: As noted in the comments, there's also ISet<T>.Overlaps - if you've already got either set with a static type of ISet<T> or a concrete implementation, calling Overlaps makes it clearer what you're doing. If both of your sets are statically typed as ISet<T>, use larger.Overlaps(smaller) (where larger and smaller are in terms of the size of the set) as I'd expect an implementation of Overlaps to iterate over the argument and check each element against contents of the set you call it on.

这篇关于在最优化的方式的两组相交的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆