C# check if one collection of values contains another

Question

Suppose I have two collections as follows:

Collection1: "A1" "A1" "M1" "M2"

Collection2: "M2" "M3" "M1" "A1" "A1" "A2"

All the values are strings. I want to know whether all the elements in Collection1 are contained in Collection2, but I have no guarantee on the order, and a collection may have multiple entries with the same value. In this case, Collection2 does contain Collection1, because Collection2 has two A1's, an M1 and an M2. There's the obvious way: sorting both collections and popping off values as I find matches, but I was wondering if there's a faster, more efficient way to do this. Again, with the initial collections I have no guarantee on the order or on how many times a given value will appear.

Edit: Changed "set" to "collection" just to make clear that these aren't sets, since they can contain duplicate values.

Recommended answer

Yes, there is a faster way, provided you're not space-constrained. (See space/time tradeoff.)

Algorithm:

Just insert all the elements of Set2 into a hash table (in .NET 3.5 and later, that's a HashSet<string>), then go through all the elements of Set1 and check whether each one is in the hash table. This method is faster (Θ(m + n) expected time, where m and n are the sizes of the two collections), but it uses O(n) extra space for the hash table built from Set2.

Alternatively, just say:

bool isSuperset = new HashSet<string>(set2).IsSupersetOf(set1);
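To try the one-liner above against the data from the question, something like the following sketch could be used. The class and method names (SupersetDemo, ContainsAllDistinct) are made up for illustration. Note that a HashSet<string> keeps each value once, so this check ignores how many times a value appears.

```csharp
using System;
using System.Collections.Generic;

public static class SupersetDemo
{
    // Hypothetical helper: true when every distinct value in `subset`
    // also occurs in `superset`. Duplicate counts are NOT compared.
    public static bool ContainsAllDistinct(IEnumerable<string> superset, IEnumerable<string> subset)
    {
        return new HashSet<string>(superset).IsSupersetOf(subset);
    }

    public static void Main()
    {
        var collection1 = new[] { "A1", "A1", "M1", "M2" };
        var collection2 = new[] { "M2", "M3", "M1", "A1", "A1", "A2" };

        Console.WriteLine(ContainsAllDistinct(collection2, collection1)); // True

        // A single "A1" is reported as covering two of them,
        // because the hash set only tracks membership.
        Console.WriteLine(ContainsAllDistinct(new[] { "A1" }, new[] { "A1", "A1" })); // True
    }
}
```

If the number of occurrences matters, as it does in the question, this membership-only check is not enough on its own.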






Edit 1:

For those people concerned about the possibility of duplicates (and hence the misnomer "set"), the idea can easily be extended:

Just make a new Dictionary<string, int> holding the count of each word in the super-list (add one to the count each time you see another instance of an existing word, and add the word with a count of 1 if it's not yet in the dictionary). Then go through the sub-list and decrement the matching count for each word. If every word exists in the dictionary and its count is never already zero when you try to decrement it, then the sub-list really is contained in the super-list; otherwise, the sub-list had too many instances of some word (or a word that didn't exist at all), so it's not a real sub-list.
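The counting idea described above could be sketched roughly as follows. The names (MultisetCheck, ContainsAllWithCounts) are illustrative only, and the sketch assumes the collections contain no null strings, since Dictionary does not accept null keys.

```csharp
using System;
using System.Collections.Generic;

public static class MultisetCheck
{
    // Hypothetical helper: true when `superList` contains every value of
    // `subList` at least as many times as `subList` does.
    public static bool ContainsAllWithCounts(IEnumerable<string> superList, IEnumerable<string> subList)
    {
        // Count how often each value occurs in the super-list.
        var counts = new Dictionary<string, int>();
        foreach (var item in superList)
        {
            counts.TryGetValue(item, out var current); // current is 0 when the key is absent
            counts[item] = current + 1;
        }

        // Consume one occurrence per sub-list item; a missing or
        // exhausted value means the sub-list is not contained.
        foreach (var item in subList)
        {
            if (!counts.TryGetValue(item, out var remaining) || remaining == 0)
            {
                return false;
            }
            counts[item] = remaining - 1;
        }
        return true;
    }

    public static void Main()
    {
        var collection1 = new[] { "A1", "A1", "M1", "M2" };
        var collection2 = new[] { "M2", "M3", "M1", "A1", "A1", "A2" };

        Console.WriteLine(ContainsAllWithCounts(collection2, collection1)); // True
        Console.WriteLine(ContainsAllWithCounts(collection1, collection2)); // False: "M3" is missing
    }
}
```

Both passes are linear in expected time, and the dictionary holds one entry per distinct value of the super-list.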

Edit 2:

If the strings are very big and you're concerned about space efficiency, and an algorithm that works with (very) high probability is good enough for you, then try storing a hash of each string instead of the string itself. It's technically not guaranteed to work, because two different strings can share a hash, but the probability of that happening is very low.
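One possible way to apply the hashing suggestion is sketched below. The choice of a 64-bit fingerprint taken from the first 8 bytes of a SHA-256 digest is an assumption made for this example, not something the answer specifies, and the names (HashedMultisetCheck, Fingerprint, ProbablyContainsAll) are hypothetical. A collision can only produce a false "contained" result, never a false "not contained".

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public static class HashedMultisetCheck
{
    // Reduces a string to a 64-bit fingerprint: the first 8 bytes of its
    // SHA-256 digest (an assumption chosen for this sketch).
    private static ulong Fingerprint(SHA256 sha, string value)
    {
        byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(value));
        return BitConverter.ToUInt64(digest, 0);
    }

    // Same counting scheme as with string keys, but only fingerprints
    // are stored, so memory use does not grow with string length.
    public static bool ProbablyContainsAll(IEnumerable<string> superList, IEnumerable<string> subList)
    {
        using (var sha = SHA256.Create())
        {
            var counts = new Dictionary<ulong, int>();
            foreach (var item in superList)
            {
                ulong key = Fingerprint(sha, item);
                counts.TryGetValue(key, out var current); // 0 when absent
                counts[key] = current + 1;
            }

            foreach (var item in subList)
            {
                ulong key = Fingerprint(sha, item);
                if (!counts.TryGetValue(key, out var remaining) || remaining == 0)
                {
                    return false;
                }
                counts[key] = remaining - 1;
            }
            return true;
        }
    }

    public static void Main()
    {
        var collection1 = new[] { "A1", "A1", "M1", "M2" };
        var collection2 = new[] { "M2", "M3", "M1", "A1", "A1", "A2" };

        Console.WriteLine(ProbablyContainsAll(collection2, collection1));
        Console.WriteLine(ProbablyContainsAll(collection1, collection2));
    }
}
```

Each string still has to be read once to compute its fingerprint, so the saving is in memory rather than in time.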
