Comparing two collections for equality irrespective of the order of items in them

Problem description

I would like to compare two collections (in C#), but I'm not sure of the best way to implement this efficiently.

I've read the other thread about Enumerable.SequenceEqual, but it's not exactly what I'm looking for.

In my case, two collections would be equal if they both contain the same items (no matter the order).

Example:

collection1 = {1, 2, 3, 4};
collection2 = {2, 4, 1, 3};

collection1 == collection2; // true
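
To make the difference from Enumerable.SequenceEqual concrete, here is a minimal sketch using the example values above. The class and method names (SequenceEqualDemo, OrderSensitiveEqual) are made up for illustration; only SequenceEqual itself is a real LINQ method.

```csharp
using System.Linq;

public static class SequenceEqualDemo
{
    // Hypothetical helper: wraps Enumerable.SequenceEqual, which compares
    // the two sequences element by element, in order.
    public static bool OrderSensitiveEqual(int[] first, int[] second)
    {
        return first.SequenceEqual(second);
    }
}
```

With collection1 = {1, 2, 3, 4} and collection2 = {2, 4, 1, 3}, OrderSensitiveEqual returns false, because the elements match but their positions do not. It returns true only when both arrays hold the same values in the same order.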

What I usually do is to loop through each item of one collection and see if it exists in the other collection, then loop through each item of the other collection and see if it exists in the first collection. (I start by comparing the lengths).

if (collection1.Count != collection2.Count)
    return false; // the collections are not equal

foreach (Item item in collection1)
{
    if (!collection2.Contains(item))
        return false; // the collections are not equal
}

foreach (Item item in collection2)
{
    if (!collection1.Contains(item))
        return false; // the collections are not equal
}

return true; // the collections are equal

However, this is not entirely correct, and it's probably not the most efficient way to compare two collections for equality.

An example I can think of that would be wrong is:

collection1 = {1, 2, 3, 3, 4}
collection2 = {1, 2, 2, 3, 4}

Both would be considered equal by my implementation. Should I just count the number of times each item is found and make sure the counts are equal in both collections?
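
The failure case can be reproduced with a small runnable version of the Contains-based check. This is only a sketch of the approach described in the question, specialised to List<int>; the names NaiveComparison and ContainsBasedEqual are hypothetical.

```csharp
using System.Collections.Generic;

public static class NaiveComparison
{
    // Contains-based check: same length, and every item of each list
    // appears somewhere in the other. Duplicates are not counted.
    public static bool ContainsBasedEqual(List<int> first, List<int> second)
    {
        if (first.Count != second.Count)
            return false;

        foreach (int item in first)
        {
            if (!second.Contains(item))
                return false;
        }

        foreach (int item in second)
        {
            if (!first.Contains(item))
                return false;
        }

        return true;
    }
}
```

For {1, 2, 3, 3, 4} and {1, 2, 2, 3, 4} this returns true, even though the first list has two 3s and the second has two 2s. Each Contains call also scans the other list, so the check is quadratic in the list length.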


The examples are in some sort of C# (let's call it pseudo-C#), but give your answer in whatever language you wish, it does not matter.

Note: I used integers in the examples for simplicity, but I want to be able to use reference-type objects too (they do not behave correctly as keys because only the reference of the object is compared, not the content).

Solution

It turns out Microsoft already has this covered in its testing framework: CollectionAssert.AreEquivalent

Remarks

Two collections are equivalent if they have the same elements in the same quantity, but in any order. Elements are equal if their values are equal, not if they refer to the same object.
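
For reference types, "values are equal" in practice means the element type's Equals and GetHashCode decide what counts as the same item. The sketch below assumes a made-up Point class and a made-up DistinctKeys helper, purely to show how overriding those two methods lets a Dictionary treat two separate instances with the same content as one key.

```csharp
using System.Collections.Generic;

// Hypothetical element type, used only for illustration.
public sealed class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) { X = x; Y = y; }

    // Value equality: two points are equal when their coordinates match.
    public override bool Equals(object obj)
    {
        var other = obj as Point;
        return other != null && X == other.X && Y == other.Y;
    }

    // Must agree with Equals so equal points land in the same bucket.
    public override int GetHashCode()
    {
        unchecked { return (X * 397) ^ Y; }
    }
}

public static class ValueEqualityDemo
{
    // Counts occurrences per key and returns how many distinct keys exist.
    public static int DistinctKeys(IEnumerable<Point> points)
    {
        var counts = new Dictionary<Point, int>();
        foreach (Point p in points)
        {
            int n;
            counts.TryGetValue(p, out n); // n stays 0 when the key is missing
            counts[p] = n + 1;
        }
        return counts.Count;
    }
}
```

Passing two separate instances new Point(1, 2) and new Point(1, 2) yields one distinct key. Without the two overrides, the default reference-based equality would yield two.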

Using Reflector, I modified the code behind AreEquivalent() to create a corresponding equality comparer. It is more complete than existing answers, since it takes nulls into account, implements IEqualityComparer, and has some efficiency and edge-case checks. Plus, it's Microsoft :)

using System.Collections.Generic;
using System.Linq;

public class MultiSetComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
    {
        if (first == null) 
            return second == null;

        if (second == null)
            return false;

        if (ReferenceEquals(first, second)) 
            return true;

        var firstCollection = first as ICollection<T>;
        var secondCollection = second as ICollection<T>;
        if (firstCollection != null && secondCollection != null)
        {
            if (firstCollection.Count != secondCollection.Count)
                return false;

            if (firstCollection.Count == 0) 
                return true;
        }

        return !HaveMismatchedElement(first, second);
    }

    private static bool HaveMismatchedElement(IEnumerable<T> first, IEnumerable<T> second)
    {
        int firstNullCount;
        int secondNullCount;

        var firstElementCounts = GetElementCounts(first, out firstNullCount);
        var secondElementCounts = GetElementCounts(second, out secondNullCount);

        if (firstNullCount != secondNullCount || firstElementCounts.Count != secondElementCounts.Count)
            return true;

        foreach (var kvp in firstElementCounts)
        {
            var firstElementCount = kvp.Value;
            int secondElementCount;
            secondElementCounts.TryGetValue(kvp.Key, out secondElementCount);

            if (firstElementCount != secondElementCount)
                return true;
        }

        return false;
    }

    private static Dictionary<T, int> GetElementCounts(IEnumerable<T> enumerable, out int nullCount)
    {
        var dictionary = new Dictionary<T, int>();
        nullCount = 0;

        foreach (T element in enumerable)
        {
            if (element == null)
            {
                nullCount++;
            }
            else
            {
                int num;
                dictionary.TryGetValue(element, out num);
                num++;
                dictionary[element] = num;
            }
        }

        return dictionary;
    }

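    public int GetHashCode(IEnumerable<T> enumerable)
    {
        // Sum the element hashes so the result does not depend on order.
        // Sorting with OrderBy(x => x) would throw for element types
        // that do not implement IComparable.
        int hash = 17;

        foreach (T val in enumerable)
            hash = unchecked(hash + ((val == null) ? 42 : val.GetHashCode()));

        return hash;
    }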
}
