无论项目的顺序如何,比较两个集合的相等性 [英] Comparing two collections for equality irrespective of the order of items in them

查看:32
本文介绍了无论项目的顺序如何,比较两个集合的相等性的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想比较两个集合(在 C# 中),但我不确定有效实现它的最佳方法.

I would like to compare two collections (in C#), but I'm not sure of the best way to implement this efficiently.

我已经阅读了关于 Enumerable.SequenceEqual,但这并不是我想要的.

I've read the other thread about Enumerable.SequenceEqual, but it's not exactly what I'm looking for.

在我的例子中,如果两个集合都包含相同的项目(无论顺序如何),则它们是相等的.

In my case, two collections would be equal if they both contain the same items (no matter the order).

示例:

collection1 = {1, 2, 3, 4};
collection2 = {2, 4, 1, 3};

collection1 == collection2; // true

我通常做的是遍历一个集合的每一项,看看它是否存在于另一个集合中,然后遍历另一个集合的每一项,看看它是否存在于第一个集合中.(我从比较长度开始).

What I usually do is to loop through each item of one collection and see if it exists in the other collection, then loop through each item of the other collection and see if it exists in the first collection. (I start by comparing the lengths).

if (collection1.Count != collection2.Count)
    return false; // the collections are not equal

foreach (Item item in collection1)
{
    if (!collection2.Contains(item))
        return false; // the collections are not equal
}

foreach (Item item in collection2)
{
    if (!collection1.Contains(item))
        return false; // the collections are not equal
}

return true; // the collections are equal

然而,这并不完全正确,而且这可能不是比较两个集合是否相等的最有效方法.

However, this is not entirely correct, and it's probably not the most efficient way to do compare two collections for equality.

我能想到的一个错误的例子是:

An example I can think of that would be wrong is:

collection1 = {1, 2, 3, 3, 4}
collection2 = {1, 2, 2, 3, 4}

这与我的实现相同.我是否应该只计算找到每个项目的次数并确保两个集合中的计数相等?

Which would be equal with my implementation. Should I just count the number of times each item is found and make sure the counts are equal in both collections?

示例使用某种 C#(我们称其为伪 C#),但是用您希望的任何语言给出答案,都没有关系.

The examples are in some sort of C# (let's call it pseudo-C#), but give your answer in whatever language you wish, it does not matter.

注意:为了简单起见,我在示例中使用了整数,但我也希望能够使用引用类型的对象(它们不能正确地作为键,因为只有对象的引用是比较,不是内容)

Note: I used integers in the examples for simplicity, but I want to be able to use reference-type objects too (they do not behave correctly as keys because only the reference of the object is compared, not the content).

推荐答案

事实证明,微软已经在其测试框架中包含了这一点:CollectionAssert.AreEquivalent

It turns out Microsoft already has this covered in its testing framework: CollectionAssert.AreEquivalent

备注

两个集合是等价的,如果它们具有相同的元素数量,但以任何顺序.元素如果它们的值相等,则相等,如果它们引用同一个对象,则不会.

Two collections are equivalent if they have the same elements in the same quantity, but in any order. Elements are equal if their values are equal, not if they refer to the same object.

使用反射器,我修改了 AreEquivalent() 后面的代码以创建相应的相等比较器.它比现有答案更完整,因为它考虑了空值,实现了 IEqualityComparer 并具有一些效率和边缘情况检查.另外,它是 Microsoft :)

Using reflector, I modified the code behind AreEquivalent() to create a corresponding equality comparer. It is more complete than existing answers, since it takes nulls into account, implements IEqualityComparer and has some efficiency and edge case checks. plus, it's Microsoft :)

public class MultiSetComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    private readonly IEqualityComparer<T> m_comparer;
    public MultiSetComparer(IEqualityComparer<T> comparer = null)
    {
        m_comparer = comparer ?? EqualityComparer<T>.Default;
    }

    public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
    {
        if (first == null)
            return second == null;

        if (second == null)
            return false;

        if (ReferenceEquals(first, second))
            return true;

        if (first is ICollection<T> firstCollection && second is ICollection<T> secondCollection)
        {
            if (firstCollection.Count != secondCollection.Count)
                return false;

            if (firstCollection.Count == 0)
                return true;
        }

        return !HaveMismatchedElement(first, second);
    }

    private bool HaveMismatchedElement(IEnumerable<T> first, IEnumerable<T> second)
    {
        int firstNullCount;
        int secondNullCount;

        var firstElementCounts = GetElementCounts(first, out firstNullCount);
        var secondElementCounts = GetElementCounts(second, out secondNullCount);

        if (firstNullCount != secondNullCount || firstElementCounts.Count != secondElementCounts.Count)
            return true;

        foreach (var kvp in firstElementCounts)
        {
            var firstElementCount = kvp.Value;
            int secondElementCount;
            secondElementCounts.TryGetValue(kvp.Key, out secondElementCount);

            if (firstElementCount != secondElementCount)
                return true;
        }

        return false;
    }

    private Dictionary<T, int> GetElementCounts(IEnumerable<T> enumerable, out int nullCount)
    {
        var dictionary = new Dictionary<T, int>(m_comparer);
        nullCount = 0;

        foreach (T element in enumerable)
        {
            if (element == null)
            {
                nullCount++;
            }
            else
            {
                int num;
                dictionary.TryGetValue(element, out num);
                num++;
                dictionary[element] = num;
            }
        }

        return dictionary;
    }

    public int GetHashCode(IEnumerable<T> enumerable)
    {
        if (enumerable == null) throw new 
            ArgumentNullException(nameof(enumerable));

        int hash = 17;

        foreach (T val in enumerable)
            hash ^= (val == null ? 42 : m_comparer.GetHashCode(val));

        return hash;
    }
}

示例用法:

var set = new HashSet<IEnumerable<int>>(new[] {new[]{1,2,3}}, new MultiSetComparer<int>());
Console.WriteLine(set.Contains(new [] {3,2,1})); //true
Console.WriteLine(set.Contains(new [] {1, 2, 3, 3})); //false

或者如果您只想直接比较两个集合:

Or if you just want to compare two collections directly:

var comp = new MultiSetComparer<string>();
Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","c","b"})); //true
Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","b"})); //false

最后,您可以使用您选择的相等比较器:

Finally, you can use your an equality comparer of your choice:

var strcomp = new MultiSetComparer<string>(StringComparer.OrdinalIgnoreCase);
Console.WriteLine(strcomp.Equals(new[] {"a", "b"}, new []{"B", "A"})); //true

这篇关于无论项目的顺序如何,比较两个集合的相等性的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆