How to most efficiently test if two arrays contain equivalent items in C#


Question

I have two arrays and I want to know if they contain the same items. Equals(object obj) doesn't work because an array is a reference type. I have posted my attempt below, but since I'm sure this is a common task I'd like to know if there is a better test.

    public bool ContainsEquivalentSequence<T>(T[] array1, T[] array2)
    {
        bool a1IsNullOrEmpty = ReferenceEquals(array1, null) || array1.Length == 0;
        bool a2IsNullOrEmpty = ReferenceEquals(array2, null) || array2.Length == 0;
        if (a1IsNullOrEmpty) return a2IsNullOrEmpty;
        if (a2IsNullOrEmpty || array1.Length != array2.Length) return false;
        for (int i = 0; i < array1.Length; i++)
            if (!Equals(array1[i], array2[i]))
                return false;
        return true;
    }

Update - System.Linq.Enumerable.SequenceEqual is not better

I looked at the decompiled source, and it does not compare lengths before entering the loop. This makes sense, since the method is designed for any IEnumerable<T>, not specifically for a T[].

    public static bool SequenceEqual<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)
    {
        if (comparer == null)
        {
            comparer = EqualityComparer<TSource>.Default;
        }
        if (first == null)
        {
            throw Error.ArgumentNull("first");
        }
        if (second == null)
        {
            throw Error.ArgumentNull("second");
        }
        using (IEnumerator<TSource> enumerator = first.GetEnumerator())
        {
            using (IEnumerator<TSource> enumerator2 = second.GetEnumerator())
            {
                while (enumerator.MoveNext())
                {
                    if (!enumerator2.MoveNext() || !comparer.Equals(enumerator.Current, enumerator2.Current))
                    {
                        return false;
                    }
                }
                if (enumerator2.MoveNext())
                {
                    return false;
                }
            }
        }
        return true;
    }
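
Since the decompiled method above goes straight into the enumerator loop, one option for arrays is to do the length check yourself and only then fall back to SequenceEqual. The following is a minimal sketch of that idea, not a measured improvement: the class name ArrayComparison and the method ArraysEqual are hypothetical, and treating null and empty arrays as equivalent is an assumption carried over from the question's method.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ArrayComparison
{
    // Hypothetical helper: null and empty arrays count as equivalent,
    // mirroring the question's ContainsEquivalentSequence.
    public static bool ArraysEqual<T>(T[] first, T[] second, IEqualityComparer<T> comparer = null)
    {
        bool firstEmpty = first == null || first.Length == 0;
        bool secondEmpty = second == null || second.Length == 0;
        if (firstEmpty) return secondEmpty;

        // Arrays know their length up front, so a mismatch is rejected without enumerating.
        if (secondEmpty || first.Length != second.Length) return false;

        // Same lengths: compare element by element with the given (or default) comparer.
        return Enumerable.SequenceEqual(first, second, comparer ?? EqualityComparer<T>.Default);
    }
}
```

A call would look like ArrayComparison.ArraysEqual(a1, a2). Passing EqualityComparer<T>.Default also avoids the boxing that the static Equals(object, object) call in the question's loop incurs for value types such as int.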

Solution

I ran some tests using Any, Contains, All and SequenceEqual, then picked the best three.

There are different results for different inputs...

Two identical arrays of size 100: SequenceEqual was faster

[     SequenceEqual: 00:00:00.027   ]*
[     ContainsEqSeq: 00:00:00.046   ]
[          Parallel: 00:00:00.281   ]

Two identical arrays of size 1000: SequenceEqual was faster

[     SequenceEqual: 00:00:00.240   ]*
[     ContainsEqSeq: 00:00:00.361   ]
[          Parallel: 00:00:00.491   ]

Two identical arrays of size 10000: Parallel was faster

[     SequenceEqual: 00:00:02.357   ]
[     ContainsEqSeq: 00:00:03.341   ]
[          Parallel: 00:00:01.688   ]*

Two identical arrays of size 50000: Parallel was much faster

[     SequenceEqual: 00:00:11.824   ]
[     ContainsEqSeq: 00:00:17.206   ]
[          Parallel: 00:00:06.811   ]*

Two arrays with one difference at position 200: SequenceEqual was faster

[     SequenceEqual: 00:00:00.050   ]*
[     ContainsEqSeq: 00:00:00.075   ]
[          Parallel: 00:00:00.332   ]

Two arrays with one difference at position 0: ContainsEqSeq and SequenceEqual were faster

[     SequenceEqual: 00:00:00.002   ]*
[     ContainsEqSeq: 00:00:00.001   ]*
[          Parallel: 00:00:00.211   ]

Two arrays with one difference at position 999: SequenceEqual was faster

[     SequenceEqual: 00:00:00.237   ]*
[     ContainsEqSeq: 00:00:00.330   ]
[          Parallel: 00:00:00.691   ]

Two arrays with one difference at position 9999: Parallel was much faster

[     SequenceEqual: 00:00:02.386   ]
[     ContainsEqSeq: 00:00:03.417   ]
[          Parallel: 00:00:01.614   ]*

The code for SequenceEqual is

a1.SequenceEqual(a2)

The code for ContainsEqSeq is your method.

The code for Parallel is

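// Treat null and empty arrays as equivalent, as in the question's method.
bool a1IsNullOrEmpty = ReferenceEquals(a1, null) || a1.Length == 0;
bool a2IsNullOrEmpty = ReferenceEquals(a2, null) || a2.Length == 0;
if (a1IsNullOrEmpty) return a2IsNullOrEmpty;
if (a2IsNullOrEmpty || a1.Length != a2.Length) return false;

var areEqual = true;
// i is the element (unused), s is the loop state, x is the element's index.
Parallel.ForEach(a1,
    (i, s, x) =>
    {
        // != compiles here because the tested arrays are int[].
        if (a1[x] != a2[x])
        {
            areEqual = false;
            s.Stop(); // ask the remaining iterations to stop as soon as possible
        }
    });

return areEqual;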
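
The snippet above relies on != and so only fits types like int. A generic variant could be sketched as follows, assuming EqualityComparer<T>.Default is an acceptable notion of equality for your items. The names ParallelArrayComparison and ParallelEquals are hypothetical, and this version has not been benchmarked against the numbers above.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelArrayComparison
{
    // Hypothetical generic counterpart of the Parallel snippet above.
    public static bool ParallelEquals<T>(T[] a1, T[] a2)
    {
        bool a1IsNullOrEmpty = a1 == null || a1.Length == 0;
        bool a2IsNullOrEmpty = a2 == null || a2.Length == 0;
        if (a1IsNullOrEmpty) return a2IsNullOrEmpty;
        if (a2IsNullOrEmpty || a1.Length != a2.Length) return false;

        var comparer = EqualityComparer<T>.Default;
        int mismatchFound = 0; // 0 = no difference seen yet, 1 = difference seen

        // Parallel.For hands each iteration an index, so no element parameter is needed.
        Parallel.For(0, a1.Length, (index, state) =>
        {
            if (!comparer.Equals(a1[index], a2[index]))
            {
                // Several threads may get here at once; they all write the same value.
                Interlocked.Exchange(ref mismatchFound, 1);
                state.Stop();
            }
        });

        return mismatchFound == 0;
    }
}
```

Usage is ParallelArrayComparison.ParallelEquals(a1, a2). Whether the parallel overhead pays off still depends on the array size and on how expensive each item comparison is, as the timings above suggest.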

I would say that the best one depends on what your input will be.

If you work with huge arrays (10000+ items), I would say Parallel is the best choice; it only loses when the difference is near the beginning.

For other cases SequenceEqual might be the best one. I only tested with int[], but I believe it can be fast with complex types as well.

But remember, results will vary according to the input.
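
If you want to check this against your own inputs, a timing harness along these lines might be a starting point. It is only a sketch and not the harness used for the numbers above: the class ComparisonBenchmark, its methods, and the iteration count are all assumptions made for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class ComparisonBenchmark
{
    // Runs the comparison `iterations` times and returns the total elapsed time.
    public static TimeSpan Measure(Func<bool> comparison, int iterations)
    {
        comparison(); // warm-up call so JIT compilation is not part of the measurement

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            comparison();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Example run: two int[] of the given size with one difference at the last position.
    public static void RunExample(int size, int iterations)
    {
        int[] a1 = Enumerable.Range(0, size).ToArray();
        int[] a2 = (int[])a1.Clone();
        a2[size - 1] = -1;

        TimeSpan elapsed = Measure(() => Enumerable.SequenceEqual(a1, a2), iterations);
        Console.WriteLine("SequenceEqual: " + elapsed);
    }
}
```

Calling ComparisonBenchmark.RunExample(10000, 1000) prints one timing; swapping the lambda for another comparison lets you put the candidates side by side on the same data.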
