匹配两个列表(或数组)中的项目 [英] matching items from two lists (or arrays)

查看:88
本文介绍了匹配两个列表(或数组)中的项目的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的工作出现问题,希望可以归结为以下几点:我有两个List<int>,并且我想查看ListA中的任何int是否等于任何ListB中. (如果可以简化生活,它们可以是数组,但是我认为List<>具有一些内置的魔术可能会有所帮助.)而且我确定这是LINQ友好的问题,但是我在这里使用2.0.

I'm having a problem with my work that hopefully reduces to the following: I have two List<int>s, and I want to see if any of the ints in ListA are equal to any int in ListB. (They can be arrays if that makes life easier, but I think List<> has some built-in magic that might help.) And I'm sure this is a LINQ-friendly problem, but I'm working in 2.0 here.

到目前为止,我的解决方案是通过ListA到foreach,然后通过ListB到foreach

My solution so far has been to foreach through ListA, and then foreach through ListB,

foreach (int a in ListA)
{
    foreach (int b in ListB)
    {
        if (a == b)
        {
            return true;
        }
    }
}

当它们分别长为三个时,实际上是很漂亮的,但是现在它们长为200,并且经常不匹配,因此我们得到了N ^ 2比较的最坏情况.即使进行40,000次比较也很快,但是我想我可能会遗漏一些东西,因为对于这个特定问题,N ^ 2似乎很幼稚.

which was actually pretty slick when they were each three items long, but now they're 200 long and they frequently don't match, so we get the worst-case of N^2 comparisons. Even 40,000 comparisons go by pretty fast, but I think I might be missing something, since N^2 seems pretty naive for this particular problem.

谢谢!

推荐答案

使用 LINQ ,这很简单,因为您可以调用 Enumerable上的> Intersect扩展方法 class 为您提供两个数组的交集:

With LINQ, this is trivial, as you can call the Intersect extension method on the Enumerable class to give you the set intersection of the two arrays:

var intersection = ListA.Intersect(ListB);

但是,这是 set 交集,这意味着如果ListAListB中没有唯一值,则不会获得任何副本.换句话说,如果您具有以下条件:

However, this is the set intersection, meaning if ListA and ListB don't have unique values in it, you won't get any copies. In other words if you have the following:

var ListA = new [] { 0, 0, 1, 2, 3 };
var ListB = new [] { 0, 0, 0, 2 };

然后ListA.Intersect(ListB)产生:

{ 0, 2 }

如果您期望:

{ 0, 0, 2 }

然后,当您扫描两个列表时,您将必须自己维护这些项目的数量并减少/减少.

Then you're going to have to maintain a count of the items yourself and yield/decrement as you scan the two lists.

首先,您要收集 Dictionary<TKey, int> 以及各个项目的列表:

First, you'd want to collect a Dictionary<TKey, int> with the lists of the individual items:

var countsOfA = ListA.GroupBy(i => i).ToDictionary(g => g.Key, g => g.Count());

在这里,您可以扫描ListB并将其放在列表中,当您遇到countsOfA中的项目时:

From there, you can scan ListB and place that in a list when you come across an item in countsOfA:

// The items that match.
IList<int> matched = new List<int>();

// Scan 
foreach (int b in ListB)
{
    // The count.
    int count;

    // If the item is found in a.
    if (countsOfA.TryGetValue(b, out count))
    {
        // This is positive.
        Debug.Assert(count > 0);

        // Add the item to the list.
        matched.Add(b);

        // Decrement the count.  If
        // 0, remove.
        if (--count == 0) countsOfA.Remove(b);
    }
}

您可以将其包装在延缓执行的扩展方法中,如下所示:

You can wrap this up in an extension method that defers execution like so:

public static IEnumerable<T> MultisetIntersect(this IEnumerable<T> first,
    IEnumerable<T> second)
{
    // Call the overload with the default comparer.
    return first.MultisetIntersect(second, EqualityComparer<T>.Default);
}

public static IEnumerable<T> MultisetIntersect(this IEnumerable<T> first,
    IEnumerable<T> second, IEqualityComparer<T> comparer)
{
    // Validate parameters.  Do this separately so check
    // is performed immediately, and not when execution
    // takes place.
    if (first == null) throw new ArgumentNullException("first");
    if (second == null) throw new ArgumentNullException("second");
    if (comparer == null) throw new ArgumentNullException("comparer");

    // Defer execution on the internal
    // instance.
    return first.MultisetIntersectImplementation(second, comparer);
}

private static IEnumerable<T> MultisetIntersectImplementation(
    this IEnumerable<T> first, IEnumerable<T> second, 
    IEqualityComparer<T> comparer)
{
    // Validate parameters.
    Debug.Assert(first != null);
    Debug.Assert(second != null);
    Debug.Assert(comparer != null);

    // Get the dictionary of the first.
    IDictionary<T, long> counts = first.GroupBy(t => t, comparer).
        ToDictionary(g => g.Key, g.LongCount(), comparer);

    // Scan 
    foreach (T t in second)
    {
        // The count.
        long count;

        // If the item is found in a.
        if (counts.TryGetValue(t, out count))
        {
            // This is positive.
            Debug.Assert(count > 0);

            // Yield the item.
            yield return t;

            // Decrement the count.  If
            // 0, remove.
            if (--count == 0) counts.Remove(t);
        }
    }
}

请注意,这两种方法都是(如果我在这里使用Big-O符号表示歉意)O(N + M)其中,N是第一个数组中的项数,而M是第一个数组中的项数第二个数组中的项目.您只需扫描每个列表一次,并且假定获取哈希码并在哈希码上执行查找是O(1)(恒定)操作.

Note that both of these approaches are (and I apologize if I'm butchering Big-O notation here) O(N + M) where N is the number of items in the first array, and M is the number of items in the second array. You have to scan each list only once, and it's assumed that getting the hash codes and performing lookups on the hash codes is an O(1) (constant) operation.

这篇关于匹配两个列表(或数组)中的项目的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆