查找序列中的IEnumerable< T>使用LINQ [英] Find sequence in IEnumerable<T> using Linq

查看:264
本文介绍了查找序列中的IEnumerable< T>使用LINQ的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

什么是最有效的方式来寻找在序列中的的IEnumerable< T> 使用LINQ

What is the most efficient way to find a sequence within a IEnumerable<T> using LINQ

我希望能够创建一个扩展方法,它允许下面的调用:

I want to be able to create an extension method which allows the following call:

int startIndex = largeSequence.FindSequence(subSequence)

本场比赛必须相邻而为了。

The match must be adjacent and in order.

推荐答案

下面是一个算法,发现一个子序列中的一个实现。我所谓的方法 IndexOfSequence ,因为它使意图更加明确,类似于现有的的IndexOf 方法:

Here's an implementation of an algorithm that finds a subsequence in a sequence. I called the method IndexOfSequence, because it makes the intent more explicit and is similar to the existing IndexOf method:

public static class ExtensionMethods
{
    public static int IndexOfSequence<T>(this IEnumerable<T> source, IEnumerable<T> sequence)
    {
        return source.IndexOfSequence(sequence, EqualityComparer<T>.Default);
    }

    public static int IndexOfSequence<T>(this IEnumerable<T> source, IEnumerable<T> sequence, IEqualityComparer<T> comparer)
    {
        var seq = sequence.ToArray();

        int p = 0; // current position in source sequence
        int i = 0; // current position in searched sequence
        var prospects = new List<int>(); // list of prospective matches
        foreach (var item in source)
        {
            // Remove bad prospective matches
            prospects.RemoveAll(k => !comparer.Equals(item, seq[p - k]));

            // Is it the start of a prospective match ?
            if (comparer.Equals(item, seq[0]))
            {
                prospects.Add(p);
            }

            // Does current character continues partial match ?
            if (comparer.Equals(item, seq[i]))
            {
                i++;
                // Do we have a complete match ?
                if (i == seq.Length)
                {
                    // Bingo !
                    return p - seq.Length + 1;
                }
            }
            else // Mismatch
            {
                // Do we have prospective matches to fall back to ?
                if (prospects.Count > 0)
                {
                    // Yes, use the first one
                    int k = prospects[0];
                    i = p - k + 1;
                }
                else
                {
                    // No, start from beginning of searched sequence
                    i = 0;
                }
            }
            p++;
        }
        // No match
        return -1;
    }
}

我没有完全测试,所以可能仍包含错误。我只是做了著名的极端案例了一些测试,以确保我没有陷入明显的陷阱。似乎做工精细迄今为止...

I didn't fully test it, so it might still contain bugs. I just did a few tests on well-known corner cases to make sure I wasn't falling into obvious traps. Seems to work fine so far...

我觉得复杂度接近O(n)的,但我不是大O符号的专家,所以我可能是错的......至少它只枚举源序列一次,永远内部消除回去,所以它应该是相当有效。

I think the complexity is close to O(n), but I'm not an expert of Big O notation so I could be wrong... at least it only enumerates the source sequence once, whithout ever going back, so it should be reasonably efficient.

这篇关于查找序列中的IEnumerable&LT; T&GT;使用LINQ的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆