Regex Style Pattern Matching in .NET Object Lists


Problem description

I've got a collection of .NET objects with various properties. Let's say it's a chain of chromosomes in a genetic code, although the objects' data is a little more complex than that. I want to search the list for predefined sequences of objects. I can classify the objects into a finite number of unique types of interest (R, B, D), and in a massive list I want to find certain sequences of objects.

A massively simplified version would be:

public class Chromosome {
    public int Id { get; set; }

    // In the real code some logic works out the correct chromosome type;
    // a settable property is used here so the samples below compile.
    public ChromosomeType CromosomeType { get; set; }
}

public enum ChromosomeType {
  R, B, D
}

So, given a large collection of these types, I want to match certain sequences,

e.g. "R+B{3}D+"

So with the "regex" above, the following subsequence would be matched in a list: RRRBBBDD
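To make the intended semantics concrete, here is a minimal sketch that runs such a pattern over a plain string of type letters. The helper name `PatternDemo.FindRuns` is hypothetical and not part of the question; it only wraps the standard `Regex.Matches` call.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // Hypothetical helper: returns (start index, matched text) for every
    // non-overlapping match of the pattern in the sequence string.
    public static List<(int Index, string Value)> FindRuns(string sequence, string pattern)
    {
        var results = new List<(int Index, string Value)>();
        foreach (Match m in Regex.Matches(sequence, pattern))
        {
            results.Add((m.Index, m.Value));
        }
        return results;
    }
}
```

For example, `PatternDemo.FindRuns("DRRRBBBDDB", "R+B{3}D+")` yields a single entry with index 1 and value `RRRBBBDD`: one or more R, exactly three B, then one or more D.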

I need to be able to return all matches from a very long list of objects.

Clearly regex is perfect for this, but I don't actually have strings; I've got collections of objects.

What's the best way to search a collection of objects for predefined sequences?

Update

Colin's solution is the one I went with in the end. It works great. I updated it to handle multiple matches, and to use arrays in order to be as fast as possible.

Here is the final solution:

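    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;

    public static class ChromosomesExtensions
    {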
        public static IEnumerable<Chromosome[]> FindBySequence(this Chromosome[] chromosomes, string patternRegex)
        {
            var sequenceString
                = String.Join(
                    String.Empty, //no separator
                    (
                        from c in chromosomes
                        select c.CromosomeType.ToString()
                    )
                );
            MatchCollection matches = Regex.Matches(sequenceString, patternRegex);

            foreach (Match match in matches)
            {
                // Copy the objects at the matched positions; this relies on each
                // ChromosomeType name being exactly one character long.
                Chromosome[] subset = new Chromosome[match.Length];
                Array.Copy(chromosomes, match.Index, subset, 0, match.Length);
                yield return subset;
            }
        }
    }
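The solution above maps string positions back to array positions, which only works because every `ChromosomeType` name is a single character. As a rough sketch of how that assumption could be made explicit, the variant below takes a caller-supplied function that maps each item to exactly one `char`. The class name `SequenceSearch`, the generic signature, and the `toSymbol` parameter are assumptions for illustration, not part of the original solution.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class SequenceSearch
{
    // Hypothetical generic variant: toSymbol maps each item to exactly one char,
    // so position i in the built string always corresponds to items[i].
    public static IEnumerable<T[]> FindBySequence<T>(
        this IReadOnlyList<T> items, Func<T, char> toSymbol, string pattern)
    {
        var sb = new StringBuilder(items.Count);
        foreach (var item in items)
        {
            sb.Append(toSymbol(item));
        }

        // Regex.Matches returns non-overlapping matches, left to right.
        foreach (Match match in Regex.Matches(sb.ToString(), pattern))
        {
            var subset = new T[match.Length];
            for (var i = 0; i < match.Length; i++)
            {
                subset[i] = items[match.Index + i];
            }
            yield return subset;
        }
    }
}
```

With the `Chromosome` class from the question, a call might look like `chromosomes.FindBySequence(c => c.CromosomeType.ToString()[0], "R+B{3}D+")`. If the enum later gains longer names, only the mapping function needs to change, not the index arithmetic.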

    [TestFixture]
    public class TestClass
    {
        [Test]
        public void TestMethod()
        {
            var chromosomes =
                new[]
                {
                    new Chromosome(){ CromosomeType = ChromosomeType.D, Id = 1},
                    new Chromosome(){ CromosomeType = ChromosomeType.R, Id = 2 },
                    new Chromosome(){ CromosomeType = ChromosomeType.R, Id = 3 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 4 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 5 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 6 },
                    new Chromosome(){ CromosomeType = ChromosomeType.D, Id = 7 },
                    new Chromosome(){ CromosomeType = ChromosomeType.D, Id = 8 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 9 },
                    new Chromosome(){ CromosomeType = ChromosomeType.R, Id = 10 },
                    new Chromosome(){ CromosomeType = ChromosomeType.R, Id = 11 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 12 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 13 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 14 },
                    new Chromosome(){ CromosomeType = ChromosomeType.D, Id = 15 },
                    new Chromosome(){ CromosomeType = ChromosomeType.D, Id = 16 },
                    new Chromosome(){ CromosomeType = ChromosomeType.R, Id = 17 },
                    new Chromosome(){ CromosomeType = ChromosomeType.R, Id = 18 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 19 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 20 },
                    new Chromosome(){ CromosomeType = ChromosomeType.B, Id = 21 },
                    new Chromosome(){ CromosomeType = ChromosomeType.D, Id = 22 },
                    new Chromosome(){ CromosomeType = ChromosomeType.D, Id = 23 },
                };

            var matchIndex = 0;
            foreach (Chromosome[] match in chromosomes.FindBySequence("R+B{3}D+"))
            {
                Console.WriteLine($"Match {++matchIndex}");
                // Build one line per chromosome in the match.
                var result = string.Join("", match.Select(x => $"id: {x.Id} Type: {x.CromosomeType}\n"));
                Console.WriteLine(result);
            }

        }
    }

Output:

Match 1
id: 2 Type: R
id: 3 Type: R
id: 4 Type: B
id: 5 Type: B
id: 6 Type: B
id: 7 Type: D
id: 8 Type: D

Match 2
id: 10 Type: R
id: 11 Type: R
id: 12 Type: B
id: 13 Type: B
id: 14 Type: B
id: 15 Type: D
id: 16 Type: D

Match 3
id: 17 Type: R
id: 18 Type: R
id: 19 Type: B
id: 20 Type: B
id: 21 Type: B
id: 22 Type: D
id: 23 Type: D

Recommended answer

A simple, clean way using extension methods (that actually supports searching via Regex).

Class:

public static class ChromosomesExtensions
{
    public static IEnumerable<Chromosome> FindBySequence(this IEnumerable<Chromosome> chromosomes, string patternRegex)
    {
        var sequenceString
            = String.Join(
                String.Empty, //no separator
                (
                    from c in chromosomes
                    select c.CromosomeType.ToString()
                )
            );
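        var match = Regex.Match(sequenceString, patternRegex);
        //returns empty if no match is found
        if (!match.Success)
        {
            return Enumerable.Empty<Chromosome>();
        }
        //match.Index is the position of this match; IndexOf(match.Value) could hit an earlier identical substring
        return chromosomes.ToList().GetRange(match.Index, match.Length);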
    }
}

Usage:

var chromosomes =
    new[]
    {
        new Chromosome(){ CromosomeType = ChromosomeType.D },
        new Chromosome(){ CromosomeType = ChromosomeType.R },
        new Chromosome(){ CromosomeType = ChromosomeType.R },
        new Chromosome(){ CromosomeType = ChromosomeType.B },
        new Chromosome(){ CromosomeType = ChromosomeType.B },
        new Chromosome(){ CromosomeType = ChromosomeType.B },
        new Chromosome(){ CromosomeType = ChromosomeType.D },
        new Chromosome(){ CromosomeType = ChromosomeType.D },
        new Chromosome(){ CromosomeType = ChromosomeType.B },
    };

var queryResult = chromosomes.FindBySequence("R+B{3}D+");
