Generate all matches for regex


Question

For a user selection, I would like to provide a list of numbers that match a given regex. The regex itself is very simple; it can only look like 123[0-9][0-9] or [4-9]34.2

I found that Fare (https://github.com/moodmosaic/Fare) does part of the job. See the following example:

string pattern = "123[0-9][0-9]";
var xeger = new Xeger(pattern);
var match = xeger.Generate(); //match is e.g. "12349"

Unfortunately, Fare only gives me one possible match, not all 100 possible combinations the pattern can produce.

If you think regex is not a good idea in this case and would rather suggest a for-loop implementation that replaces characters, I am happy to go with that too, but I currently don't know how to write it. Maybe a recursive function would be a clever approach?

Recommended answer

I'd rather write my own implementation than use a library. The following code does what you want to achieve.

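    // Requires: using System; using System.Collections.Generic; using System.Linq;
    //           using System.Text.RegularExpressions;

    // Matches a single-digit range such as [0-9] or [4-9].
    private static Regex regexRegex = new Regex("\\[(?<From>\\d)-(?<To>\\d)]", RegexOptions.Compiled);

    private static IEnumerable<string> GetStringsForRegex(string pattern)
    {
        // Start with a single empty prefix and extend it token by token.
        var strings = Enumerable.Repeat("", 1);
        var lastIndex = 0;
        foreach (Match m in regexRegex.Matches(pattern))
        {
            if (m.Index > lastIndex)
            {
                // Append the literal text between the previous range and this one.
                var capturedLastIndex = lastIndex;
                strings = strings.Select(s => s + pattern.Substring(capturedLastIndex, m.Index - capturedLastIndex));
            }
            int from = int.Parse(m.Groups["From"].Value);
            int to = int.Parse(m.Groups["To"].Value);
            if (from > to)
            {
                throw new InvalidOperationException("Range start must not exceed range end.");
            }
            // Branch every prefix into one string per digit in the range.
            strings = strings.SelectMany(s => Enumerable.Range(from, to - from + 1), (s, i) => s + i.ToString());
            lastIndex = m.Index + m.Length;
        }
        if (lastIndex < pattern.Length)
        {
            // Append any literal text after the last range.
            var capturedLastIndex = lastIndex;
            strings = strings.Select(s => s + pattern.Substring(capturedLastIndex));
        }
        return strings;
    }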

In essence, the code enumerates every string that matches the pattern, treating each single-digit range such as [0-9] as a set of alternatives and everything else as literal text. The results also come out in ascending lexicographic order.
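The question also asks whether a recursive function would work. The following is a minimal sketch of that alternative, not the answer's own method: it expands the first token of the pattern and recurses on the remainder. The names PatternExpander, Expand, LeadingRange and ExpandDemo are hypothetical, and the sketch assumes the same restricted syntax as above (single-digit ranges plus literal characters).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PatternExpander
{
    // Matches one single-digit range such as [0-9] at the start of the input.
    private static readonly Regex LeadingRange =
        new Regex(@"^\[([0-9])-([0-9])\]", RegexOptions.Compiled);

    // Expands the first token into its candidate characters and
    // combines each of them with every expansion of the remaining pattern.
    public static IEnumerable<string> Expand(string pattern)
    {
        if (pattern.Length == 0)
        {
            yield return "";
            yield break;
        }

        IEnumerable<char> heads;
        string rest;
        Match m = LeadingRange.Match(pattern);
        if (m.Success)
        {
            char lo = m.Groups[1].Value[0];
            char hi = m.Groups[2].Value[0];
            if (lo > hi)
            {
                throw new ArgumentException("Range start must not exceed range end.");
            }
            heads = Enumerable.Range(lo, hi - lo + 1).Select(c => (char)c);
            rest = pattern.Substring(m.Length);
        }
        else
        {
            // Any other character is treated as a literal.
            heads = new[] { pattern[0] };
            rest = pattern.Substring(1);
        }

        foreach (char head in heads)
        {
            foreach (string tail in Expand(rest))
            {
                yield return head + tail;
            }
        }
    }
}

public static class ExpandDemo
{
    public static void Main()
    {
        List<string> all = PatternExpander.Expand("123[0-9][0-9]").ToList();
        Console.WriteLine(all.Count); // 100
        Console.WriteLine(all[0]);    // 12300
        Console.WriteLine(all[99]);   // 12399
    }
}
```

This version re-expands the tail once per head character, which is acceptable for short patterns like the ones in the question but wasteful for long ones; the iterative LINQ version avoids that by extending prefixes instead.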

Note the capturedLastIndex variable. It is required because LINQ queries are evaluated lazily: without the local copy, the lambda would capture the lastIndex variable itself and read whatever value it holds when the sequence is finally enumerated, leading to incorrect results.
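The capture behaviour can be reproduced in isolation. The sketch below is illustrative only; ClosureDemo, WithSharedVariable and WithLocalCopy are hypothetical names, and the code is unrelated to the pattern expansion itself. It shows a deferred Select reading a variable that has changed by the time the query runs, and the same query with a local snapshot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    public static List<string> WithSharedVariable()
    {
        IEnumerable<string> items = new[] { "a", "b" };
        int index = 1;
        // The lambda captures the variable index, not its current value.
        items = items.Select(s => s + index);
        index = 2; // modified before the query is enumerated
        return items.ToList(); // enumeration happens here
    }

    public static List<string> WithLocalCopy()
    {
        IEnumerable<string> items = new[] { "a", "b" };
        int index = 1;
        // Snapshot the value in a variable that is never modified afterwards.
        int capturedIndex = index;
        items = items.Select(s => s + capturedIndex);
        index = 2;
        return items.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", WithSharedVariable())); // a2,b2
        Console.WriteLine(string.Join(",", WithLocalCopy()));      // a1,b1
    }
}
```

In the answer's code the same effect would occur with lastIndex, because it is reassigned on every loop iteration while the Select calls are only executed when the returned sequence is enumerated.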
