Why does the order of alternatives matter in regex?

Problem description

Code

using System;
using System.Text.RegularExpressions;

namespace RegexNoMatch {
    class Program {
        static void Main () {
            string input = "a foobar& b";
            string regex1 = "(foobar|foo)&?";
            string regex2 = "(foo|foobar)&?";
            string replace = "$1";
            Console.WriteLine(Regex.Replace(input, regex1, replace));
            Console.WriteLine(Regex.Replace(input, regex2, replace));
            Console.ReadKey();
        }
    }
}

Expected output

a foobar b
a foobar b

Actual output

a foobar b
a foobar& b

Question

Why does the replacement not work when the order of "foo" and "foobar" in the regex pattern is changed? How can this be fixed?

Recommended answer

The regular expression engine tries the alternatives in the order in which they are specified. So when the pattern is (foo|foobar)&? it matches foo first. The optional &? then matches the empty string, because the next character is b, so the overall match succeeds with just foo and the engine never goes back to try foobar. Replacing foo with $1 (which is also foo) changes nothing, and the rest of the input string, bar& b, contains no further match, so the & stays where it is.
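To see this directly, it can help to print what each pattern actually matches rather than only the result of the replacement. The following is a minimal sketch; the class name AlternationTrace and the helper MatchedTexts are made up for this illustration and are not part of the original question.

```csharp
using System;
using System.Text.RegularExpressions;

static class AlternationTrace
{
    // Hypothetical helper: returns the text of every match of `pattern` in `input`.
    public static string[] MatchedTexts(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        var texts = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            texts[i] = matches[i].Value;
        }
        return texts;
    }

    static void Main()
    {
        const string input = "a foobar& b";

        // Longer alternative first: the whole "foobar&" is consumed.
        Console.WriteLine(string.Join(", ", MatchedTexts(input, "(foobar|foo)&?"))); // foobar&

        // Shorter alternative first: only "foo" is consumed, and "&?" matches nothing.
        Console.WriteLine(string.Join(", ", MatchedTexts(input, "(foo|foobar)&?"))); // foo
    }
}
```

With the second pattern the match never reaches the &, which is why the replacement leaves it in the output.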

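In other words, because foo is a prefix of foobar, (foo|foobar) always matches foo first, and the engine only falls back to foobar if the rest of the pattern fails after foo. Here the rest of the pattern is the optional &?, which cannot fail, so foobar is never tried. Listing the longer alternative first, as in (foobar|foo)&?, avoids the problem.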
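As for how to fix it, here is a sketch of a few possible rewrites, assuming the goal is simply to drop an optional & after either word. The class name AlternationFixes and the helper Strip are illustrative names, not from the original post. The third pattern is included only to show that the second alternative can still match when something after the group forces the engine to backtrack.

```csharp
using System;
using System.Text.RegularExpressions;

static class AlternationFixes
{
    // Hypothetical helper: replaces every match with the text of group 1.
    public static string Strip(string input, string pattern) =>
        Regex.Replace(input, pattern, "$1");

    static void Main()
    {
        const string input = "a foobar& b";

        // Fix 1: put the longer alternative first.
        Console.WriteLine(Strip(input, "(foobar|foo)&?"));  // a foobar b

        // Fix 2: factor out the common prefix; the greedy (?:bar)? tries "bar" first.
        Console.WriteLine(Strip(input, "(foo(?:bar)?)&?")); // a foobar b

        // A mandatory & fails after "foo" (next char is 'b'), so the engine
        // backtracks and tries "foobar", which then succeeds.
        Console.WriteLine(Strip(input, "(foo|foobar)&"));   // a foobar b
    }
}
```

The third variant only matches words that are actually followed by &, which is harmless here because a replacement with $1 would leave the other occurrences unchanged anyway.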

Occasionally, this can actually be a very useful trick. The pattern (o|a|(\w)) lets you capture \w separately from a or o, because the inner group only participates in the match when neither o nor a matched first:

Regex.Replace("a foobar& b", "(o|a|(\\w))", "$2") // " fbr& b" (note the leading space)
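
The same idea can be made more explicit with a MatchEvaluator that checks whether the inner group took part in the match. This is only a sketch: the class name CaptureByBranch, the method MarkOtherLetters and the bracket marking are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class CaptureByBranch
{
    // Hypothetical helper: leaves 'o' and 'a' untouched and wraps every other
    // word character in brackets, based on whether group 2 participated.
    public static string MarkOtherLetters(string input) =>
        Regex.Replace(input, "(o|a|(\\w))",
            m => m.Groups[2].Success ? "[" + m.Groups[2].Value + "]" : m.Value);

    static void Main()
    {
        Console.WriteLine(MarkOtherLetters("a foobar& b")); // a [f]oo[b]a[r]& [b]
    }
}
```

Group 2 reports Success only for characters that reached the third alternative, so the earlier alternatives act as a filter.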
