Regex match takes a very long time to execute
Problem description
I wrote a regular expression that parses a file path into different groups (DRIVE, DIR, FILE, EXTENSION).
^((?<DRIVE>[a-zA-Z]):\\)*((?<DIR>[a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+)))\\)*(?<FILE>([a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+))\.(?<EXTENSION>[a-zA-Z0-9]{1,6})$))
I tested it in C#. When the path I test is valid, the result comes back very quickly, which is what I expected.
string path = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor.csproj";
=> OK
But when I test with a path that I know will not match, like this:
string path = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor?!??????";
=> ERROR
The test freezes when it reaches this line of code:
Match match = s_fileRegex.Match(path);
When I look in Process Explorer, I see the process QTAgent32.exe stuck at 100% CPU. What does this mean?
Recommended answer
The problem you are experiencing is called catastrophic backtracking. It is caused by the large number of ways in which your regular expression can match the start of the string. Because the .NET regular expression engine is a backtracking engine, it tries those alternatives one after another before it can report that there is no match, which is why a non-matching path is so slow.
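One way to stop a pattern like this from freezing a test run is to give the regex a match timeout. The sketch below is not part of the original answer. It assumes .NET Framework 4.5 or later, where the Regex constructor accepts a TimeSpan and the engine throws RegexMatchTimeoutException once that time is exceeded. The class name TimeoutSketch and the helper TryMatch are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeoutSketch
{
    // The pattern from the question, unchanged.
    public const string OriginalPattern =
        @"^((?<DRIVE>[a-zA-Z]):\\)*((?<DIR>[a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+)))\\)*(?<FILE>([a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+))\.(?<EXTENSION>[a-zA-Z0-9]{1,6})$))";

    // Hypothetical helper: returns true or false when matching finishes,
    // or null when the engine gives up after the timeout.
    public static bool? TryMatch(string pattern, string input, TimeSpan timeout)
    {
        var regex = new Regex(pattern, RegexOptions.None, timeout);
        try
        {
            return regex.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
    }

    public static void Main()
    {
        string path = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor?!??????";
        bool? result = TryMatch(OriginalPattern, path, TimeSpan.FromSeconds(1));
        Console.WriteLine(result.HasValue
            ? "Finished, matched: " + result.Value
            : "Gave up after the timeout");
    }
}
```

A timeout only limits the damage. The pattern itself still needs to be fixed so that it cannot backtrack excessively.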
I think you are using * too frequently in your regular expression. * does not mean "concatenate"; it means "0 or more times". For example, there should not be a * here:
((?<DRIVE>[a-zA-Z]):\\)*
There should be at most one drive specification. You should use ? here instead, or no quantifier at all if you want the drive specification to be compulsory. Similarly, there appear to be other places in your regular expression where the quantifier is incorrect.
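As an illustration of this advice, here is one possible rewrite of the pattern. It is a sketch, not the answerer's own regex. The drive group uses ? instead of *, and each directory or file name is written as runs of word characters joined by runs of separators (whitespace, hyphen, dot), so a given name can be matched in only one way. The class name PathRegexSketch and the constant Name are hypothetical. One assumed difference from the original: this version also accepts single-character names, which the original pattern rejected.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathRegexSketch
{
    // One path segment: word-character runs joined by separator runs.
    // It starts and ends with a word character, as in the original pattern.
    private const string Name = @"[a-zA-Z0-9_]+(?:[\s\-.]+[a-zA-Z0-9_]+)*";

    public static readonly Regex FileRegex = new Regex(
        @"^(?:(?<DRIVE>[a-zA-Z]):\\)?" +        // at most one drive specification
        @"(?:(?<DIR>" + Name + @")\\)*" +       // zero or more directories
        @"(?<FILE>" + Name + @"\.(?<EXTENSION>[a-zA-Z0-9]{1,6}))$");

    public static void Main()
    {
        string good = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor.csproj";
        string bad = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor?!??????";

        Match match = FileRegex.Match(good);
        Console.WriteLine(match.Success);
        Console.WriteLine(match.Groups["DRIVE"].Value);
        Console.WriteLine(match.Groups["FILE"].Value);
        Console.WriteLine(match.Groups["EXTENSION"].Value);

        // The invalid path should now be rejected without a long search.
        Console.WriteLine(FileRegex.Match(bad).Success);
    }
}
```

Because DIR sits inside a repeated group, match.Groups["DIR"].Value holds only the last directory. The full list is available through match.Groups["DIR"].Captures.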