Regex match takes a very long time to execute

Question

I wrote a regular expression that parses a file path into different groups (DRIVE, DIR, FILE, EXTENSION).

^((?<DRIVE>[a-zA-Z]):\\)*((?<DIR>[a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+)))\\)*(?<FILE>([a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+))\.(?<EXTENSION>[a-zA-Z0-9]{1,6})$))

I tested it in C#. When the path I test is valid, the result comes back very quickly, which is what I expected.

string path = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor.csproj";

=> OK

But when I test with a path that I know will not match, like this:

string path = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor?!??????";

=> Error

The test freezes when I call this line of code:

Match match = s_fileRegex.Match(path);

When I look at Process Explorer, I see the process QTAgent32.exe sitting at 100% of my processor. What does that mean?

Recommended answer

The problem you are experiencing is called catastrophic backtracking. It is caused by the large number of ways in which your regular expression can match the start of the string. The regular expression engine in .NET is a backtracking engine, so when the overall match fails it tries every one of those ways before giving up. That is why a valid path matches quickly while an invalid one appears to hang.
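As a stopgap that is not part of the answer itself, a match timeout can keep a pattern like this from hanging a process. The sketch below assumes .NET Framework 4.5 or later, where the `Regex` constructor accepts a `matchTimeout` argument (the Visual Studio 2010 setup in the question may predate it). `RegexTimeoutDemo` and `TryMatch` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexTimeoutDemo
{
    // The pattern from the question, unchanged.
    const string OriginalPattern =
        @"^((?<DRIVE>[a-zA-Z]):\\)*((?<DIR>[a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+)))\\)*(?<FILE>([a-zA-Z0-9_]+(([a-zA-Z0-9_\s_\-\.]*[a-zA-Z0-9_]+)|([a-zA-Z0-9_]+))\.(?<EXTENSION>[a-zA-Z0-9]{1,6})$))";

    // Hypothetical helper: returns false if the engine hit the timeout,
    // true if it finished (check match.Success for the actual result).
    public static bool TryMatch(string path, TimeSpan timeout, out Match match)
    {
        var regex = new Regex(OriginalPattern, RegexOptions.None, timeout);
        try
        {
            match = regex.Match(path);
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            // The engine was still backtracking when the timeout expired.
            match = Match.Empty;
            return false;
        }
    }

    static void Main()
    {
        string path = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor?!??????";

        Match match;
        if (TryMatch(path, TimeSpan.FromSeconds(2), out match))
            Console.WriteLine("Finished, success = " + match.Success);
        else
            Console.WriteLine("Gave up: the match timed out.");
    }
}
```

A timeout only bounds the damage. The pattern itself still needs to be fixed so that each part of the path can be matched in only one way.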

I think you are using * too frequently in your regular expression. * does not mean "concatenate"; it means "0 or more times". For example, there should not be a * here:

((?<DRIVE>[a-zA-Z]):\\)*

There should be at most one drive specification. You should use ? here instead, or no quantifier at all if you want the drive specification to be compulsory. Similarly, there appear to be other places in your regular expression where the quantifier is incorrect.
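The answer does not spell out the other quantifier fixes, so the following is only one possible rewrite, offered as a sketch. It makes the drive optional with `?` and collapses each name into "one word character, any allowed characters, one word character". This is meant to accept the same directory and file names as the original (at least two characters, starting and ending with a letter, digit or underscore) while leaving only one way to match each name. `PathRegexSketch` and `FileRegex` are hypothetical names, and the one-second timeout is an extra safety net that assumes .NET Framework 4.5 or later.

```csharp
using System;
using System.Text.RegularExpressions;

static class PathRegexSketch
{
    // Assumed rewrite of the question's pattern; not taken from the answer.
    const string Pattern =
        // At most one drive specification ("?" instead of "*").
        @"^(?:(?<DRIVE>[a-zA-Z]):\\)?" +
        // Zero or more directories, each terminated by a backslash.
        @"(?:(?<DIR>[a-zA-Z0-9_][a-zA-Z0-9_\s\-.]*[a-zA-Z0-9_])\\)*" +
        // File name followed by an extension of 1 to 6 alphanumerics.
        @"(?<FILE>[a-zA-Z0-9_][a-zA-Z0-9_\s\-.]*[a-zA-Z0-9_]\.(?<EXTENSION>[a-zA-Z0-9]{1,6}))$";

    public static readonly Regex FileRegex =
        new Regex(Pattern, RegexOptions.None, TimeSpan.FromSeconds(1));

    static void Main()
    {
        string good = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor.csproj";
        string bad = @"C:\Documents and Settings\jhr\My Documents\Visual Studio 2010\Projects\FileEncryptor\Dds.FileEncryptor\Dds.FileEncryptor?!??????";

        Match match = FileRegex.Match(good);
        Console.WriteLine("good: " + match.Success);
        Console.WriteLine("DRIVE: " + match.Groups["DRIVE"].Value);
        // DIR is captured once per directory; Captures holds all of them.
        foreach (Capture dir in match.Groups["DIR"].Captures)
            Console.WriteLine("DIR: " + dir.Value);
        Console.WriteLine("FILE: " + match.Groups["FILE"].Value);
        Console.WriteLine("EXTENSION: " + match.Groups["EXTENSION"].Value);

        Console.WriteLine("bad: " + FileRegex.Match(bad).Success);
    }
}
```

Because the character classes exclude the backslash and every directory must end with one, the engine has no alternative ways to split the path, so a non-matching input should be rejected quickly instead of triggering exponential backtracking.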
