TCL string match vs regexps


Problem description

Is it right that we should avoid using regexp because it is slow, and use string operations instead? Are there cases where both can be used but regexp is the better choice?

Solution

You should use the appropriate tool for the job. That means you should not avoid regex as a rule; use it when it is necessary.

If you are just searching for a fixed sequence of characters, use string operations.

If you are searching for a pattern, then use regular expressions.
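
As a rough illustration of that split, here is a minimal Tcl sketch. The sample line and the date pattern are made up for this example and are not part of the original answer.

```tcl
# Hypothetical sample line, invented for illustration.
set line "Build 42 finished on 2024-01-15 without errors"

# Fixed sequence of characters: string first returns the index of the
# first occurrence, or -1 if the text is absent.
set idx [string first "finished" $line]   ;# 9

# Glob-style check (not a regex): * matches any run of characters.
set hasGlob [string match "*finished*" $line]

# A real pattern (four digits, dash, two digits, dash, two digits)
# is a job for regexp. The matched text is stored in the variable date.
set found [regexp {\d{4}-\d{2}-\d{2}} $line date]

puts "idx=$idx hasGlob=$hasGlob found=$found date=$date"
```

The first two checks only need to know whether a literal string occurs, so the string commands are enough. The date has no fixed text to search for, which is where regexp earns its place.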

Example

Say you search for the word "Foo". With string operations you will also find "Foobar". Is that OK? If not, you could search for "Foo " (with a trailing space) instead, but then you will miss "Foo," and "Foo.".

With a regex this is no problem: you can match on word boundaries with /\mFoo\M/ (in Tcl, \m matches at the start of a word and \M at the end), and this regex will not be slow.
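
The "Foo" case can be tried out with a short Tcl sketch. The proc name classify and the sample strings are hypothetical, chosen only to cover the cases mentioned above.

```tcl
# Return three flags for the string s:
#   plain  - "Foo" occurs anywhere (string first)
#   padded - "Foo " with a trailing space occurs (string first)
#   word   - "Foo" occurs as a whole word (regexp with \m and \M)
proc classify {s} {
    set plain  [expr {[string first "Foo" $s] >= 0}]
    set padded [expr {[string first "Foo " $s] >= 0}]
    set word   [regexp {\mFoo\M} $s]
    return [list $plain $padded $word]
}

# Hypothetical sample inputs.
set samples [list \
    "Foobar is not the word" \
    "Foo is the word" \
    "Say Foo, then stop" \
    "It ends with Foo."]

foreach s $samples {
    lassign [classify $s] plain padded word
    puts [format "%-24s plain=%d padded=%d word=%d" $s $plain $padded $word]
}
```

Only the regexp column gets all four inputs right: the plain search also accepts "Foobar", and the padded search rejects "Foo," and "Foo.".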

I think this negative image comes from special problems like catastrophic backtracking.

There has been a recent example (catastrophic-backtracking-shouldnt-be-happening-on-this-regex) where this behaviour was unexpected.

Conclusion

A regex has to be well designed; if it isn't, the performance can be catastrophic. But the same can happen to your normal code if you use a bad algorithm.

For a small job, using a regex should almost never be a problem. If your task is bigger and has to be repeated often, do a benchmark.
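
One way to run such a benchmark in Tcl is the built-in time command. The sketch below is only a starting point: the sample line and the iteration count are arbitrary assumptions, and the numbers it prints depend on your machine, Tcl version and data.

```tcl
# Hypothetical input; replace it with data that resembles your real workload.
set line "some text with Foo somewhere in the middle of the line"
set n 100000

# time runs the script n times and returns a string of the form
# "<number> microseconds per iteration".
set tFirst  [time {string first "Foo" $line} $n]
set tGlob   [time {string match "*Foo*" $line} $n]
set tRegexp [time {regexp {\mFoo\M} $line} $n]

puts "string first : $tFirst"
puts "string match : $tGlob"
puts "regexp       : $tRegexp"
```

Measure with inputs and patterns close to the real ones, since a single short line like this says little about behaviour on large or awkward data.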

From my own experience: I analyze really big text files (several hundred MB) and use regexes to find the rows I am interested in, and I don't experience performance problems because of the regexes.
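
The answer does not show how those files are scanned. A minimal sketch of one common approach, reading line by line and testing each line with regexp, could look like this. The proc name matchingLines, the demo file contents and the pattern are all assumptions made for illustration, and try/finally and file tempfile require Tcl 8.6.

```tcl
# Return the lines of the file at path that match the regular expression
# pattern. The file is read one line at a time rather than loaded whole.
proc matchingLines {path pattern} {
    set result {}
    set fh [open $path r]
    try {
        while {[gets $fh line] >= 0} {
            # -- ends option parsing, so a pattern starting with - is safe.
            if {[regexp -- $pattern $line]} {
                lappend result $line
            }
        }
    } finally {
        close $fh
    }
    return $result
}

# Tiny demo file standing in for a large log (hypothetical contents).
set fh [file tempfile path]
puts $fh "INFO started"
puts $fh "ERROR disk full"
puts $fh "INFO ERRORS counted: 1"
puts $fh "ERROR timeout"
close $fh

set hits [matchingLines $path {^ERROR\M}]
puts $hits
file delete $path
```

Collecting every hit in a list is fine for a demo. For very large files with many matches it may be better to process each matching line as it is found.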

Here is an interesting read about code optimization
