How can I match a quote-delimited string with a regex?


Question

If I'm trying to match a quote-delimited string with a regex, which of the following is "better" (where "better" means both more efficient and less likely to do something unexpected):

/"[^"]+"/ # match quote, then everything that's not a quote, then a quote

/".+?"/   # match quote, then *anything* (non-greedy), then a quote

Assume for this question that empty strings (i.e. "") are not an issue. It seems to me (no regex newbie, but certainly no expert) that these will be equivalent.

Update: Upon reflection, I think changing the + characters to * will handle empty strings correctly anyway.
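
As a quick check of the update above, here is a minimal Python sketch comparing the `+` and `*` variants on a subject that contains an empty quoted string. The subject text and variable names are made up for illustration, and Python's `re` module is an assumption; the question does not name a regex flavor.

```python
import re

# Hypothetical subject containing an empty quoted string and a normal one.
subject = 'a "" b "x"'

# With +, the empty string "" cannot match, so the engine pairs up the wrong quotes.
plus_negated = re.findall(r'"[^"]+"', subject)
plus_lazy = re.findall(r'".+?"', subject)

# With *, both quoted strings are found as intended.
star_negated = re.findall(r'"[^"]*"', subject)
star_lazy = re.findall(r'".*?"', subject)

print(plus_negated)  # ['" b "']
print(plus_lazy)     # ['"" b "']
print(star_negated)  # ['""', '"x"']
print(star_lazy)     # ['""', '"x"']
```

Note that the two `+` variants also disagree with each other here, so they are only equivalent when empty strings really cannot occur.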

Recommended answer

You should use number one, because number two is bad practice. Consider that the developer who comes after you wants to match strings that are followed by an exclamation point. Should he use:

"[^"]*"!

or:

".*?"!

The difference appears when you have the subject:

"one" "two"!

The first regex matches:

"two"!

while the second regex matches:

"one" "two"!

Always be as specific as you can. Use the negated character class when you can.
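
The difference on the subject `"one" "two"!` can be reproduced with a short Python sketch. Python's `re` module is an assumption here (the answer does not name a regex flavor), and the variable names are illustrative only.

```python
import re

subject = '"one" "two"!'

# Negated character class: the match can never run past a quote,
# so only the string directly followed by "!" qualifies.
negated = re.search(r'"[^"]*"!', subject)

# Lazy dot: "." may also consume quotes, so the match that starts at the
# first quote keeps expanding until it finally reaches '"!'.
lazy = re.search(r'".*?"!', subject)

print(negated.group())  # "two"!
print(lazy.group())     # "one" "two"!
```

The lazy quantifier only prefers the shortest match from a given starting position; it does not prevent the match from swallowing intermediate quotes.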

Another difference is that [^"]* can span across lines, while .* doesn't unless you use single line mode. [^"\n]* excludes the line breaks too.
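
A minimal sketch of the line-break behavior, again assuming Python's `re` module, where "single line mode" corresponds to the `re.DOTALL` flag. The sample text is hypothetical.

```python
import re

# Hypothetical quoted string that contains a line break.
text = '"first line\nsecond line"'

spans_lines = re.search(r'"[^"]*"', text)          # negated class crosses \n
dot_default = re.search(r'".*?"', text)            # "." stops at \n by default
dot_dotall = re.search(r'".*?"', text, re.DOTALL)  # single line mode
no_newline = re.search(r'"[^"\n]*"', text)         # class excludes \n as well

print(spans_lines is not None)  # True
print(dot_default is not None)  # False
print(dot_dotall is not None)   # True
print(no_newline is not None)   # False
```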

As for backtracking, the second regex backtracks for each and every character in every string that it matches. If the closing quote is missing, both regexes will backtrack through the entire file. Only the order in which they backtrack is different. Thus, in theory, the first regex is faster. In practice, you won't notice the difference.
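
To get a rough feel for the missing-closing-quote case, the following sketch builds a hypothetical subject with an opening quote and no closing quote, then times both patterns with `timeit`. Python's `re` module and the input size are assumptions; timings depend on the machine and the engine, so no expected numbers are shown.

```python
import re
import timeit

# Hypothetical subject: an opening quote that is never closed.
subject = '"' + 'x' * 100_000

negated = re.compile(r'"[^"]*"')
lazy = re.compile(r'".*?"')

# Both patterns have to scan the rest of the subject before giving up.
negated_result = negated.search(subject)
lazy_result = lazy.search(subject)
print(negated_result, lazy_result)  # None None

# Rough timing only; results vary by machine and regex engine.
for name, pattern in (("negated", negated), ("lazy", lazy)):
    seconds = timeit.timeit(lambda: pattern.search(subject), number=20)
    print(f"{name}: {seconds:.4f}s for 20 searches")
```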
