Why does my non-greedy Perl regex still match too much?


Question

Say I have a line that contains the following string:


"$tom" said blah blah blash.  "$dick" said "blah blah blah". "$harry" said blah blah blah.

I want to extract:


"$dick" said "blah blah blah"

I have the following code:

my ($term) = /(".+?" said ".+?")/g;
print $term;

But it gives me more than I need:


"$tom" said blah blah blash.  "$dick" said "blah blah blah"

I tried grouping the whole pattern with non-capturing parentheses:

my ($term) = /((?:".+?" said ".+?"))/g;

But the problem persists.

I've reread the "Nongreedy Quantifiers" section of Learning Perl, but it has gotten me nowhere so far.

Thanks for any guidance you can generously offer :)

Recommended answer

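The problem is that, even though the quantifier is not greedy, the regex still keeps trying: a non-greedy quantifier only prefers the shortest match, and it extends that match whenever the rest of the pattern fails. The regex doesn't see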

"$tom" said blah blah blash.

and think, "Oh, the stuff following "said" isn't quoted, so I'll skip that one." It thinks, "Well, the stuff after "said" isn't quoted, so it must still be part of our quote." So the first ".+?" matches

"$tom" said blah blah blash.  "$dick"
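
To see this concretely, here is a minimal sketch that runs the question's pattern with each quoted part in its own capture group. The variable names ($line, $first, $second) are chosen for illustration and are not from the original post; the sample line is single-quoted so that $tom, $dick and $harry stay literal text.

```perl
use strict;
use warnings;

# The sample line from the question (single quotes: no variable interpolation).
my $line = '"$tom" said blah blah blash.  "$dick" said "blah blah blah". "$harry" said blah blah blah.';

# Same pattern as in the question, but with the two quoted parts captured separately.
my ($first, $second) = $line =~ /(".+?") said (".+?")/;

print "first:  $first\n";    # first:  "$tom" said blah blah blash.  "$dick"
print "second: $second\n";   # second: "blah blah blah"
```

The match starts at the very first quote mark, and because "." may also match a quote mark, the first group simply grows until a quoted string finally follows "said".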

What you want is "[^"]+". This matches two quote marks enclosing one or more characters that are not quote marks, so a match can never run past a closing quote. So the final solution is:

("[^"]+" said "[^"]+")
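
As a quick check, the pattern above can be run against the sample line from the question. This is only a sketch: the variable names $line and $term are illustrative, and the line is single-quoted so that $tom, $dick and $harry are treated as literal text rather than Perl variables.

```perl
use strict;
use warnings;

# The sample line from the question (single quotes: no variable interpolation).
my $line = '"$tom" said blah blah blash.  "$dick" said "blah blah blah". "$harry" said blah blah blah.';

# Negated character class: neither quoted part may contain a quote mark.
my ($term) = $line =~ /("[^"]+" said "[^"]+")/;

print "$term\n";   # prints: "$dick" said "blah blah blah"
```

Attempts that start at the quotes around $tom fail, because [^"]+ cannot cross a quote mark to reach a later "said", so the engine moves on and the first successful match starts at "$dick".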
