Regular expressions: the question mark and "lazy" mode

Views: 77 Published: 2020/7/1 19:51:17 Tags: regex regex-greedy non-greedy


Question

I understand that the ? mark here means "lazy".

My question is essentially: [0-9]{2}? vs [0-9]{2}

Are they the same?
If so, why would we write the former expression? Isn't lazy mode more expensive performance-wise?
If not, what is the difference?

Recommended answer

There is no difference between [0-9]{2} and [0-9]{2}?.
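As a quick check of this claim, here is a minimal sketch using Python's re module. The sample string and variable names are made up for illustration, and other regex engines may differ in details.

```python
import re

# Hypothetical sample text with digit runs of different lengths.
text = "a1b22c333d4444"

# Fixed repetition, written without and with the trailing "?".
fixed = re.findall(r"[0-9]{2}", text)
fixed_lazy = re.findall(r"[0-9]{2}?", text)

print(fixed)       # ['22', '33', '44', '44']
print(fixed_lazy)  # ['22', '33', '44', '44']
```

Both patterns must consume exactly two digits, so the lazy marker has nothing to shorten.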

The difference between greedy matching and lazy matching (the addition of a ?) has to do with backtracking. Regular expression engines are built to match text from left to right. It is therefore logical that, by default, when you ask an expression to match a range of characters, it matches as many as possible.

Assume we have the string acac123.

If we use a greedy match of [a-z]+c (+ standing for 1+ repetitions, or {1,}):

[a-z]+ would match acac and stop at 1, which is not a letter
then we would try to match the c, but fail at 1
now we start backtracking, and successfully match aca and c
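The end result of these steps can be observed in Python's re module. This is a small sketch: the capture groups are added only to show how the match is split between [a-z]+ and c, and are not part of the original pattern.

```python
import re

# Greedy: [a-z]+ first takes "acac", then gives back one character
# so that the trailing "c" can match.
m = re.search(r"([a-z]+)(c)", "acac123")

greedy_groups = m.groups()
greedy_whole = m.group(0)

print(greedy_groups)  # ('aca', 'c')
print(greedy_whole)   # acac
```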

If we make this lazy ([a-z]+?c), we will both get a different result (in this case) and be more efficient:

[a-z]+? would match a, but then stop, because the next character matches the rest of the expression (c)
the c would then match, successfully matching a and c (with no backtracking)
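The lazy variant can be checked the same way. Again a sketch with Python's re module, with capture groups added purely for illustration.

```python
import re

# Lazy: [a-z]+? takes the minimum (one letter) and lets "c" match
# as soon as it can.
m = re.search(r"([a-z]+?)(c)", "acac123")

lazy_groups = m.groups()
lazy_whole = m.group(0)

print(lazy_groups)  # ('a', 'c')
print(lazy_whole)   # ac
```

On the same input the greedy pattern [a-z]+c matches acac, so here laziness changes the result, not just the amount of work.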

Now you can see that there will be no difference between X{#} and X{#}?, because {#} is not a range and even a greedy match will not experience any backtracking. Lazy matches are often used with * (0+ repetitions, or {0,}) or +, but can also be used with ranges {m,n} (where n is optional).
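To illustrate lazy matching on ranges, the following sketch compares greedy and lazy forms of {m,n} and {m,} in Python's re module. The input string 12345 is an arbitrary example.

```python
import re

digits = "12345"  # arbitrary sample input

greedy_range = re.search(r"[0-9]{2,4}", digits).group()
lazy_range = re.search(r"[0-9]{2,4}?", digits).group()
greedy_open = re.search(r"[0-9]{2,}", digits).group()
lazy_open = re.search(r"[0-9]{2,}?", digits).group()

print(greedy_range)  # 1234  (as many as allowed)
print(lazy_range)    # 12    (as few as allowed)
print(greedy_open)   # 12345
print(lazy_open)     # 12
```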

This is essential when you want to match as few characters as possible, and you will often see .*? in an expression when you want to fill up some space (foo.*?bar on the string foo bar filler text bar). However, a lazy match is often a sign of a bad or inefficient regex. Many people write something like foo:"(.*?)" to match everything within double quotes, when you can avoid the lazy match by writing the expression as foo:"([^"]+)", which matches anything but double quotes.
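Both cases from this paragraph can be tried out directly. This is a sketch using Python's re module. The second sample string (with the made-up key other) is an assumption added so that greedy and lazy behavior visibly differ.

```python
import re

# Filling space up to the first "bar" versus the last "bar".
text = "foo bar filler text bar"
lazy_fill = re.search(r"foo.*?bar", text).group()
greedy_fill = re.search(r"foo.*bar", text).group()
print(lazy_fill)    # foo bar
print(greedy_fill)  # foo bar filler text bar

# Capturing a quoted value: lazy dot versus negated character class.
quoted = 'foo:"hello" other:"world"'  # hypothetical input
lazy_quote = re.search(r'foo:"(.*?)"', quoted).group(1)
negated_quote = re.search(r'foo:"([^"]+)"', quoted).group(1)
greedy_quote = re.search(r'foo:"(.*)"', quoted).group(1)
print(lazy_quote)     # hello
print(negated_quote)  # hello
print(greedy_quote)   # hello" other:"world
```

The two quote patterns are not strictly interchangeable: [^"]+ requires at least one character and also matches newlines, whereas .*? accepts an empty value and, by default in Python, does not cross a newline.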

Final note: ? typically means "optional", i.e. match {0,1} times. ? only makes a match lazy when it follows a range quantifier ({m,n}, *, +, or another ?). This means X? will not make X lazy (since we already said {#}? is pointless); instead it makes X optional. However, you can write a lazy "optional" match: [0-9]?? will lazily match 0-1 times.
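The difference between ? and ?? is easiest to see when nothing else in the pattern forces the digit to be consumed. A minimal sketch in Python's re module, with arbitrary inputs:

```python
import re

# Optional and greedy: takes the digit when it is there.
optional = re.match(r"[0-9]?", "5").group()

# Optional and lazy: prefers zero repetitions, so it matches the empty string.
lazy_optional = re.match(r"[0-9]??", "5").group()

# When the rest of the match requires it, the lazy form still consumes the digit.
forced = re.fullmatch(r"[0-9]??", "5").group()

# With a following literal, the two forms split "55" differently.
greedy_suffix = re.match(r"[0-9]?5", "55").group()
lazy_suffix = re.match(r"[0-9]??5", "55").group()

print(repr(optional))       # '5'
print(repr(lazy_optional))  # ''
print(repr(forced))         # '5'
print(greedy_suffix)        # 55
print(lazy_suffix)          # 5
```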
