What is the difference between .*? and .* regular expressions?


Question

I'm trying to split up a string into two parts using regex. The string is formatted as follows:

text to extract<number>

I've been using (.*?)< and <(.*?)>, which work fine, but after reading up on regex a little I've started to wonder why I need the ? in the expressions. I only wrote them that way after finding them on this site, so I'm not exactly sure what the difference is.

Answer

It is the difference between greedy and non-greedy quantifiers.

Consider the input 101000000000100.

Using 1.*1, the * is greedy: .* first consumes everything up to the end of the input, then backtracks one character at a time until the final 1 of the pattern can match, leaving you with 1010000000001.
Using 1.*?1, the *? is non-greedy: .*? first matches nothing, then takes one extra character at a time until the final 1 of the pattern can match, ending up with 101.
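
To see both behaviours side by side, here is a minimal sketch in Python (my choice of language, the question doesn't name one), using the standard re module on the same input. The variable names are just for illustration.

```python
import re

text = "101000000000100"

# Greedy: .* grabs as much as possible, then backtracks to the last 1.
greedy = re.search(r"1.*1", text).group()

# Non-greedy: .*? grabs as little as possible, stopping at the first 1 it can.
lazy = re.search(r"1.*?1", text).group()

print(greedy)  # 1010000000001
print(lazy)    # 101
```

Both patterns start matching at the first character; only the amount consumed by the quantifier differs.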

All quantifiers have a non-greedy mode: .*?, .+?, .{2,6}?, and even .??.
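
As a rough illustration, this Python sketch runs each quantifier and its non-greedy counterpart against a made-up string of eight a's. The input and the results dictionary are assumptions for the demo, not something from the question.

```python
import re

text = "aaaaaaaa"  # eight characters, made up for the demo

# Each greedy quantifier paired with its non-greedy form.
pairs = [
    (r".*", r".*?"),
    (r".+", r".+?"),
    (r".{2,6}", r".{2,6}?"),
    (r".?", r".??"),
]

results = {}
for greedy, lazy in pairs:
    # re.match anchors at the start, so only the quantifier decides the length.
    results[greedy] = re.match(greedy, text).group()
    results[lazy] = re.match(lazy, text).group()

for pattern, matched in results.items():
    print(f"{pattern!r:12} matched {len(matched)} character(s)")
```

With nothing after the quantifier in the pattern, the greedy form takes the maximum it is allowed (8, 8, 6 and 1 characters) and the non-greedy form takes the minimum (0, 1, 2 and 0).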

In your case, a similar pattern could be <([^>]*)> - matching anything but a greater-than sign (strictly speaking, it matches zero or more characters other than > in-between < and >).
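
Here is a small Python sketch of how that could look for the format in the question. The sample strings "some text<42>" and "a<1>b<2>" are invented for illustration, and combining the two parts into a single pattern is my assumption about what you want.

```python
import re

sample = "some text<42>"  # hypothetical input in the "text<number>" format

# Text before the first <, then everything up to the next > inside the brackets.
match = re.match(r"(.*?)<([^>]*)>", sample)
text_part, number_part = match.groups()
print(text_part, number_part)  # some text 42

# With more than one <...> pair, the greedy version overshoots.
tricky = "a<1>b<2>"
greedy_inner = re.search(r"<(.*)>", tricky).group(1)
lazy_inner = re.search(r"<(.*?)>", tricky).group(1)
class_inner = re.search(r"<([^>]*)>", tricky).group(1)
print(greedy_inner)  # 1>b<2
print(lazy_inner)    # 1
print(class_inner)   # 1
```

The negated character class gives the same result as the non-greedy version here, and it cannot run past a > because > is excluded from what it matches.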

See Quantifier Cheat Sheet.
