Not returning the whole pattern in regex in Python

Question

I have the following code:

import re

haystack = "aaa months(3) bbb"
needle = re.compile(r'(months|days)\([\d]*\)')
instances = list(set(needle.findall(haystack)))
print(instances)  # prints ['months'], not ['months(3)']

I'd expect it to print months(3), but instead I just get months. Is there any reason for this?

Recommended answer

needle = re.compile(r'((?:months|days)\([\d]*\))')

solves your problem.

You were capturing only the months|days part.
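As a minimal sketch of the difference, the snippet below runs both patterns against the haystack from the question. The variable names `capturing` and `whole` are illustrative only and do not appear in the original post.

```python
import re

haystack = "aaa months(3) bbb"

# One capturing group: findall returns only what that group matched.
capturing = re.compile(r'(months|days)\([\d]*\)')
capturing_hits = capturing.findall(haystack)
print(capturing_hits)  # ['months']

# Outer capturing group around the whole pattern, inner group non-capturing:
# the single captured group is now the entire match.
whole = re.compile(r'((?:months|days)\([\d]*\))')
whole_hits = whole.findall(haystack)
print(whole_hits)  # ['months(3)']
```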

In this specific situation, this regex is a bit better:

needle = re.compile(r'((?:months|days)\(\d+\))')

This way you only get results that contain a number; previously something like months() would also match. If you want to ignore case for options like Months or Days, also add the re.IGNORECASE flag, like this:

re.compile(r'((?:months|days)\(\d+\))', re.IGNORECASE)
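To make the effect of `\d+` and `re.IGNORECASE` concrete, here is a small sketch comparing the three variants. The sample string `text` and the names `loose`, `strict`, and `strict_ci` are made up for illustration.

```python
import re

# Hypothetical sample input covering the three interesting cases.
text = "months() Months(12) days(7)"

loose = re.compile(r'((?:months|days)\([\d]*\))')       # zero or more digits
strict = re.compile(r'((?:months|days)\(\d+\))')        # at least one digit
strict_ci = re.compile(r'((?:months|days)\(\d+\))', re.IGNORECASE)

loose_hits = loose.findall(text)
strict_hits = strict.findall(text)
strict_ci_hits = strict_ci.findall(text)

print(loose_hits)      # ['months()', 'days(7)']
print(strict_hits)     # ['days(7)']
print(strict_ci_hits)  # ['Months(12)', 'days(7)']
```

Note that `months()` is rejected once at least one digit is required, and `Months(12)` only matches when case is ignored.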

Some explanation for the OP:

A regular expression is made up of many elements, chief among them the capturing group, "()". Sometimes we want to group without capturing, so we use "(?:)". There are many other forms of groups, but these are the most common.

In this case we surround the entire regular expression in a capturing group, because you are trying to capture everything. Normally the whole match is available implicitly (it is group 0, and findall returns it when the pattern contains no capturing groups), but because you specified a capturing group explicitly, findall returned only that group's contents instead of the whole match.
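The rule behind this can be sketched by varying only the number of capturing groups. The three patterns below are simplified illustrations, not taken from the original post.

```python
import re

haystack = "aaa months(3) bbb"

# No capturing groups: findall returns the whole match.
no_group = re.findall(r'months\(\d+\)', haystack)
print(no_group)  # ['months(3)']

# Exactly one capturing group: findall returns that group's text only.
one_group = re.findall(r'(months)\(\d+\)', haystack)
print(one_group)  # ['months']

# Several capturing groups: findall returns one tuple per match.
two_groups = re.findall(r'(months)\((\d+)\)', haystack)
print(two_groups)  # [('months', '3')]
```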

Now that we have surrounded the entire regular expression with a capturing group, we turn the group we already had into a non-capturing group by adding ?: to the beginning, as shown above. We could also have left the regular expression unsurrounded and only turned the group into a non-capturing group, since findall returns the whole match when no capturing group is present. I personally prefer explicit coding.
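A sketch of that alternative follows, along with one more option that is not part of the original answer: keeping the question's capturing pattern and reading group 0 from `finditer`. The extended haystack and the names `implicit` and `original` are assumptions for illustration.

```python
import re

# Hypothetical haystack with two matches.
haystack = "aaa months(3) bbb days(10)"

# Only a non-capturing group, no outer group: findall yields whole matches.
implicit = re.compile(r'(?:months|days)\(\d+\)')
implicit_hits = implicit.findall(haystack)
print(implicit_hits)  # ['months(3)', 'days(10)']

# Keeping the capturing group works too if the whole match (group 0)
# is read from each match object instead of relying on findall.
original = re.compile(r'(months|days)\(\d+\)')
group0_hits = [m.group(0) for m in original.finditer(haystack)]
print(group0_hits)  # ['months(3)', 'days(10)']
```

The `finditer` form is handy when both the whole match and the captured unit (`m.group(1)`) are needed from the same pass.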

Further information about regular expressions can be found here: http://docs.python.org/library/re.html
