Confusion about the group family of MatchObject methods in Python 2's re module

Problem description

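I am not sure whether my understanding is wrong or my requirement itself is wrong.

My question is spelled out in the comments of the related code (below). The relevant documentation is quoted first.

  • RegexObject.search (not re.search)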

search(string[, pos[, endpos]])
Scan through string looking for a location where this regular expression produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

The optional second parameter pos gives an index in the string where the search is to start; it defaults to 0. This is not completely equivalent to slicing the string; the '^' pattern character matches at the real beginning of the string and at positions just after a newline, but not necessarily at the index where the search is to start.

The optional parameter endpos limits how far the string will be searched; it will be as if the string is endpos characters long, so only the characters from pos to endpos - 1 will be searched for a match. If endpos is less than pos, no match will be found, otherwise, if rx is a compiled regular expression object, rx.search(string, 0, 50) is equivalent to rx.search(string[:50], 0).
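The pos and endpos behaviour described above can be checked with a small sketch. The patterns and input strings here are made up for illustration and are not part of the original question.

```python
import re

# Hypothetical pattern and input, chosen only to illustrate pos/endpos.
anchored = re.compile(r'^\d+')
text = "ab123"

# '^' matches at the real beginning of the string, so starting the scan
# at pos=2 does not make index 2 count as a "start of string".
no_match = anchored.search(text, 2)     # None
sliced = anchored.search(text[2:])      # matches '123' in the sliced copy

digits = re.compile(r'\d+')
long_text = "abc12345"

# endpos=5 makes the search behave as if the string were "abc12".
limited = digits.search(long_text, 0, 5)
same = digits.search(long_text[:5], 0)

# endpos smaller than pos never matches.
empty_window = digits.search(long_text, 5, 3)

print(no_match, sliced.group(), limited.group(), same.group(), empty_window)
```

So pos is not the same as slicing when the pattern contains '^', while endpos does behave like truncating the string.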

  • MatchObject: the group family of methods

group([group1, ...])
Returns one or more subgroups of the match. If there is a single argument, the result is a single string; if there are multiple arguments, the result is a tuple with one item per argument. Without arguments, group1 defaults to zero (the whole match is returned). If a groupN argument is zero, the corresponding return value is the entire matching string; if it is in the inclusive range [1..99], it is the string matching the corresponding parenthesized group. If a group number is negative or larger than the number of groups defined in the pattern, an IndexError exception is raised. If a group is contained in a part of the pattern that did not match, the corresponding result is None. If a group is contained in a part of the pattern that matched multiple times, the last match is returned.

If the regular expression uses the (?P<name>...) syntax, the groupN arguments may also be strings identifying groups by their group name. If a string argument is not used as a group name in the pattern, an IndexError exception is raised.
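A short sketch of the group() rules quoted above, reusing the pattern from the question on a single made-up input. The extra patterns for the repeated and non-participating groups are assumptions added only for illustration.

```python
import re

m = re.search(r'(?P<first>\d{3})(?P<second>\d{3})', '111999')

whole = m.group()                     # no argument: same as group(0), '111999'
first = m.group(1)                    # single argument: one string, '111'
pair = m.group(1, 2)                  # several arguments: a tuple, ('111', '999')
named = m.group('first', 'second')    # names work like numbers here

# A group number larger than the number of groups raises IndexError.
try:
    m.group(3)
    out_of_range = False
except IndexError:
    out_of_range = True

# A group that matched several times reports only its last match.
repeated = re.search(r'(\d)+', '123').group(1)        # '3'

# A group in a branch that did not match yields None.
skipped = re.search(r'(a)|(b)', 'b').group(1)         # None

print(whole, first, pair, named, out_of_range, repeated, skipped)
```

Every call here describes the subgroups of one match; none of them iterates over further matches in the string.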

groups([default])
Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern. The default argument is used for groups that did not participate in the match; it defaults to None. (Incompatibility note: in the original Python 1.5 release, if the tuple was one element long, a string would be returned instead. In later versions (from 1.5.1 on), a singleton tuple is returned in such cases.)
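As a minimal sketch of groups() and its default argument, the following uses an illustrative pattern with an optional second group. The pattern and input are assumptions, not taken from the question.

```python
import re

# Optional fractional part: the second group may not participate.
m = re.search(r'(\d+)(?:\.(\d+))?', '24')

plain = m.groups()           # ('24', None): missing group reported as None
filled = m.groups('0')       # ('24', '0'): missing group replaced by the default

# With a single group, a one-element tuple is returned, not a bare string.
single = re.search(r'(\d+)', '24').groups()   # ('24',)

print(plain, filled, single)
```

"All the subgroups" means every parenthesized group of the pattern within this one match, not every match in the string.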

groupdict([default])
Return a dictionary containing all the named subgroups of the match, keyed by the subgroup name. The default argument is used for groups that did not participate in the match; it defaults to None. For example:
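The example that followed in the quoted documentation is missing here. As a stand-in, this sketch uses the question's group names on a made-up input, with the second group made optional so the default argument has something to do.

```python
import re

# 'second' is optional in this illustrative variant of the question's pattern.
m = re.search(r'(?P<first>\d{3})(?P<second>\d{3})?', '111')

by_name = m.groupdict()          # 'second' did not participate, so it maps to None
with_default = m.groupdict('')   # the default replaces None for such groups

print(by_name == {'first': '111', 'second': None})
print(with_default == {'first': '111', 'second': ''})
```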

Related code

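import re

s = """111999
222888
333777
444666"""

# Note: re.MULTILINE has no effect here, because the pattern uses neither '^' nor '$'.
regex = re.compile(r'(?P<first>\d{3})(?P<second>\d{3})', re.MULTILINE)

m = regex.search(s)

print(regex.findall(s))
print(m.groups())        # Shouldn't this be all of them?
print(m.group(0))        # Shouldn't this be all of them? Why only part?
print(m.group('first'))  # Shouldn't this be all of them?
print(m.groupdict())     # Shouldn't this be all of them?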

    
Output
[('111', '999'), ('222', '888'), ('333', '777'), ('444', '666')]
('111', '999')
111999
111
{'second': '999', 'first': '111'}

It looks as if the group family of methods only covers the first match?

Steps to reproduce

  1. Copy the code and run it

  2. Check the output (and compare it with the corresponding documentation)

Solution
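The output is consistent with the quoted documentation. search() returns a MatchObject for the first location where the pattern matches, and a MatchObject describes exactly one match. In groups() and groupdict(), "all" means all subgroups of the pattern within that one match, not all matches in the string. If the goal is to visit every match and still use the group methods, one option is RegexObject.finditer, which yields one MatchObject per match. The following is a sketch under that assumption, not necessarily the only way to do it.

```python
import re

s = """111999
222888
333777
444666"""

regex = re.compile(r'(?P<first>\d{3})(?P<second>\d{3})')

# search() stops at the first match, so m only describes '111999'.
m = regex.search(s)

# finditer() yields one match object per non-overlapping match.
matches = list(regex.finditer(s))

all_groups = [mo.groups() for mo in matches]        # same tuples findall() returns
all_whole = [mo.group(0) for mo in matches]         # whole text of each match
all_first = [mo.group('first') for mo in matches]   # named group of each match
all_dicts = [mo.groupdict() for mo in matches]      # one dict per match

print(all_groups)
print(all_whole)
print(all_first)
```

findall() remains the shortest route when only the group tuples are needed; finditer() is useful when names, positions (start(), end()), or groupdict() are needed for each match.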
