Capturing named groups in regex with re.findall


Question

While trying to answer the question "regex to split %ages and values in python", I noticed that I had to re-order the groups in the result of findall. For example:

import re

data = """34% passed 23% failed 46% deferred"""
result = {key: value for value, key in re.findall(r'(\w+)%\s(\w+)', data)}
print(result)
# prints {'passed': '34', 'failed': '23', 'deferred': '46'} on Python 3.7+

Here, the result of findall is:

>>> re.findall(r'(\w+)%\s(\w+)', data)
[('34', 'passed'), ('23', 'failed'), ('46', 'deferred')]

Is there a way to change or specify the order of the groups so that re.findall returns:

[('passed', '34'), ('failed', '23'), ('deferred', '46')]

Just to clarify, the question is:

Is it possible to specify the order of, or re-order, the groups in the value returned by re.findall?

I used the example above, which builds a dictionary, to give a use case for changing the order (swapping key and value).

Further clarification:

To handle groups in larger, more complicated regexes, you can name the groups, but those names are only accessible when you use re.search or re.match. From what I have read, findall uses fixed indices for the groups returned in each tuple. The question is whether anyone knows how those indices could be modified. This would make handling groups easier and more intuitive.
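
For illustration, here is a minimal sketch of the limitation described above: a match object from re.search exposes groups by name, while re.findall returns plain tuples in pattern order. The group names percentage and word are chosen only for this example.

```python
import re

data = "34% passed 23% failed 46% deferred"
pattern = r'(?P<percentage>\w+)%\s(?P<word>\w+)'

# re.search returns a match object, so groups can be read by name.
first = re.search(pattern, data)
first_word = first.group('word')
first_percentage = first.group('percentage')

# re.findall returns plain tuples; the names are not available and only
# the left-to-right group order of the pattern remains.
tuples = re.findall(pattern, data)

print(first_word, first_percentage)
print(tuples)
```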

Solution

Take 3, based on a further clarification of the OP's intent in this comment.

Ashwin is correct that findall does not preserve named capture groups (e.g. (?P<name>regex)). finditer to the rescue! It returns an iterator that yields the individual match objects one by one. Simple example:

import re

data = """34% passed 23% failed 46% deferred"""
for m in re.finditer(r'(?P<percentage>\w+)%\s(?P<word>\w+)', data):
    print(m.group('percentage'), m.group('word'))
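
Building on that loop, here is a sketch of how the match objects could produce the re-ordered list and the dictionary the question asks for. Passing several names to m.group returns a tuple in the order requested. The variable names pairs and result are illustrative, not from the original answer.

```python
import re

data = "34% passed 23% failed 46% deferred"
pattern = re.compile(r'(?P<percentage>\w+)%\s(?P<word>\w+)')

# group() with several names returns a tuple in the requested order.
pairs = [m.group('word', 'percentage') for m in pattern.finditer(data)]

# Same idea for the dictionary from the question: word -> percentage.
result = {m.group('word'): m.group('percentage')
          for m in pattern.finditer(data)}

print(pairs)
print(result)
```

Because the groups are addressed by name, the order in which they appear in the pattern no longer matters to the calling code.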
