Python Repeated Capture Groups
Question
I'm trying to parse a series of SHOW CDP NEIGHBORS DETAIL outputs so that I can capture each device and its IP addresses.
The issue I'm running into is that some devices may have multiple IP addresses assigned to them. Here is an example output:
Device ID: RTPER1.MFN21Mb.domain.local
Entry address(es):
IP address: 200.152.51.3
IP address: 82.159.177.233
IP address: 201.152.51.140
IP address: 84.252.100.3
Platform: Cisco 2821, Capabilities: Router Switch IGMP
I wrote this regex to capture the input. According to gskinner it matches all 4 IP addresses, but the capture group holds only the last one (as expected from a regex):
Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+(?:IP address: ([0-9.]+)\s+)+
So I went online to figure out how to do this. I tried the regex suggested in "Capturing repeating subpatterns in Python regex", but using the regex module did not change the output. I still get only the last IP address in the list and none of the others.
Following the example I found:
temp = regex.match(r'Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+(?:IP address: ([0-9.]+)\s+)+', file)
print temp
temp returns None.
If I use findall instead, I get back just the last IP address, 84.252.100.3.
If I add multiple capture groups, such as
temp = regex.findall(r'Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+(?:IP address: ([0-9.]+)\s+)?\s+(?:IP address: ([0-9.]+)\s+)?\s+(?:IP address: ([0-9.]+)\s+)?\s+(?:IP address: ([0-9.]+)\s+)?\s+(?:IP address: ([0-9.]+)\s+)?', file)
print temp
it only matches the devices that have multiple IP addresses, and not the others.
Hopefully someone can help.
Recommended answer
As far as I'm aware, only .NET allows you to iterate through quantified (repeated) capturing groups. Consider this (finite) alternative:
Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+(?:IP address: ([0-9.]+)\s+)(?:IP address: ([0-9.]+)\s+)?(?:IP address: ([0-9.]+)\s+)?(?:IP address: ([0-9.]+)\s+)?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This will capture one IP address in $2 and up to three more in $3, $4, and $5. (I'm using the $ notation idiomatically, of course.) You can add as many optional groups as you want. If you need all of the IP addresses to be present in a single group, i.e. $2, then your only choice is to include the surrounding text with them:
Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+((?:IP address: (?:[0-9.]+)\s+)+)
^ ^^ ^
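In Python, the single-group pattern above can be paired with a second pass that pulls the individual addresses out of the captured block. The following is only a sketch using the standard re module: the sample text, the second device entry, and the names DEVICE_RE, IP_RE, and parse_neighbors are made up for illustration, and the hyphen in the character class has been moved to the end so it is unambiguously a literal. It uses finditer rather than match, since match only succeeds at the very start of the string, which is one likely reason temp came back as None.

```python
import re

# Sample text modelled on the question's output. The second device is a
# made-up entry, added to show a neighbor with a single address.
cdp_output = """\
Device ID: RTPER1.MFN21Mb.domain.local
Entry address(es):
  IP address: 200.152.51.3
  IP address: 82.159.177.233
  IP address: 201.152.51.140
  IP address: 84.252.100.3
Platform: Cisco 2821, Capabilities: Router Switch IGMP

Device ID: SW-EXAMPLE.domain.local
Entry address(es):
  IP address: 10.0.0.1
Platform: cisco WS-C2960, Capabilities: Switch IGMP
"""

# Group 1: the device name.
# Group 2: the whole run of "IP address: ..." lines, text included.
DEVICE_RE = re.compile(
    r'Device ID: ([0-9A-Za-z.&-]+)\s+'
    r'Entry address\(es\):\s+'
    r'((?:IP address: [0-9.]+\s+)+)'
)

# Second pass: extract each address from the captured run.
IP_RE = re.compile(r'IP address: ([0-9.]+)')


def parse_neighbors(text):
    """Return {device_id: [ip, ...]} for every entry found in text."""
    neighbors = {}
    for m in DEVICE_RE.finditer(text):
        neighbors[m.group(1)] = IP_RE.findall(m.group(2))
    return neighbors


neighbors = parse_neighbors(cdp_output)
for device, ips in neighbors.items():
    print(device, ips)
```

Because the second pass runs only on the text captured for one device, the number of addresses per device no longer has to be fixed in the pattern, and devices with a single address are matched the same way as those with several.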