Python regular expression across multiple lines

Problem description

I'm gathering some info from some Cisco devices using Python and pexpect, and I've had a lot of success using regular expressions to extract pesky little items. I'm afraid I've hit a wall on this one. Some switches are stacked together; I have identified this in the script and use a separate routine to parse the data. If the switch is stacked, you see the following (extracted from the sho ver output):

Top Assembly Part Number        : 800-25858-06
Top Assembly Revision Number    : A0
Version ID                      : V08
CLEI Code Number                : COMDE10BRA
Hardware Board Revision Number  : 0x01


Switch   Ports  Model              SW Version              SW Image
------   -----  -----              ----------              ----------
*    1   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M  
     2   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M
     3   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M
     4   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M


Switch 02 
---------
Switch Uptime                   : 11 weeks, 2 days, 16 hours, 27 minutes
Base ethernet MAC Address       : 00:26:52:96:2A:80
Motherboard assembly number     : 73-9675-15

When I encounter this, I need to extract the switch number and model for each of the 4 rows in the table (the SW columns can be ignored, but there can be between 1 and 9 switches). It's the multiple-line aspect that has got me, as I've been OK with the rest. Any ideas, please?

OK, apologies. My regex simply started by looking at the last group of dashes, and then I couldn't work out where to go from there!
-{10]\s-{10}(.+)Switch

The model will change and the number of switches will change. I need to capture the 4 lines in this example, which are:

*    1   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M  
     2   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M
     3   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M
     4   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M

But each switch could be a different model, and there could be between 1 and 9 of them. For this example, ideally I'd like to get:

*,1,WS-C3750-48P
,2,WS-C3750-48P
,3,WS-C3750-48P
,4,WS-C3750-48P  

(The asterisk means master.)
But even just getting those lines would set me on the right track.

Solution

x="""Top Assembly Part Number        : 800-25858-06
Top Assembly Revision Number    : A0
Version ID                      : V08
CLEI Code Number                : COMDE10BRA
Hardware Board Revision Number  : 0x01


Switch   Ports  Model              SW Version              SW Image
------   -----  -----              ----------              ----------
*    1   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M
     2   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M
     3   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M
     4   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M


Switch 02
---------
Switch Uptime                   : 11 weeks, 2 days, 16 hours, 27 minutes
Base ethernet MAC Address       : 00:26:52:96:2A:80
Motherboard assembly number     : 73-9675-15"""

>>> import re
>>> re.findall(r"^\*?\s*(\d)\s*\d+\s*([A-Z\d-]+)", x, re.MULTILINE)
[('1', 'WS-C3750-48P'), ('2', 'WS-C3750-48P'), ('3', 'WS-C3750-48P'), ('4', 'WS-C3750-48P')]
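The answer does not spell out why this works across lines. With re.MULTILINE, ^ matches at the start of every line rather than only at the start of the whole string. The following is a minimal sketch of that difference, using a shortened two-row sample that is made up for illustration (it is not the full sho ver output); the names sample, first_only and every_line are hypothetical.

```python
import re

# Hypothetical two-row sample, shortened from the table above.
sample = "*    1   52     WS-C3750-48P\n     2   52     WS-C3750-48P\n"

pattern = r"^\*?\s*(\d)\s*\d+\s*([A-Z\d-]+)"

# Without re.MULTILINE, ^ only matches at the very start of the string,
# so only the first row can be found.
first_only = re.findall(pattern, sample)

# With re.MULTILINE, ^ also matches right after each newline,
# so every row is a candidate.
every_line = re.findall(pattern, sample, re.MULTILINE)

print(first_only)
print(every_line)
```

The flag only changes where ^ and $ may match; it does not change how \s behaves, which can still consume newlines.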

UPDATE: since the OP edited the question (and thanks to Tom for pointing out the use of +), this version also captures the asterisk:

>>> re.findall(r"^(\*?)\s+(\d)\s+\d+\s+([A-Z\d-]+)", x, re.MULTILINE)
[('*', '1', 'WS-C3750-48P'), ('', '2', 'WS-C3750-48P'), ('', '3', 'WS-C3750-48P'), ('', '4', 'WS-C3750-48P')]
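To get from those tuples to the comma-separated lines the question asks for, something like the sketch below might do. It is not part of the original answer. The names ROW, stack_members and sample are hypothetical, and the pattern deliberately swaps \s for [ \t] (horizontal whitespace only) so that a match cannot run over a line break; that swap is an assumption about what is wanted here, not something the answer states.

```python
import re

# Same idea as the updated pattern, but with [ \t] instead of \s so a
# match can never spill onto the next line (an assumption of this sketch).
ROW = re.compile(r"^(\*?)[ \t]+(\d)[ \t]+\d+[ \t]+([A-Z\d-]+)", re.MULTILINE)


def stack_members(show_version_output):
    """Return one 'master,number,model' string per switch row."""
    return [",".join(groups) for groups in ROW.findall(show_version_output)]


# Shortened, made-up excerpt of the table for illustration.
sample = (
    "Switch   Ports  Model              SW Version              SW Image\n"
    "------   -----  -----              ----------              ----------\n"
    "*    1   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M\n"
    "     2   52     WS-C3750-48P       12.2(35)SE5             C3750-IPBASE-M\n"
    "\n"
    "Switch 02\n"
)

for line in stack_members(sample):
    print(line)
```

Rows without the asterisk come out with an empty first field (for example ",2,WS-C3750-48P"), which matches the output format requested in the question.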
