Multiple splits in a string using regex
Question
I have a string:
Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2
I want to split this string so that the result looks like this:
'Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8' 'StaMAC:00:9F:0B:00:38:B8' 'BSSID:00 9F' 'Radioid:2'
I tried msgRegex = re.compile('[\w\s]+:') together with the split function. How can I do this? Please help, thank you.
Recommended answer
From what I see, the problem arises when a value made of hex pairs contains whitespace (as in BSSID:00 9F).
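To make the difficulty concrete, here is a small sketch that runs the pattern from the question through split on the sample string. The variable names are only illustrative. The colons inside the MAC address are treated as delimiters as well, and the keys themselves are consumed by the split, so none of the desired key:value tokens survive.

```python
import re

s = ("Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 "
     "StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2")

# The pattern from the question: a run of word/whitespace chars followed by a colon.
msg_regex = re.compile(r"[\w\s]+:")

# split() discards everything the pattern matches, including the keys
# and every hex pair that happens to be followed by a colon.
parts = msg_regex.split(s)
print(parts)
```

Most of the resulting pieces are empty strings, which is why a matching approach is easier to get right here than a splitting one.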
Because of that, I believe you cannot use a splitting approach here. Match your tokens with a regex like
(?<!\S)\b([^:]+):((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+)\b
Python demo:
import re
rx = r"(?<!\S)\b([^:]+):((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+)\b"
ss = ["Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2",
"Station Deassoc:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.5 StaMac1:40:83:DE:34:04:75 StaMac2:40:83:DE:34:04:75 UserName:4083de340475 StaMac3:40:83:DE:34:04:75 VLANId:1 Radioid:2 SSIDName:Devices SessionDuration:12 APID:CN58G6749V AP Name:1023-noida-racking-zopnow BSSID:BC:EA:FA:DC:A6:F1"]
for s in ss:
    matches = re.findall(rx, s)
    print(matches)
Result:
[('Station Disconnect', '1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8'), ('StaMAC', '00:9F:0B:00:38:B8'), ('BSSID', '00 9F'), ('Radioid', '2')]
[('Station Deassoc', '1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.5'), ('StaMac1', '40:83:DE:34:04:75'), ('StaMac2', '40:83:DE:34:04:75'), ('UserName', '4083de340475'), ('StaMac3', '40:83:DE:34:04:75'), ('VLANId', '1'), ('Radioid', '2'), ('SSIDName', 'Devices'), ('SessionDuration', '12'), ('APID', 'CN58G6749V'), ('AP Name', '1023-noida-racking-zopnow'), ('BSSID', 'BC:EA:FA:DC:A6:F1')]
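Since each match is a (key, value) tuple, the list can be passed straight to dict() when lookups by field name are more convenient. This is a minimal sketch that reuses the pattern above; the name fields is only illustrative, and it assumes keys are unique within one message (a repeated key would keep only its last value).

```python
import re

rx = r"(?<!\S)\b([^:]+):((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+)\b"
s = ("Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 "
     "StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2")

# findall() returns a list of (key, value) tuples, which dict() accepts directly.
fields = dict(re.findall(rx, s))

print(fields["StaMAC"])  # 00:9F:0B:00:38:B8
print(fields["BSSID"])   # 00 9F
```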
NOTE: If you do not need tuples in the result, remove the capturing parentheses from the pattern.
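As a sketch of that note, the variant below drops both capturing groups (the inner groups were already non-capturing), so re.findall returns each whole match as a plain string, which is the shape asked for in the question. The names rx_plain and tokens are only illustrative.

```python
import re

# Same pattern as above, with the two capturing groups removed.
rx_plain = r"(?<!\S)\b[^:]+:(?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+\b"

s = ("Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 "
     "StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2")

# With no capturing groups, findall() returns the full text of each match.
tokens = re.findall(rx_plain, s)
print(tokens)
# ['Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8',
#  'StaMAC:00:9F:0B:00:38:B8', 'BSSID:00 9F', 'Radioid:2']
```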
Pattern details:
- (?<!\S)\b - start of string or whitespace, followed by a word boundary (the next char must be a letter, digit, or _)
- ([^:]+) - Capturing group #1: one or more chars other than :
- : - a colon
- ((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+) - Capturing group #2, matching one or more occurrences of:
  - [a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})* - 2 hex chars followed by zero or more occurrences of a space or : and 2 hex chars
  - | - or
  - \S - a non-whitespace char
- \b - a trailing word boundary
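The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace and # comments in the pattern (whitespace inside a character class such as [ :] is still significant). This is only an annotated restatement of the pattern above; the name rx_verbose is illustrative.

```python
import re

rx_verbose = re.compile(r"""
    (?<!\S)\b                      # start of string or whitespace, then a word boundary
    ([^:]+)                        # group 1 (key): one or more chars other than a colon
    :                              # the colon separating key and value
    (                              # group 2 (value): one or more of the following
      (?:
        [a-fA-F0-9]{2}             # two hex chars
        (?:[ :][a-fA-F0-9]{2})*    # then zero or more of: space or colon, two hex chars
      |
        \S                         # or any single non-whitespace char
      )+
    )
    \b                             # the value must end at a word boundary
""", re.VERBOSE)

pairs = rx_verbose.findall("BSSID:00 9F Radioid:2")
print(pairs)  # [('BSSID', '00 9F'), ('Radioid', '2')]
```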