Multiple splits in a string using regex


Problem description

I have a string:

 Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2

I want to split this string so that the result looks like this:

'Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8' 'StaMAC:00:9F:0B:00:38:B8' 'BSSID:00 9F' 'Radioid:2'

I tried this logic: msgRegex = re.compile('[\w\s]+:') together with the split function, but it did not give me that result. How can I do this? Please help. Thank you.

Recommended answer

From what I see, the problem is that some values contain whitespace: the hex values can be separated by spaces (as in BSSID:00 9F), so whitespace alone cannot serve as the delimiter.

Because of that, I believe a splitting approach will not work here. Instead, match your tokens with a regex like

(?<!\S)\b([^:]+):((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+)\b

Python code:

import re

rx = r"(?<!\S)\b([^:]+):((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+)\b"
ss = ["Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2",
    "Station Deassoc:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.5 StaMac1:40:83:DE:34:04:75 StaMac2:40:83:DE:34:04:75 UserName:4083de340475 StaMac3:40:83:DE:34:04:75 VLANId:1 Radioid:2 SSIDName:Devices SessionDuration:12 APID:CN58G6749V AP Name:1023-noida-racking-zopnow BSSID:BC:EA:FA:DC:A6:F1"]
for s in ss:
    matches = re.findall(rx, s)
    print(matches)

Result:

[('Station Disconnect', '1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8'), ('StaMAC', '00:9F:0B:00:38:B8'), ('BSSID', '00 9F'), ('Radioid', '2')]
[('Station Deassoc', '1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.5'), ('StaMac1', '40:83:DE:34:04:75'), ('StaMac2', '40:83:DE:34:04:75'), ('UserName', '4083de340475'), ('StaMac3', '40:83:DE:34:04:75'), ('VLANId', '1'), ('Radioid', '2'), ('SSIDName', 'Devices'), ('SessionDuration', '12'), ('APID', 'CN58G6749V'), ('AP Name', '1023-noida-racking-zopnow'), ('BSSID', 'BC:EA:FA:DC:A6:F1')] 

NOTE: If you do not need tuples in the result, remove the capturing parentheses from the pattern. With no capturing groups, re.findall returns each whole match as a single string.
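As a minimal sketch of the note above, the snippet below uses the same pattern with both capturing groups removed, which yields the list of strings asked for in the question. The variable name parts is only illustrative.

```python
import re

# Same pattern as above with the two capturing groups removed
# (the inner groups were already non-capturing).
rx = r"(?<!\S)\b[^:]+:(?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+\b"

s = "Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2"

# With no capturing groups, re.findall returns each whole match as a string.
parts = re.findall(rx, s)
print(parts)
# ['Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8', 'StaMAC:00:9F:0B:00:38:B8', 'BSSID:00 9F', 'Radioid:2']
```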

Pattern details:

  • (?<!\S)\b - a position at the start of the string or right after whitespace, followed by a word boundary (the next char must be a letter/digit or _)
  • ([^:]+) - Capturing group #1: 1+ chars other than :
  • : - a colon
  • ((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+) - Capturing group #2, matching one or more occurrences of:
    • [a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})* - 2 hex chars followed by zero or more occurrences of a space or : and 2 hex chars
    • | - or
    • \S - a non-whitespace char
  • \b - a trailing word boundary, so the value ends on a letter/digit or _
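The breakdown above can also be written directly into the pattern with re.VERBOSE, which lets each part carry its own comment. This is only a sketch: the names TOKEN_RX and fields are illustrative, and converting the matches to a dict assumes that every key occurs only once in the message.

```python
import re

# The answer's pattern, spread out with re.VERBOSE so each part is commented.
# Whitespace inside a character class such as [ :] is still matched literally.
TOKEN_RX = re.compile(r"""
    (?<!\S)\b                      # start of string or after whitespace, then a word boundary
    ([^:]+)                        # group 1: the key, 1+ chars other than a colon
    :                              # the separating colon
    (                              # group 2: the value, one or more of:
      (?:
        [a-fA-F0-9]{2}             #   two hex chars
        (?:[ :][a-fA-F0-9]{2})*    #   then zero or more (space or colon + two hex chars)
        |
        \S                         #   or any single non-whitespace char
      )+
    )
    \b                             # the value must end at a word boundary
""", re.VERBOSE)

s = "Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2"

# findall yields (key, value) tuples, which dict() accepts directly.
# Assumption: keys are unique; a repeated key would keep only its last value.
fields = dict(TOKEN_RX.findall(s))
print(fields["BSSID"])   # 00 9F
```

A dict makes individual fields easy to look up by name; if keys can repeat, keeping the list of tuples is the safer choice.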
