A regex with Splunk


Question

I am having some trouble with my regex.

I have some lines like these:

SomeText#"C:\\","Shadow Copy Components:\\","E:\\",""
SomeText#"D:\\"
SomeText#"E:\\","Shadow Copy Components:\\"
SomeText#"SET SNAP_ID=serv.a.x.com_1380312019","BACKUP H:\\ USING \\\\?\\GLOBALROOT\\Device\\HarddiskVolumeShadowCopy47\\ OPTIONS:ALT_PATH_PREFIX=c:\\VERITAS\\NetBackup\\temp\\_vrts_frzn_img_3200\"
SomeText#"SET SNAP_ID=serv.a.x.com_1380312019","BACKUP Y:\\Libs USING \\\\?\\GLOBALROOT\\Device\\HarddiskVolumeShadowCopy47\\ OPTIONS:ALT_PATH_PREFIX=c:\\VERITAS\\NetBackup\\temp\\_vrts_frzn_img_3200\"

What I would like is a group named jobFileList that contains, for each line:

"C:\\","Shadow Copy Components:\\","E:\\",""
"D:\\"
"E:\\","Shadow Copy Components:\\"
H:\\
Y:\\Libs

As you can see, I only want the file list. Sometimes that is the full text after the # mark, and sometimes there is a lot of extra text around it that I need to remove. I cannot use a script in this case, so I need to do this with a single regex; I can't run a string replace or any other processing after the regex.

What I tried is:

SomeText(#.*BACKUP (?P<jobFileList>.*?) .*)?(#(?P<jobFileList>.*))?

But it seems I can't use the same group name twice. If I replace the second jobFileList with another name it works perfectly, but that is not what I need.
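The duplicate-name problem can be reproduced outside Splunk. The sketch below uses Python's `re` module, which is an assumption made only for illustration (Splunk's own regex engine may report the problem differently); `compile_error` is a hypothetical helper name.

```python
import re

# The attempt from the question: two groups share the name jobFileList.
ATTEMPT = r'SomeText(#.*BACKUP (?P<jobFileList>.*?) .*)?(#(?P<jobFileList>.*))?'

# Same attempt with the second group renamed (hypothetical name jobFileList2).
RENAMED = r'SomeText(#.*BACKUP (?P<jobFileList>.*?) .*)?(#(?P<jobFileList2>.*))?'

def compile_error(pattern):
    """Return the compile error message, or None if the pattern compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None

print(compile_error(ATTEMPT))   # an error message about the repeated group name
print(compile_error(RENAMED))   # None: unique names compile fine
```

In Python, at least, a group name may be defined only once per pattern, which is why a single-group approach is needed here.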

Thanks for your help.

I can also have some lines like:

SomeText#/ahol5d72_1_2
SomeText#/p7ol4a1p_1_2
SomeText#Gvadag04SANDsk_Daily
SomeText#/bck_reco_a9ol5765_1_2_827497669

In all these cases I need all the text after the # mark.

Recommended answer

A version that doesn't rely on the double quotes after the double backslash:

SomeText#(?:(.*?BACKUP) )?(?P<jobFileList>(?(1)[^ ]*|.*$))

This part, (?(1)[^ ]*|.*$), is a conditional group, which is supported in Python 2.7.5 (it probably works in later versions, but I don't know about earlier ones). If BACKUP is present, it captures the run of non-space characters that follows; if BACKUP is absent, it captures everything up to the end of the string.
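As a rough check, the pattern can be run against some of the lines from the question with Python's `re` module. This is only a sketch: the helper name `extract_job_file_list` is made up for illustration, and Splunk itself is not involved.

```python
import re

# Pattern from the answer, unchanged. Group 1 is set only when "BACKUP " occurs.
PATTERN = re.compile(
    r'SomeText#(?:(.*?BACKUP) )?(?P<jobFileList>(?(1)[^ ]*|.*$))'
)

def extract_job_file_list(line):
    """Return the jobFileList capture, or None when the line does not match."""
    match = PATTERN.search(line)
    return match.group('jobFileList') if match else None

# Sample lines copied from the question (raw strings keep the backslashes as-is).
SAMPLES = [
    r'SomeText#"D:\\"',
    r'SomeText#"E:\\","Shadow Copy Components:\\"',
    r'SomeText#"SET SNAP_ID=serv.a.x.com_1380312019","BACKUP Y:\\Libs USING \\\\?\\GLOBALROOT\\Device\\HarddiskVolumeShadowCopy47\\ OPTIONS:ALT_PATH_PREFIX=c:\\VERITAS\\NetBackup\\temp\\_vrts_frzn_img_3200\"',
    r'SomeText#/ahol5d72_1_2',
]

for sample in SAMPLES:
    print(extract_job_file_list(sample))
```

For lines without BACKUP the whole remainder after # is captured; for the BACKUP line only the token after "BACKUP " (here Y:\\Libs) is kept, because the conditional switches to the `[^ ]*` branch once group 1 has matched.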

regex101 demo

As per the comments, this is the regex that worked after @timmalos' modifications:

\#(?P<G>.*?[^E]BACKUP\s)?(?P<G2>f:\\\\Mailbox\\\)?(?P<jobFileList>(?(G)(?(G2)[^\]|\S)*|.*))
