Parse Output for Python
Question
My software produces these two kinds of output lines:
-rwx------ Administrators/Domain Users 456220672 0% 2018-04-16 16:04:40 E:\\_WiE10-18.0.100-77.iso
-rwxrwx--- Administrators/unknown 6677 0% 2018-04-17 01:33:23 E:\\program files\\cluster groups\\sql server (mssqlserver)\\logs\\progress-MOD-1523883344023-3001-Windows.log
I would like to get the file names from both outputs:
E:\\_WiE10-18.0.100-77.iso
for the first one, and
E:\\program files\\cluster groups\\sql server (mssqlserver)\\logs\\progress-MOD-1523883344023-3001-Windows.log
for the second one.
If I use something like the code below, it fails when the second field contains spaces. It works only when the domain username has no spaces in it.
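new_list = []
for item in outputs:
    # Split each line on whitespace into its own list of fields
    fields = item.split()
    # Assumes the file name starts at field index 6; this is wrong
    # whenever the owner/group field itself contains a space
    new_list.append(' '.join(fields[6:]))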
How can I get all the fields individually, including the file names?
Recommended answer
If regex is an option:
text = """-rwx------ Administrators/Domain Users 456220672 0% 2018-04-16 16:04:40 E:\\_WiE10-18.0.100-77.iso
-rwxrwx--- Administrators/unknown 6677 0% 2018-04-17 01:33:23 E:\\program files\\cluster groups\\sql server (mssqlserver)\\logs\\progress-MOD-1523883344023-3001-Windows.log"""
import re
for h in re.findall(r"^.*?\d\d:\d\d:\d\d (.*)", text, flags=re.MULTILINE):
    print(h)
Output:
E:\_WiE10-18.0.100-77.iso
E:\program files\cluster groups\sql server (mssqlserver)\logs\progress-MOD-1523883344023-3001-Windows.log
Pattern explanation:
The pattern r"^.*?\d\d:\d\d:\d\d (.*)"
looks for the start of a line '^'
+ as few characters as possible '.*?'
+ the timestamp '\d\d:\d\d:\d\d'
followed by a space, and captures everything after it up to the end of the line into a group.
It uses the re.MULTILINE flag so that '^' matches at the start of every line, not only at the start of the string.
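The same idea can also be applied one line at a time, which makes it easy to skip lines that contain no timestamp. This is only a minimal sketch, assuming every listing line has the layout shown in the question; the helper name extract_filename and the third test line are made up for illustration:

```python
import re

# Same pattern as above, compiled once. MULTILINE is not needed
# because each call sees a single line.
TIMESTAMP_THEN_NAME = re.compile(r"^.*?\d\d:\d\d:\d\d (.*)")


def extract_filename(line):
    """Return the text after the first HH:MM:SS timestamp, or None (hypothetical helper)."""
    match = TIMESTAMP_THEN_NAME.match(line)
    return match.group(1) if match else None


lines = [
    r"-rwx------ Administrators/Domain Users 456220672 0% 2018-04-16 16:04:40 E:\_WiE10-18.0.100-77.iso",
    r"-rwxrwx--- Administrators/unknown 6677 0% 2018-04-17 01:33:23 E:\program files\cluster groups\sql server (mssqlserver)\logs\progress-MOD-1523883344023-3001-Windows.log",
    "not a listing line",  # invented line to show the no-match case
]

filenames = [extract_filename(line) for line in lines]
for name in filenames:
    print(name)
```

Lines without a timestamp yield None instead of being silently dropped, so the caller can decide how to handle them.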
Edit:
Capturing the individual fields needs a few more capturing groups:
import re
# The four capture groups, in order: flags, grpName, datetime, filename.
# The size and percent fields are matched by '\d+.+?' but not captured.
for h in re.findall(r"^([rwexXst-]+) ([^0-9]+) +\d+.+? +(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (.*)", text, flags=re.MULTILINE):
    for k in h:
        print(k)
    print("")
Output:
-rwx------
Administrators/Domain Users
2018-04-16 16:04:40
E:\_WiE10-18.0.100-77.iso
-rwxrwx---
Administrators/unknown
2018-04-17 01:33:23
E:\program files\cluster groups\sql server (mssqlserver)\logs\progress-MOD-1523883344023-3001-Windows.log
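Since the question asks for all fields individually, a variant with named groups that also captures the size and percent columns may be easier to work with. This is a sketch under assumptions, not part of the original answer: the group names (grp_name, size, percent and so on) and the helper parse_line are invented here, the meaning of the "0%" column is not stated in the question, and, like the pattern above, it assumes the owner/group field contains no digits.

```python
import re

# Each named group corresponds to one column of the listing.
LINE_RE = re.compile(
    r"^(?P<flags>[rwexXst-]+) "
    r"(?P<grp_name>\D+?) +"      # assumption: no digits in the owner/group field
    r"(?P<size>\d+) +"
    r"(?P<percent>\d+%) +"
    r"(?P<datetime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<filename>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one listing line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


text = (
    r"-rwx------ Administrators/Domain Users 456220672 0% 2018-04-16 16:04:40 E:\_WiE10-18.0.100-77.iso"
    "\n"
    r"-rwxrwx--- Administrators/unknown 6677 0% 2018-04-17 01:33:23 E:\program files\cluster groups\sql server (mssqlserver)\logs\progress-MOD-1523883344023-3001-Windows.log"
)

records = [parse_line(line) for line in text.splitlines()]
for record in records:
    print(record)
```

Each record is a plain dict, so a single field can be read as records[0]["filename"] or records[0]["size"]; all values are strings and would need converting (for example with int) before doing arithmetic on them.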