Parse Output for Python

Views: 105 | Published: 2018/11/15 15:37:35 | Tags: python regex iteration

Question

My software outputs these two types of output:

-rwx------ Administrators/Domain Users  456220672   0% 2018-04-16 16:04:40 E:\\_WiE10-18.0.100-77.iso

-rwxrwx--- Administrators/unknown        6677   0% 2018-04-17 01:33:23 E:\\program files\\cluster groups\\sql server (mssqlserver)\\logs\\progress-MOD-1523883344023-3001-Windows.log

I would like to get the file names from both outputs:

E:\\_WiE10-18.0.100-77.iso, for the first one
E:\\program files\\cluster groups\\sql server (mssqlserver)\\logs\\progress-MOD-1523883344023-3001-Windows.log, for the second one

If I use something like the code below, it doesn't work when the second field (the owner/group name) contains spaces. It only works if there are no spaces in the domain user name.

for item in outputs:
    outputs.extend(item.split())
for item2 in [' '.join(outputs[6:])]:
    new_list.append(item2)

How can I get all the parameters individually, including the filenames?

Recommended answer

If regex is an option:

text = """-rwx------ Administrators/Domain Users  456220672   0% 2018-04-16 16:04:40 E:\\_WiE10-18.0.100-77.iso

-rwxrwx--- Administrators/unknown        6677   0% 2018-04-17 01:33:23 E:\\program files\\cluster groups\\sql server (mssqlserver)\\logs\\progress-MOD-1523883344023-3001-Windows.log"""

import re

for h in re.findall(r"^.*?\d\d:\d\d:\d\d (.*)",text,flags=re.MULTILINE):
    print(h)

Output:

E:\_WiE10-18.0.100-77.iso
E:\program files\cluster groups\sql server (mssqlserver)\logs\progress-MOD-1523883344023-3001-Windows.log

Pattern explanation:

The pattern r"^.*?\d\d:\d\d:\d\d (.*)" matches the start of a line ('^'), then as few characters as possible ('.*?'), then the time stamp followed by a space ('\d\d:\d\d:\d\d '), and captures everything after that up to the end of the line in a group.

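It uses the re.MULTILINE flag, so '^' matches at the start of every line rather than only at the start of the whole string.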
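To see why the flag matters, here is a small sketch that runs the same pattern with and without re.MULTILINE. The two sample lines use shortened, made-up paths (E:\a.iso and E:\program files\b.log) purely for illustration; the format otherwise follows the question's output.

```python
import re

# Two sample lines in the question's format; the paths are shortened
# placeholders, not the real file names.
text = (
    "-rwx------ Administrators/Domain Users  456220672   0% 2018-04-16 16:04:40 E:\\a.iso\n"
    "-rwxrwx--- Administrators/unknown        6677   0% 2018-04-17 01:33:23 E:\\program files\\b.log"
)

pattern = r"^.*?\d\d:\d\d:\d\d (.*)"

# Without re.MULTILINE, '^' matches only at the very start of the string,
# so only the first line produces a match.
without_flag = re.findall(pattern, text)

# With re.MULTILINE, '^' also matches right after each newline,
# so every line is examined.
with_flag = re.findall(pattern, text, flags=re.MULTILINE)

print(without_flag)
print(with_flag)
```

Without the flag only the first file name is found; with it, both are.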

Edit:

Capturing the individual fields needs a few more capturing groups:

import re

for h in re.findall(r"^([rwexXst-]+) ([^0-9]+) +\d+.+? +(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (.*)",text,flags=re.MULTILINE):
#                       ^^^^^^^^^^^^ ^^^^^^^^           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^
#                          flags     grpName                               datetime          filename
    for k in h:
        print(k)
    print("")

Output:

-rwx------
Administrators/Domain Users 
2018-04-16 16:04:40
E:\_WiE10-18.0.100-77.iso

-rwxrwx---
Administrators/unknown       
2018-04-17 01:33:23
E:\program files\cluster groups\sql server (mssqlserver)\logs\progress-MOD-1523883344023-3001-Windows.log
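Note that the owner/group value above keeps its trailing spaces, and the two numeric columns are skipped. If those are wanted too, one possible variation uses named groups and a lazy match for the owner field. This is only a sketch: the group names (mode, owner, size, percent, timestamp, filename) are hypothetical labels, and reading the third and fourth columns as a size and a percentage is an assumption based on the sample lines.

```python
import re

text = """-rwx------ Administrators/Domain Users  456220672   0% 2018-04-16 16:04:40 E:\\_WiE10-18.0.100-77.iso
-rwxrwx--- Administrators/unknown        6677   0% 2018-04-17 01:33:23 E:\\program files\\cluster groups\\sql server (mssqlserver)\\logs\\progress-MOD-1523883344023-3001-Windows.log"""

# Group names are illustrative; the column meanings are assumed.
LINE_RE = re.compile(
    r"^(?P<mode>\S+) +"           # permission string, e.g. -rwx------
    r"(?P<owner>.+?) +"           # owner/group, may contain spaces (lazy match)
    r"(?P<size>\d+) +"            # first numeric column (assumed to be a size)
    r"(?P<percent>\d+%) +"        # percentage column
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<filename>.*)$",         # everything after the time stamp
    re.MULTILINE,
)

# One dict per matching line, keyed by group name.
records = [m.groupdict() for m in LINE_RE.finditer(text)]

for rec in records:
    print(rec)
```

Because the owner group is lazy and must be followed by spaces and a run of digits, it stops just before the padding, so no strip() call is needed. Lines that do not fit the layout are simply skipped by finditer.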
