Parsing log4j in Python

Question

I have a log file (log4j xml-ish format) that I am trying to pull info out of and use in my Python module. Could I treat this file as if it were XML? My gut is telling me no... If not, what is the best way to parse the data? Below is a section of the log file. The file does not include the standard doctype or version headers, which is why I said "xml-ish."

<log4j:event 
logger="com.hp.cp.elk.impl.subscriptions.AsyncSimpleSubscriptionManager"
timestamp="1352320517430" level="DEBUG" thread="Thread-77">
<log4j:message><![CDATA[Broadcasting signals to subscribers...]]></log4j:message>
</log4j:event>

<log4j:event logger="com.hp.cp.jdf.idp.queue.IDPJobProgressMonitor"
timestamp="1352320517430" level="DEBUG" thread="IDPJobProgressMonitorThread">
<log4j:message><![CDATA[[JDFQueueEntry[  -->  JDFAutoQueueEntry[  --> JDFElement[
--> <?xml version="1.0" encoding="UTF-8"?><QueueEntry 
xmlns="http://www.CIP4.org/JDFSchema_1_1"
DescriptiveName="H44E61-6.pdf" DeviceID="HPPRO1-SM1"
EndTime="2012-11-07T10:58:18-08:00" JobID="Default" Priority="50"
QueueEntryID="d5fbbe98a1194e0da573b51a0c8040fb" Status="Completed" 
SubmissionTime="2012-11-06T16:35:06-08:00">  <Comment AgentName="CIP4 JDF Writer 
Java" AgentVersion="1.4a BLD 63" ID="c_121106_163506894_000804" 
Name="JobSpec">WBG_4C_Flat_21up_BusCards_Duplex</Comment>
</QueueEntry>
] ] ]] queue entries changed.]]></log4j:message>
</log4j:event>

<log4j:event logger="com.hp.cp.jdf.idp.queue.IDPJobProgressMonitor" 
timestamp="1352320517430" level="DEBUG" thread="IDPJobProgressMonitorThread">
<log4j:message><![CDATA[no active queue entries changed.]]></log4j:message>
</log4j:event>

Sorry for the messy code, I just wanted to make sure you all can get an idea of the formatting. Anyway, I'm currently just trying to pull the value from QueueEntryID="d5fbbe98a1194e0da573b51a0c8040fb". Any suggestions? Thank you!

Recommended answer

I would imagine that you could use standard XML tools like DOM or SAX to parse this. Otherwise, have fun with re or htmllib.
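
As a rough illustration of the "standard XML tools" route, here is a minimal sketch using ElementTree plus re. It rests on a few assumptions that are not in the original post: the fragments are wrapped in a made-up <root> element, because the file has no single root and never declares the log4j: prefix; the namespace URI is the one log4j's XMLLayout normally uses, though any URI works as long as the wrapper and the lookups agree; and the helper names parse_events and queue_entry_ids are hypothetical. Since the QueueEntry markup sits inside a CDATA section, the parser sees it as plain text, so the ID is pulled out with a regular expression rather than by parsing it a second time.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespace URI for the "log4j:" prefix (not stated in the post).
LOG4J_NS = "http://jakarta.apache.org/log4j/"

# The QueueEntry markup lives inside CDATA, so it is matched as plain text.
QUEUE_ENTRY_ID = re.compile(r'QueueEntryID="([^"]+)"')


def parse_events(fragment_text):
    """Wrap the root-less fragments in a synthetic root and parse them."""
    wrapped = '<root xmlns:log4j="%s">%s</root>' % (LOG4J_NS, fragment_text)
    root = ET.fromstring(wrapped)
    return root.findall("{%s}event" % LOG4J_NS)


def queue_entry_ids(events):
    """Collect every QueueEntryID value found in the event messages."""
    ids = []
    for event in events:
        message = event.findtext("{%s}message" % LOG4J_NS, default="")
        ids.extend(QUEUE_ENTRY_ID.findall(message))
    return ids


# Shortened sample modeled on the log excerpt in the question.
SAMPLE = '''
<log4j:event logger="com.hp.cp.jdf.idp.queue.IDPJobProgressMonitor"
timestamp="1352320517430" level="DEBUG" thread="IDPJobProgressMonitorThread">
<log4j:message><![CDATA[[JDFQueueEntry[ --> <?xml version="1.0" encoding="UTF-8"?><QueueEntry
QueueEntryID="d5fbbe98a1194e0da573b51a0c8040fb" Status="Completed">
</QueueEntry>
] ]] queue entries changed.]]></log4j:message>
</log4j:event>
<log4j:event logger="com.hp.cp.jdf.idp.queue.IDPJobProgressMonitor"
timestamp="1352320517430" level="DEBUG" thread="IDPJobProgressMonitorThread">
<log4j:message><![CDATA[no active queue entries changed.]]></log4j:message>
</log4j:event>
'''

if __name__ == "__main__":
    events = parse_events(SAMPLE)
    for event in events:
        print(event.get("timestamp"), event.get("level"), event.get("thread"))
    print(queue_entry_ids(events))
```

For a real log file, the same functions could be fed the result of reading the file into a string. For very large logs, wrapping the whole file in memory may not be practical, and a streaming approach would be worth considering instead.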
