在Python中解析log4j [英] Parsing log4j in Python
问题描述
我有一个日志文件(log4j xml-ish格式),我试图将信息提取出来并在我的Python模块中使用.我可以将此文件视为XML吗?我的直觉告诉我不...如果没有,解析数据的最佳方法是什么?以下是日志文件的一部分.该文件不包含您的标准doctype或版本标头,这就是为什么我说"xml-ish".
I have a log file (log4j xml-ish format) that I am trying to pull info out of and use in my Python module. Could I treat this file as if it were XML? My gut is telling me no... If not, what is the best way to parse the data? Below is a section of the log file. The file does not include your standard doctype or version headers which is why I said "xml-ish."
<log4j:event
logger="com.hp.cp.elk.impl.subscriptions.AsyncSimpleSubscriptionManager"
timestamp="1352320517430" level="DEBUG" thread="Thread-77">
<log4j:message><![CDATA[Broadcasting signals to subscribers...]]></log4j:message>
</log4j:event>
<log4j:event logger="com.hp.cp.jdf.idp.queue.IDPJobProgressMonitor"
timestamp="1352320517430" level="DEBUG" thread="IDPJobProgressMonitorThread">
<log4j:message><![CDATA[[JDFQueueEntry[ --> JDFAutoQueueEntry[ --> JDFElement[
--> <?xml version="1.0" encoding="UTF-8"?><QueueEntry
xmlns="http://www.CIP4.org/JDFSchema_1_1"
DescriptiveName="H44E61-6.pdf" DeviceID="HPPRO1-SM1"
EndTime="2012-11-07T10:58:18-08:00" JobID="Default" Priority="50"
QueueEntryID="d5fbbe98a1194e0da573b51a0c8040fb" Status="Completed"
SubmissionTime="2012-11-06T16:35:06-08:00"> <Comment AgentName="CIP4 JDF Writer
Java" AgentVersion="1.4a BLD 63" ID="c_121106_163506894_000804"
Name="JobSpec">WBG_4C_Flat_21up_BusCards_Duplex</Comment>
</QueueEntry>
] ] ]] queue entries changed.]]></log4j:message>
</log4j:event>
<log4j:event logger="com.hp.cp.jdf.idp.queue.IDPJobProgressMonitor"
timestamp="1352320517430" level="DEBUG" thread="IDPJobProgressMonitorThread">
<log4j:message><![CDATA[no active queue entries changed.]]></log4j:message>
</log4j:event>
抱歉,代码混乱,我只是想让大家都对格式有所了解.无论如何,我目前正试图从QueueEntryID="d5fbbe98a1194e0da573b51a0c8040fb"
中提取值,有什么建议吗?谢谢!
Sorry for the messy code, I just wanted to make you all can get an idea of the formatting. Anyway, I'm currently just trying to pull the value from QueueEntryID="d5fbbe98a1194e0da573b51a0c8040fb"
Any suggestions? Thank you!
推荐答案
我想您可以使用诸如DOM或SAX之类的标准XML工具来对此进行解析.否则,请玩re
或htmllib
.
I would imagine that you could use standard XML tools like DOM or SAX to parse this. Otherwise, have fun with re
or htmllib
.
这篇关于在Python中解析log4j的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!