使用python解析xml中的CDATA [英] Parsing CDATA in xml with python

查看:1921
本文介绍了使用python解析xml中的CDATA的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要解析一个XML文件,其中包含一些我需要保留的CDATA块,以便以后进行绘图:

I need to parse an XML file with a number of blocks of CDATA that I need to retain for later plotting:

<process id="process1"> <log name="name1" device="device1"><![CDATA[timestamp value]]]></log> <log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]]></log> </process>

<process id="process1"> <log name="name1" device="device1"><![CDATA[timestamp value]]]></log> <log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]]></log> </process>

我将需要重复且快速地执行此操作,而我正在寻找执行此操作的最佳方法.我读过ElementTree是方法中比较快的方法,但是我对其他建议持开放态度.

I will need to do this repeatedly and quickly, and I am looking for the best way to do this. I've read that ElementTree is the faster of the methods, but I am open to other suggestions.

推荐答案

以下是两个示例:

from lxml import etree
import xml.etree.ElementTree as ElementTree

CONTENT = """
<process id="process1">
 <log name="name1" device="device1"><![CDATA[timestamp value]]></log>
 <log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log>
</process>
"""

def parse_with_lxml():
    root = etree.fromstring(CONTENT)
    for log in root.xpath("//log"):
        print log.text

def parse_with_stdlib():
    root = ElementTree.fromstring(CONTENT)
    for log in root.iter('log'):
        print log.text

if __name__ == '__main__':
    parse_with_lxml()
    parse_with_stdlib()

输出:

timestamp value
timestamp value, timestamp value, timestamp
timestamp value
timestamp value, timestamp value, timestamp

在两种情况下,它都会处理text属性.

The text attribute it handles it in both cases.

这篇关于使用python解析xml中的CDATA的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆