在 Python 中获取 XML 节点的路径 [英] Getting the Path of an XML Node in Python

查看:25
本文介绍了在 Python 中获取 XML 节点的路径的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我经常处理相当复杂的文档,我有效地对其进行了逆向工程.我需要编写代码来修改某些节点的值,并将到达该节点的路径作为条件.

在检查 XML 数据以了解其结构时,拥有此路径的表示将非常有用.然后我可以使用代码中的路径到达该特定节点.

例如,我可以从下面的文档中获取表示 Catalog > Book > Textbook > Author.

Python 中是否有任何库允许我执行此操作?

<目录><书><教科书><作者=詹姆斯"/></教科书><期刊><科学><作者=詹姆斯"/></科学></期刊></目录></xml>

解决方案

lxml 库有一个 etree.Element.getpath 方法(在上一个链接中搜索 Generating XPath 表达式)"它返回一个结构化的、绝对的 XPath 表达式来查找该元素".以下是 lxml 库文档中的示例:

<预><代码>>>>a = etree.Element("a")>>>b = etree.SubElement(a, "b")>>>c = etree.SubElement(a, "c")>>>d1 = etree.SubElement(c, "d")>>>d2 = etree.SubElement(c, "d")>>>树 = etree.ElementTree(c)>>>打印(树.getpath(d2))/c/d[2]>>>tree.xpath(tree.getpath(d2)) == [d2]真的

I am often working with quite complex documents that I am effectively reverse engineering. I need to write code to modify the value of certain nodes with the path traversed to reach the node acting as a condition.

When examining the XML data to understand it's structure, it would be very useful to have a representation on of this path. I can then use the path in the code to get to that specific node.

For example, I can get the representation Catalog > Book > Textbook > Author from the document below.

Are there any libraries in Python that allows me to do this?

<xml>
    <Catalog>
        <Book>
            <Textbook>
                <Author ="James" />
            </Textbook>
        </Book>
        <Journal>
            <Science>
                <Author ="James" />
            </Science>
        </Journal>
    </Catalog>
</xml>

解决方案

The lxml library has a etree.Element.getpath method (search for Generating XPath expressions in the previous link) "which returns a structural, absolute XPath expression to find that element". Here's an example from the lxml library documentation:

>>> a  = etree.Element("a")
>>> b  = etree.SubElement(a, "b")
>>> c  = etree.SubElement(a, "c")
>>> d1 = etree.SubElement(c, "d")
>>> d2 = etree.SubElement(c, "d")

>>> tree = etree.ElementTree(c)
>>> print(tree.getpath(d2))
/c/d[2]
>>> tree.xpath(tree.getpath(d2)) == [d2]
True

这篇关于在 Python 中获取 XML 节点的路径的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆