Removing Processing Instructions with Python lxml

Views: 63 | Published: 2020/5/4 8:30:36 | Tags: python, xml, lxml


Problem Description

I am using the Python lxml library to transform XML files to a new schema, but I've run into problems handling processing instructions in the XML body.

The processing instructions are scattered throughout the XML, as in the following example (they all begin with "oasys" and end with a unique code):

string = "<text><?oasys _dc21-?>Text <i>contents</i></text>"

I can't locate them with the findall() method, although getchildren() on the parent element returns them:

import lxml.etree

tree = lxml.etree.fromstring(string)
print(tree.findall(".//"))
# [<Element i at 0x747c>]
print(tree.getchildren())
# [<?oasys _dc21-?>, <Element i at 0x747c>]
print(tree.getchildren()[0].tag)
# <built-in function ProcessingInstruction>
print(tree.getchildren()[0].tail)
# Text

Is there an alternative to using getchildren() to find and remove processing instructions, especially considering that they're nested at various levels throughout the XML?

Recommended Answer

You can use the processing-instruction() XPath node test to find the processing instructions and remove them with etree.strip_tags(). strip_tags() drops only the matched nodes and leaves their tail text in place, so the text that follows each instruction stays in the document.

Example:

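from lxml import etree

string = "<text><?oasys _dc21-?>Text <i>contents</i></text>"
tree = etree.fromstring(string)

# Find every processing instruction at any depth.
pis = tree.xpath("//processing-instruction()")
for pi in pis:
    # pi.tag is the etree.ProcessingInstruction factory, which
    # strip_tags() accepts in place of a tag name.
    etree.strip_tags(pi.getparent(), pi.tag)

# encoding="unicode" returns a str rather than bytes on Python 3.
print(etree.tostring(tree, encoding="unicode"))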

Output:

<text>Text <i>contents</i></text>
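
Two variations on the answer above may be useful. This is a minimal sketch, not part of the original answer: the helper names strip_all_pis and remove_pis_by_target and the sample document are hypothetical, chosen for illustration. The first helper passes the root element to strip_tags() once instead of looping over parents. The second removes only instructions with a given target (such as "oasys") and moves each instruction's tail text onto its neighbour first, because Element.remove() discards the tail along with the node.

```python
from lxml import etree


def strip_all_pis(root):
    """Remove every processing instruction below root in a single call."""
    # strip_tags() accepts the ProcessingInstruction factory as a "tag";
    # the text that followed each instruction stays in the tree.
    etree.strip_tags(root, etree.ProcessingInstruction)


def remove_pis_by_target(root, target):
    """Remove only the processing instructions whose target is `target`."""
    # Collect the matches first so the tree is not modified while iterating.
    matches = [pi for pi in root.iter(etree.ProcessingInstruction)
               if pi.target == target]
    for pi in matches:
        parent = pi.getparent()
        if parent is None:
            continue  # not attached to an element; nothing to remove from
        if pi.tail:
            previous = pi.getprevious()
            if previous is not None:
                # Append the tail to the preceding sibling's tail.
                previous.tail = (previous.tail or "") + pi.tail
            else:
                # No preceding sibling: the tail becomes parent text.
                parent.text = (parent.text or "") + pi.tail
        parent.remove(pi)


# Hypothetical sample with two "oasys" instructions and one unrelated one.
sample = ("<text><?oasys _dc21-?>Text <i>con<?other keep?>tents</i>"
          "<?oasys _x?> end</text>")

tree = etree.fromstring(sample)
remove_pis_by_target(tree, "oasys")
print(etree.tostring(tree, encoding="unicode"))
# <text>Text <i>con<?other keep?>tents</i> end</text>

strip_all_pis(tree)
print(etree.tostring(tree, encoding="unicode"))
# <text>Text <i>contents</i> end</text>
```

If every processing instruction in the document should go, the single strip_tags() call is the simpler choice; the target-based helper is only worth the extra code when other instructions must be kept.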
