How to prevent xml.ElementTree fromstring from dropping comment nodes


Problem description

I have the following code snippet:

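    from xml.etree.ElementTree import fromstring, tostring

    # The original snippet was a bare function body; the wrapper name is illustrative.
    def prefix_tags(input):
        mathml = fromstring(input)
        # iter() replaces getiterator(), which was removed in Python 3.9
        for elem in mathml.iter():
            elem.tag = 'm:' + elem.tag
        return tostring(mathml)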

When I feed it the following input:

<math>
  <a> 1 2 3 </a>  <b />
<foo>Uitleg</foo>
<!-- <bar> -->
</math>

the result is:

<m:math>
  <m:a> 1 2 3 </m:a>  <m:b />
<m:foo>Uitleg</m:foo>

</m:math>

How come? And how can I preserve the comment?

Edit: I don't care which XML library is used, but I need to be able to make the tag change shown above. Unfortunately, lxml does not seem to allow this (and I cannot use proper namespace operations).

Recommended answer

You cannot with xml.etree as is, because its default parser discards comments (which is acceptable behaviour for an XML parser, by the way). But you can if you use the (compatible) lxml library, which allows you to configure parser options.

from lxml import etree

parser = etree.XMLParser(remove_comments=False)
tree = etree.parse('input.xml', parser=parser)
# or alternatively set the parser as default:
# etree.set_default_parser(parser)

This is by far the easiest option. If you really have to use xml.etree, you could try hooking up your own parser, although even then comments are not officially supported: have a look at this example (from the author of xml.etree), which still seems to work in Python 2.7.
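For readers on Python 3.8 or later, hooking up a parser can be done with the standard library alone, because `TreeBuilder` accepts an `insert_comments=True` argument. The following is a minimal sketch under that assumption, not the approach from the linked example; the function name `prefix_tags_keep_comments` and the `prefix` parameter are made up for illustration. Comment nodes carry a non-string tag, so the loop skips them instead of trying to prefix them.

```python
import xml.etree.ElementTree as ET


def prefix_tags_keep_comments(xml_text, prefix="m:"):
    """Prefix every element tag while keeping comment nodes (Python 3.8+)."""
    # A TreeBuilder created with insert_comments=True adds comments to the tree
    # instead of dropping them.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)
    for elem in root.iter():
        # For comment nodes, elem.tag is the ET.Comment factory, not a string,
        # so only real elements are renamed.
        if isinstance(elem.tag, str):
            elem.tag = prefix + elem.tag
    return ET.tostring(root, encoding="unicode")


sample = """<math>
  <a> 1 2 3 </a>  <b />
<foo>Uitleg</foo>
<!-- <bar> -->
</math>"""

print(prefix_tags_keep_comments(sample))
```

With the input from the question, the printed document should still contain the `<!-- <bar> -->` comment alongside the prefixed tags.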
