Can ElementTree be told to preserve the order of attributes?


Question

I've written a fairly simple filter in Python using ElementTree to munge the contents of some XML files. And it works, more or less.

But it reorders the attributes of various tags, and I'd like it not to do that.

Does anyone know of a switch I can throw to make it keep them in the specified order?

I'm working with and on a particle physics tool that has a complex but oddly limited configuration system based on XML files. Among the many things set up that way are the paths to various static data files. These paths are hardcoded into the existing XML, there are no facilities for setting or varying them based on environment variables, and in our local installation they are necessarily in a different place.

This isn't a disaster, because the combined source- and build-control tool we're using allows us to shadow certain files with local copies. But even though the data fields are static, the XML isn't, so I've written a script for fixing the paths. With the attribute rearrangement, however, diffs between the local and master versions are harder to read than necessary.

This is my first time taking ElementTree for a spin (and only my fifth or sixth Python project), so maybe I'm just doing it wrong.

Abstracted for simplicity, the code looks like this:

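import xml.etree.ElementTree as ET

tree = ET.parse(inputfile)
for e in tree.iter():  # iter() replaces the deprecated getiterator()
    # filter is the script's own path-fixing function (it shadows the builtin)
    e.text = filter(e.text)
tree.write(outputfile)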

Reasonable or dumb?


Recommended answer

With help from @bobince's answer and these two (setting attribute order, overriding module methods)

I managed to get this monkey-patched. It's dirty, and I'd suggest using another module that better handles this scenario, but when that isn't a possibility:

# =======================================================================
# Monkey patch ElementTree
# (this mirrors the private serializer of Python 2.7's xml.etree.ElementTree)
import xml.etree.ElementTree as ET

def _serialize_xml(write, elem, encoding, qnames, namespaces):
    tag = elem.tag
    text = elem.text
    if tag is ET.Comment:
        write("<!--%s-->" % ET._encode(text, encoding))
    elif tag is ET.ProcessingInstruction:
        write("<?%s?>" % ET._encode(text, encoding))
    else:
        tag = qnames[tag]
        if tag is None:
            if text:
                write(ET._escape_cdata(text, encoding))
            for e in elem:
                _serialize_xml(write, e, encoding, qnames, None)
        else:
            write("<" + tag)
            items = elem.items()
            if items or namespaces:
                if namespaces:
                    for v, k in sorted(namespaces.items(),
                                       key=lambda x: x[1]):  # sort on prefix
                        if k:
                            k = ":" + k
                        write(" xmlns%s=\"%s\"" % (
                            k.encode(encoding),
                            ET._escape_attrib(v, encoding)
                            ))
                #for k, v in sorted(items):  # lexical order
                for k, v in items: # Monkey patch
                    if isinstance(k, ET.QName):
                        k = k.text
                    if isinstance(v, ET.QName):
                        v = qnames[v.text]
                    else:
                        v = ET._escape_attrib(v, encoding)
                    write(" %s=\"%s\"" % (qnames[k], v))
            if text or len(elem):
                write(">")
                if text:
                    write(ET._escape_cdata(text, encoding))
                for e in elem:
                    _serialize_xml(write, e, encoding, qnames, None)
                write("</" + tag + ">")
            else:
                write(" />")
    if elem.tail:
        write(ET._escape_cdata(elem.tail, encoding))

ET._serialize_xml = _serialize_xml

from collections import OrderedDict

class OrderedXMLTreeBuilder(ET.XMLTreeBuilder):
    def _start_list(self, tag, attrib_in):
        fixname = self._fixname
        tag = fixname(tag)
        attrib = OrderedDict()
        if attrib_in:
            for i in range(0, len(attrib_in), 2):
                attrib[fixname(attrib_in[i])] = self._fixtext(attrib_in[i+1])
        return self._target.start(tag, attrib)

# =======================================================================

Then in your code:

tree = ET.parse(pathToFile, OrderedXMLTreeBuilder())
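
As an aside for newer interpreters: since Python 3.8, xml.etree.ElementTree writes attributes in the order they were inserted instead of sorting them, so the patch above should not be needed there. Below is a minimal sketch, assuming Python 3.8 or later. The sample document, the `path` attribute, and the `relocate_paths` helper are made up for illustration and are not part of the original tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the element and attribute names are invented for this sketch.
SOURCE = '<config><file path="/data/a.dat" name="a" format="binary" /></config>'


def relocate_paths(xml_text, old_prefix, new_prefix):
    """Rewrite 'path' attributes that start with old_prefix; leave the rest alone."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        path = elem.get("path")
        if path is not None and path.startswith(old_prefix):
            # Updating an existing key keeps its position in the attribute dict.
            elem.set("path", new_prefix + path[len(old_prefix):])
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


result = relocate_paths(SOURCE, "/data", "/local/data")
print(result)
```

On Python 3.8 and later the attributes should come out as path, name, format, matching the input. On older versions the serializer sorts them, which is the behavior the monkey patch works around.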
