Why does the xml package modify my xml file in Python3?

Question

I use the xml library in Python 3.5 to read and write an xml file. I don't modify the file; I just open it and write it back. But the library modifies the file.

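  1. Why is it modified?
  2. How can I prevent this? E.g. I just want to replace a specific tag or its value in a quite complex xml file without losing any other information.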

This is the example file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
    <title>Der Eisbär</title>
    <ids>
        <entry>
            <key>tmdb</key>
            <value xsi:type="xs:int" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9321</value>
        </entry>
        <entry>
            <key>imdb</key>
            <value xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">tt0167132</value>
        </entry>
    </ids>
</movie>

This is the code:

import xml.etree.ElementTree as ET
tree = ET.parse('x.nfo')
tree.write('y.nfo', encoding='utf-8')

The xml file becomes this:

<movie xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <title>Der Eisbär</title>
    <ids>
        <entry>
            <key>tmdb</key>
            <value xsi:type="xs:int">9321</value>
        </entry>
        <entry>
            <key>imdb</key>
            <value xsi:type="xs:string">tt0167132</value>
        </entry>
    </ids>
</movie>

  • Line 1 is gone.
  • The <movie> tag in line 2 now has attributes.
  • The <value> tags in lines 7 and 11 now have fewer attributes.

Recommended answer

Note that "xml package" and "the xml library" are ambiguous. There are several XML-related modules in the standard library: https://docs.python.org/3/library/xml.html. The code in the question uses xml.etree.ElementTree.

Why is it modified?

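ElementTree moves namespace declarations to the root element, and it removes declarations for namespaces that are not used in any element or attribute name. In the example file, the xs prefix appears only inside attribute values (xs:int, xs:string), so its declaration is dropped.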

Why does ElementTree do this? I don't know, but perhaps it is a way to make the implementation simpler.
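The following sketch is offered as an observation, not as an account of ElementTree's design. It parses a shortened version of the file from the question (the `SOURCE` string is an abbreviation made for this example) and shows that, after parsing, the prefixed attribute name is stored in expanded `{namespace-uri}name` form, while the `xmlns:*` declarations are not kept as attributes of the element. On output, ElementTree has to generate the declarations again.

```python
import xml.etree.ElementTree as ET

# Shortened version of the file from the question (abbreviated for this sketch).
SOURCE = (
    '<movie><ids><entry><key>tmdb</key>'
    '<value xsi:type="xs:int"'
    ' xmlns:xs="http://www.w3.org/2001/XMLSchema"'
    ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9321</value>'
    '</entry></ids></movie>'
)

root = ET.fromstring(SOURCE)
value = root.find('./ids/entry/value')

# Only the xsi:type attribute survives, with its name expanded to
# {namespace-uri}type. The xmlns:xs and xmlns:xsi declarations are not stored.
print(value.attrib)

# Serializing declares xsi on the root element; xs is not declared at all,
# because it occurs only inside the attribute value "xs:int".
result = ET.tostring(root, encoding='unicode')
print(result)
```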

How can I prevent this? E.g. I just want to replace a specific tag or its value in a quite complex xml file without losing any other information.

I don't think there is a way to prevent this with ElementTree. The issue has been brought up before in two very similar questions, neither of which has an answer.

My suggestion is to use lxml instead of ElementTree. With lxml, the namespace declarations remain where they occur in the original file.
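A minimal sketch of the lxml approach, assuming lxml is installed (it is a third-party package, for example via `pip install lxml`). The `SOURCE` bytes are a shortened version of the file from the question, and the replacement value `1234` is made up for illustration. The output is not guaranteed to be byte-identical to the input (quoting in the XML declaration and attribute order may differ), but the declarations stay on the `<value>` element and the root element gains none.

```python
from lxml import etree  # third-party package, not part of the standard library

# Shortened version of the file from the question (abbreviated for this sketch).
SOURCE = b'''<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<movie>
    <value xsi:type="xs:int" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9321</value>
</movie>'''

root = etree.fromstring(SOURCE)

# Replace the value of one specific tag (1234 is an arbitrary example value).
root.find('value').text = '1234'

# Ask explicitly for the XML declaration and the standalone flag on output.
output = etree.tostring(root, encoding='UTF-8',
                        xml_declaration=True, standalone=True)
print(output.decode('utf-8'))
```

For files on disk, `etree.parse('x.nfo')` followed by `tree.write('y.nfo', encoding='UTF-8', xml_declaration=True, standalone=True)` follows the same pattern as the code in the question.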

Line 1 is gone.

That line is the XML declaration. It is recommended but not mandatory to have one.

If you always want an XML declaration, use xml_declaration=True in the write() method call.
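A short sketch of that call, writing to an in-memory buffer instead of a file so the result can be inspected directly (the one-element document is a stand-in for the real file). Note that ElementTree writes its own declaration; the standalone="yes" part of the original line is not reproduced.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in document; with a real file this would be ET.parse('x.nfo').
tree = ET.ElementTree(ET.fromstring('<movie><title>Der Eisbär</title></movie>'))

# write() accepts a file name or a file object opened for writing.
buffer = io.BytesIO()
tree.write(buffer, encoding='UTF-8', xml_declaration=True)

output = buffer.getvalue()
print(output.decode('utf-8'))
```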
