Python: how to strip whitespace from XML text nodes

Question

I have an XML file as follows:

<Person>
<name>

 My Name

</name>
<Address>My Address</Address>
</Person>

The <name> tag contains extra newlines. Is there a quick, Pythonic way to trim them and generate a new XML file?

I found this, but it trims only the whitespace between tags, not the whitespace inside the values: https://skyl.org/log/post/skyl/2010/04/remove-insignificant-whitespace-from-xml-string-with-python/

Update 1: handle the following XML, which has tail whitespace inside the <name> tag:

<Person>
<name>

 My Name<shortname>My</shortname>

</name>
<Address>My Address</Address>
</Person>

The accepted answer handles both kinds of XML shown above.

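Update 2: I posted my own version in an answer below. I use it to remove all kinds of whitespace and to write pretty-printed XML, with an encoding declaration, to a file: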

https://stackoverflow.com/a/19396130/973699

Recommended answer

With lxml you can iterate over all elements and call strip() on each element's text when it has any:

from lxml import etree

tree = etree.parse('xmlfile')
root = tree.getroot()

for elem in root.iter('*'):
    if elem.text is not None:
        elem.text = elem.text.strip()

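# encoding="unicode" returns a str, so Python 3 prints the markup rather than a bytes repr
print(etree.tostring(root, encoding="unicode"))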

This produces:

<Person><name>My Name</name>
<Address>My Address</Address>
</Person>
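
The loop above only touches each element's .text, so whitespace that follows a child element's closing tag is left alone: ElementTree-style APIs store that in .tail instead. The sketch below illustrates the difference. It assumes the standard library's xml.etree.ElementTree rather than lxml (so it runs without extra packages; lxml exposes the same .text and .tail attributes), and the sample string is modeled on the question's second XML with the closing tag corrected.

```python
import xml.etree.ElementTree as ET

# Sample modeled on the question's second XML (closing tag corrected).
SAMPLE = """<Person>
<name>

 My Name<shortname>My</shortname>

</name>
<Address>My Address</Address>
</Person>"""

root = ET.fromstring(SAMPLE)
name = root.find("name")
shortname = name.find("shortname")

# .text is the text between an element's start tag and its first child.
print(repr(name.text))       # '\n\n My Name'

# .tail is the text after an element's end tag, up to the next tag.
print(repr(shortname.tail))  # '\n\n'
```

Here the blank lines before </name> belong to shortname.tail, not to name.text, which is why stripping .text alone cannot remove them.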


Update: to strip tail text too:

from lxml import etree

tree = etree.parse('xmlfile')
root = tree.getroot()

for elem in root.iter('*'):
    if elem.text is not None:
        elem.text = elem.text.strip()
    if elem.tail is not None:
        elem.tail = elem.tail.strip()

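# tostring() returns bytes when given a byte encoding; decode so Python 3 prints the markup
print(etree.tostring(root, encoding="utf-8", xml_declaration=True).decode("utf-8"))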
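
For trying the same idea without a file on disk, here is a minimal sketch that wraps the text-and-tail loop in a function operating on a string. The function name strip_whitespace and the sample strings are hypothetical, and it again assumes the standard library's xml.etree.ElementTree in place of lxml; it is one possible arrangement, not the answer's own code.

```python
import xml.etree.ElementTree as ET


def strip_whitespace(xml_string):
    """Strip leading/trailing whitespace from every element's text and tail.

    Hypothetical helper: parses a string and returns the serialized result.
    """
    root = ET.fromstring(xml_string)
    for elem in root.iter():
        if elem.text is not None:
            elem.text = elem.text.strip()
        if elem.tail is not None:
            elem.tail = elem.tail.strip()
    # encoding="unicode" makes tostring() return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


sample = (
    "<Person>\n<name>\n\n My Name<shortname>My</shortname>\n\n</name>\n"
    "<Address>My Address</Address>\n</Person>"
)
result = strip_whitespace(sample)
print(result)

# Caveat: in mixed content, stripping also removes the spaces between words.
mixed = strip_whitespace("<p>Hello <b>big</b> world</p>")
print(mixed)
```

Note the second call: because strip() removes the spaces around the inline element, "Hello big world" is serialized as Hello<b>big</b>world. This approach suits data-style XML like the Person example; for prose-like documents with inline markup, stripping every text and tail may change the meaning.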
