How to fetch content of XML root element in Python?


Problem description

I have an XML file, e.g.:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    First line. <br/> Second line.
</root>

As output I want to get: '\nFirst line. <br/> Second line.\n'. Note that if the root element contains other nested elements, they should be returned as is.

Recommended answer

The first thing I came up with:

from xml.etree.ElementTree import fromstring, tostring

source = '''<?xml version="1.0" encoding="UTF-8"?>
<root>
    First line.<br/>Second line.
</root>
'''

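xml = fromstring(source)
# encoding='unicode' makes tostring() return str rather than bytes (Python 3)
result = tostring(xml, encoding='unicode').lstrip('<%s>' % xml.tag).rstrip('</%s>' % xml.tag)

print(result)

# output:
#
#     First line.<br />Second line.
#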

But this is not a truly general-purpose approach, since it fails if the opening root tag (<root>) contains any attributes.
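To make the failure concrete, here is a small sketch of the same strip-based idea applied to a root element with an attribute. The attribute name attr1 and the one-line input are only illustrative.

```python
from xml.etree.ElementTree import fromstring, tostring

source = '<root attr1="val1">First line.<br/>Second line.</root>'

xml = fromstring(source)
serialized = tostring(xml, encoding='unicode')

# lstrip stops at the space after "<root", so the attribute
# and the closing ">" of the opening tag are left in the result.
result = serialized.lstrip('<%s>' % xml.tag).rstrip('</%s>' % xml.tag)

print(repr(result))
```

The result still begins with the leftover attribute text instead of the element content.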

UPDATE: This approach has another issue. Since lstrip and rstrip remove any combination of the given characters rather than a literal prefix or suffix, you can run into a problem like this:

# input:
<?xml version="1.0" encoding="UTF-8"?><root><p>First line</p></root>

# result:
p>First line</p
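The behavior above can be reproduced without any XML parsing. The sketch below contrasts the character-set stripping with a literal prefix/suffix removal; strip_literal is a hypothetical helper written for this illustration (on Python 3.9+ str.removeprefix and str.removesuffix do the same job). Note that literal removal still does not handle attributes in the opening tag.

```python
def strip_literal(text, prefix, suffix):
    """Remove prefix and suffix only if they match literally."""
    if text.startswith(prefix):
        text = text[len(prefix):]
    if text.endswith(suffix):
        text = text[:-len(suffix)]
    return text

tag = 'root'
serialized = '<root><p>First line</p></root>'

# lstrip/rstrip treat their argument as a set of characters to remove
stripped = serialized.lstrip('<%s>' % tag).rstrip('</%s>' % tag)

# literal removal keeps the nested <p> element intact
trimmed = strip_literal(serialized, '<%s>' % tag, '</%s>' % tag)

print(stripped)  # p>First line</p
print(trimmed)   # <p>First line</p>
```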

If you really need only the literal string between the opening and closing tags (as you mentioned in the comment), you can use this:

from xml.etree.ElementTree import fromstring, tostring

source = '''<?xml version="1.0" encoding="UTF-8"?>
<root attr1="val1">
    First line.<br/>Second line.
</root>
'''

# Parsing and re-serializing is only needed to drop
# the XML declaration, doctype, etc.
xml = fromstring(source)
# encoding='unicode' makes tostring() return str rather than bytes (Python 3)
xml_str = tostring(xml, encoding='unicode')

# End of the opening root tag and start of the closing root tag
start = xml_str.index('>')
end = xml_str.rindex('<')

result = xml_str[start + 1:end]

Not the most elegant approach, but unlike the previous one it works correctly with attributes in the opening tag, as well as with any valid XML document.
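As a possible alternative that avoids slicing the serialized string, the content can be assembled from the root's leading text plus each serialized child. This is only a sketch and not part of the original answer; inner_xml is a hypothetical helper name. It relies on tostring including each child's tail text. Because the content is re-serialized, details such as `<br/>` versus `<br />` may differ from the input.

```python
from xml.etree.ElementTree import fromstring, tostring
from xml.sax.saxutils import escape

def inner_xml(source):
    """Return the serialized content between the root's tags."""
    root = fromstring(source)
    # root.text is stored unescaped, so escape it again for output
    parts = [escape(root.text or '')]
    for child in root:
        # tostring() emits the child element followed by its tail text
        parts.append(tostring(child, encoding='unicode'))
    return ''.join(parts)

source = '''<?xml version="1.0" encoding="UTF-8"?>
<root attr1="val1">
    First line.<br/>Second line.
</root>
'''

print(repr(inner_xml(source)))
# '\n    First line.<br />Second line.\n'
```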
