Python and ElementTree: return "inner XML" excluding parent element

Views: 63 | Posted: 2020/10/28 20:38:47 | Tags: python xml elementtree


Problem description

In Python 2.6 using ElementTree, what's a good way to fetch the XML (as a string) inside a particular element, like what you can do in HTML and JavaScript with innerHTML?

Here's a simplified sample of the XML node I am starting with:

<label attr="foo" attr2="bar">This is some text <a href="foo.htm">and a link</a> in embedded HTML</label>

I'd like to end up with this string:

This is some text <a href="foo.htm">and a link</a> in embedded HTML

I've tried iterating over the parent node and concatenating the tostring() of the children, but that gave me only the subnodes:

# returns only subnodes (e.g. <a href="foo.htm">and a link</a>)
''.join([et.tostring(sub, encoding="utf-8") for sub in node])

I can hack up a solution using regular expressions, but was hoping there'd be something less hacky than this:

re.sub("</\w+?>\s*?$", "", re.sub("^\s*?<\w*?>", "", et.tostring(node, encoding="utf-8")))

Solution

How about:

from xml.etree import ElementTree as ET

xml = '<root>start here<child1>some text<sub1/>here</child1>and<child2>here as well<sub2/><sub3/></child2>end here</root>'
root = ET.fromstring(xml)

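def content(tag):
    # tag.text is None when the element has no leading text, so fall back to ''.
    # ET.tostring(e) already includes the tail text that follows each child.
    return (tag.text or '') + ''.join(ET.tostring(e) for e in tag)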

print content(root)
print content(root.find('child2'))

Resulting in:

start here<child1>some text<sub1 />here</child1>and<child2>here as well<sub2 /><sub3 /></child2>end here
here as well<sub2 /><sub3 />
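
The answer above targets Python 2, where ET.tostring() returns a plain string. For Python 3, a minimal sketch of the same idea might look like the following. The function name inner_xml and the call to xml.sax.saxutils.escape() are assumptions added here for illustration, not part of the original answer: the escape step re-encodes characters such as & in the leading text, which ElementTree stores unescaped.

```python
from xml.etree import ElementTree as ET
from xml.sax.saxutils import escape


def inner_xml(element):
    """Return the serialized content of `element` without its own tags.

    Hypothetical helper (not from the original answer), written for Python 3.
    """
    # element.text is None when no text precedes the first child.
    # escape() re-encodes &, < and > so the result stays valid XML.
    parts = [escape(element.text or "")]
    for child in element:
        # encoding="unicode" makes tostring() return str rather than bytes.
        # The serialized child already includes its tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


# The sample node from the question.
label = ET.fromstring(
    '<label attr="foo" attr2="bar">This is some text '
    '<a href="foo.htm">and a link</a> in embedded HTML</label>'
)
print(inner_xml(label))
# prints: This is some text <a href="foo.htm">and a link</a> in embedded HTML
```

This also shows why joining tostring() over the children alone falls short: the leading text lives in the parent's text attribute, while the text after each child is stored as that child's tail and is serialized together with it.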
