In lxml, how do I remove a tag but retain all contents?


Problem description

The problem is this: I have an XML fragment like so:

<fragment>text1 <a>inner1 </a>text2 <b>inner2</b> <c>t</c>ext3</fragment>

For the result, I want to remove all <a> and <c> tags but retain their text contents and child nodes just as they are. The <b> element should be left untouched. The result should look like this:

<fragment>text1 inner1 text2 <b>inner2</b> text3</fragment>

For the time being, I'll fall back on a very dirty trick: I'll serialize the fragment with etree.tostring, remove the offending tags via regex, and replace the original fragment with the etree.fromstring result of this (not the real code, but it should go something like this):

from lxml import etree
fragment = etree.fromstring("<fragment>text1 <a>inner1 </a>text2 <b>inner2</b> <c>t</c>ext3</fragment>")
# Serialize to str (not bytes) so that str.replace also works on Python 3.
fstring = etree.tostring(fragment, encoding="unicode")
# Crude: delete the literal start and end tags; this breaks on tags with attributes.
for tag in ("<a>", "</a>", "<c>", "</c>"):
    fstring = fstring.replace(tag, "")
fragment = etree.fromstring(fstring)

I know that I can probably use XSLT to achieve this, and I know that lxml can make use of XSLT, but surely there is a more lxml-native approach?
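For comparison, the XSLT route mentioned above could be sketched roughly as follows. This is only an illustration, not code from the question: the stylesheet (an identity template plus a template that unwraps <a> and <c>) and the names `transform` and `xslt_result` are assumptions made for this example.

```python
from lxml import etree

# Illustrative stylesheet (an assumption, not from the original post).
xslt = etree.XML("""\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <!-- For <a> and <c>, emit only their children, dropping the tag itself. -->
  <xsl:template match="a|c">
    <xsl:apply-templates select="node()"/>
  </xsl:template>
</xsl:stylesheet>
""")
transform = etree.XSLT(xslt)

fragment = etree.fromstring(
    "<fragment>text1 <a>inner1 </a>text2 <b>inner2</b> <c>t</c>ext3</fragment>"
)
# The transform returns a new result tree; the input fragment is not modified.
result_tree = transform(fragment)
xslt_result = etree.tostring(result_tree.getroot(), encoding="unicode")
print(xslt_result)
```

Unlike an in-place edit, this produces a new tree, so the original element would still need to be swapped out for the transformed one.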

For reference: I've tried getting there with lxml's element.replace, but since I want to insert text where there was an element node before, I don't think I can do that.

Recommended answer

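Try etree.strip_tags, which removes the named tags from the tree in place and merges their text and children into the parent: http://lxml.de/api/lxml.etree-module.html#strip_tags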

>>> etree.strip_tags(fragment, 'a', 'c')
>>> etree.tostring(fragment, encoding='unicode')
'<fragment>text1 inner1 text2 <b>inner2</b> text3</fragment>'
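The question also asks that child nodes of the removed tags survive. A minimal self-contained sketch of that case follows; the nested <d> element is a hypothetical variation on the question's fragment, added here only to show that children are kept.

```python
from lxml import etree

# Hypothetical variant of the question's fragment: <a> now contains a child <d>.
fragment = etree.fromstring(
    "<fragment>text1 <a>inner<d>1</d> </a>text2 <b>inner2</b> <c>t</c>ext3</fragment>"
)

# Strip the <a> and <c> tags in place. Their text, tail text and child
# elements are merged into the parent; <b> is left untouched.
etree.strip_tags(fragment, "a", "c")

result = etree.tostring(fragment, encoding="unicode")
print(result)
# <fragment>text1 inner<d>1</d> text2 <b>inner2</b> text3</fragment>
```

Note that strip_tags modifies the tree it is given and returns None, so there is no need to reassign `fragment`.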
