Python: remove everything between <div class="comment"> ... </div>
Problem description
How do you use Python 2.6 to remove everything between <div class="comment"> and </div>, including the tags themselves?
I tried various approaches using re.sub without any success.
Thank you.
Solution
This can be done easily and reliably using an HTML parser like BeautifulSoup:
>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup('<body><div>1</div><div class="comment"><strong>2</strong></div></body>')
>>> for div in soup.findAll('div', 'comment'):
... div.extract()
...
<div class="comment"><strong>2</strong></div>
>>> soup
<body><div>1</div></body>
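If BeautifulSoup is not available, the same parser-based idea can be approximated with the standard library alone. The following is a minimal sketch, not the method from the answer above: it assumes Python 3 and `html.parser`, and the names `CommentDivStripper` and `strip_comment_divs` are made up for illustration. It re-emits the input while skipping every `<div>` whose class list contains `comment`, counting nested `<div>` tags so that the skip ends at the matching `</div>`. As a simplification, HTML comments and the doctype are dropped from the output.

```python
from html.parser import HTMLParser


class CommentDivStripper(HTMLParser):
    """Re-emit the input HTML, skipping every <div class="comment"> subtree."""

    def __init__(self):
        # Keep entity and character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a comment div

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag == "div":
                self.skip_depth += 1  # nested div inside the skipped subtree
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "comment" in classes:
            self.skip_depth = 1  # start skipping at this div
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, unchanged.
        if not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag == "div":
                self.skip_depth -= 1  # reaches 0 at the matching </div>
            return
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append("&%s;" % name)

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append("&#%s;" % name)


def strip_comment_divs(html):
    """Return html with all <div class="comment"> elements removed."""
    parser = CommentDivStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(strip_comment_divs(
    '<body><div>1</div><div class="comment"><strong>2</strong></div></body>'
))
# prints: <body><div>1</div></body>
```

For real-world, malformed markup a full parser library such as BeautifulSoup is likely to be more forgiving than this sketch.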
See this question for examples on why parsing HTML using regular expressions is a bad idea.
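As a small illustration of the problem, here is a sketch (assuming Python 3; the sample strings are invented for this example) of how the two obvious re.sub patterns go wrong. The non-greedy pattern stops at the first `</div>` it finds, so a nested div leaves debris behind, and the greedy pattern runs to the last `</div>`, so it swallows content that should have been kept.

```python
import re

# A comment div that contains another div.
nested = '<div class="comment"><div>inner</div>tail</div><p>keep</p>'

# Non-greedy: the match ends at the FIRST </div>, which closes the inner div.
lazy = re.sub(r'<div class="comment">.*?</div>', '', nested, flags=re.S)
print(lazy)  # prints: tail</div><p>keep</p>

# Two separate comment divs with wanted content between them.
two = '<div class="comment">a</div><p>keep</p><div class="comment">b</div>'

# Greedy: the match runs to the LAST </div>, deleting the <p> as well.
greedy = re.sub(r'<div class="comment">.*</div>', '', two, flags=re.S)
print(repr(greedy))  # prints: ''
```

A regular expression has no notion of which `</div>` closes which `<div>`, which is what a parser keeps track of.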