How to change tag name with BeautifulSoup?
Problem description
I am using Python + BeautifulSoup to parse an HTML document.
Now I need to replace all <h2 class="someclass"> elements in an HTML document with <h1 class="someclass">.
How can I change the tag name without changing anything else in the document?
Recommended answer
I don't know how you're accessing the tag, but the following works for me:
import BeautifulSoup

if __name__ == "__main__":
    data = """
<html>
<h2 class='someclass'>some title</h2>
<ul>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
<li>Vestibulum auctor dapibus neque.</li>
</ul>
</html>
"""
    soup = BeautifulSoup.BeautifulSoup(data)
    # Find the first <h2> and rename it; its attributes and contents are left alone.
    h2 = soup.find('h2')
    h2.name = 'h1'
    print soup
The output of the print soup command is:
<html>
<h1 class='someclass'>some title</h1>
<ul>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
<li>Vestibulum auctor dapibus neque.</li>
</ul>
</html>
As you can see, h2 became h1, and nothing else in the document changed. I am using Python 2.6 and BeautifulSoup 3.2.0.
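For readers on Python 3, the same assignment should also work with the newer bs4 package. The following is only a minimal sketch, not part of the original answer: it assumes the beautifulsoup4 package is installed, uses the standard-library "html.parser" backend, and the sample markup and variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Illustrative markup, shortened from the example above.
data = "<html><h2 class='someclass'>some title</h2><p>text</p></html>"

soup = BeautifulSoup(data, "html.parser")

h2 = soup.find("h2")
attrs_before = dict(h2.attrs)  # remember the attributes to compare later

# Assigning to .name renames both the opening and the closing tag.
h2.name = "h1"

# The attributes and the text inside the tag are left as they were.
print(h2.attrs == attrs_before)
print(soup)
```

Only the tag name is reassigned here, so the class attribute and the text content stay attached to the same tag object.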
If you have more than one h2 and you want to change them all, you could simply do:
soup = BeautifulSoup.BeautifulSoup(your_data)
# Keep renaming the first remaining <h2> until none are left.
while True:
    h2 = soup.find('h2')
    if not h2:
        break
    h2.name = 'h1'
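The question asks only about h2 elements that carry class="someclass", while the loop above renames every h2. A possible variant for that narrower case is sketched below. It is an assumption-laden illustration rather than the answerer's code: it targets Python 3 with the bs4 package (beautifulsoup4) instead of BeautifulSoup 3, and rename_tags, its parameters, and the sample markup are hypothetical names introduced here.

```python
from bs4 import BeautifulSoup


def rename_tags(html, old_name, new_name, css_class=None):
    """Return html with matching old_name tags renamed to new_name.

    Hypothetical helper: if css_class is given, only tags carrying
    that class are renamed; otherwise every old_name tag is renamed.
    """
    soup = BeautifulSoup(html, "html.parser")
    if css_class is None:
        matches = soup.find_all(old_name)
    else:
        matches = soup.find_all(old_name, class_=css_class)
    # find_all returns a list-like snapshot, so renaming while
    # iterating over it is safe.
    for tag in matches:
        tag.name = new_name
    return str(soup)


html = (
    "<h2 class='someclass'>first</h2>"
    "<h2 class='other'>second</h2>"
    "<h2 class='someclass'>third</h2>"
)
result = rename_tags(html, "h2", "h1", css_class="someclass")
print(result)
```

Collecting the matches once with find_all avoids searching the document again on every pass, and the class_ filter leaves h2 elements with other classes untouched.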