Beautiful Soup 4: How to replace a tag with text and another tag?

Question

I want to replace a tag with another tag and put the contents of the old tag before the new one. For example:

I want to change this:

<html>
<body>
<p>This is the <span id="1">first</span> paragraph</p>
<p>This is the <span id="2">second</span> paragraph</p>
</body>
</html>

into this:

<html>
<body>
<p>This is the first<sup>1</sup> paragraph</p>
<p>This is the second<sup>2</sup> paragraph</p>
</body>
</html>

I can easily find all the spans with find_all(), get the number from the id attribute, and replace one tag with another using replace_with(). But how do I replace a tag with text plus a new tag, or insert text before the replacement tag?

Recommended answer

The idea is to find every span tag that has an id attribute (the span[id] CSS selector), use insert_after() to insert a sup tag after it, and then call unwrap() to replace the span with its contents:

from bs4 import BeautifulSoup

data = """
<html>
<body>
<p>This is the <span id="1">first</span> paragraph</p>
<p>This is the <span id="2">second</span> paragraph</p>
</body>
</html>
"""

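# name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(data, 'html.parser')
for span in soup.select('span[id]'):
    # insert a sup tag holding the id right after the span
    sup = soup.new_tag('sup')
    sup.string = span['id']
    span.insert_after(sup)

    # replace the span tag with its contents
    span.unwrap()

print(soup)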

Prints:

<html>
<body>
<p>This is the first<sup>1</sup> paragraph</p>
<p>This is the second<sup>2</sup> paragraph</p>
</body>
</html>
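
The question also asks about inserting text before a replaced tag. As a rough sketch of that variant (not part of the original answer), the span's text can be placed in front of the span with insert_before() and the span itself then swapped for the sup tag with replace_with(). The shortened html string and the result variable are made up for this illustration:

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical input based on the example in the question.
html = """
<p>This is the <span id="1">first</span> paragraph</p>
<p>This is the <span id="2">second</span> paragraph</p>
"""

soup = BeautifulSoup(html, "html.parser")

for span in soup.find_all("span", id=True):
    # Build the replacement <sup> tag from the span's id attribute.
    sup = soup.new_tag("sup")
    sup.string = span["id"]

    # Put the span's text in front of the span, then swap the span for <sup>.
    span.insert_before(span.get_text())
    span.replace_with(sup)

result = str(soup)
print(result)
```

Note that get_text() flattens any markup nested inside the span to plain text, whereas unwrap() keeps nested tags intact, so the insert_after() and unwrap() approach is probably the safer choice when the spans may contain other tags.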
