Insert html string into BeautifulSoup object
Question
I am trying to insert an HTML string into a BeautifulSoup object. If I insert it directly, bs4 sanitizes the HTML. If I take the HTML string, create a soup from it, and insert that, I have problems using the find function. This thread on SO suggests that inserting BeautifulSoup objects can cause problems. I am using the solution from that post and recreating the soup each time I do an insert.
But surely there's a better way to insert an html string into a soup.
I'll add some code as an example of where the problem lies:
from bs4 import BeautifulSoup
mainSoup = BeautifulSoup("""
<html>
<div class='first'></div>
<div class='second'></div>
</html>
""")
extraSoup = BeautifulSoup('<span class="first-content"></span>')
tag = mainSoup.find(class_='first')
tag.insert(1, extraSoup)
print(mainSoup.find(class_='second'))
# prints None
Recommended answer
If you already have an HTML string, the simplest way is to parse it into another BeautifulSoup object and insert that.
from bs4 import BeautifulSoup
doc = '''
<div>
test1
</div>
'''
soup = BeautifulSoup(doc, 'html.parser')
soup.div.append(BeautifulSoup('<div>insert1</div>', 'html.parser'))
print(soup.prettify())
Output:
<div>
test1
<div>
insert1
</div>
</div>
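To illustrate why the string has to be parsed first, here is a minimal sketch that contrasts appending a raw string with appending a parsed tag. The variable names (`soup_raw`, `soup_parsed`, `fragment`) are made up for this example, and it assumes the built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# Case 1: append the markup as a plain str.
# bs4 treats it as text, so the angle brackets are escaped on output.
soup_raw = BeautifulSoup("<div>test1</div>", "html.parser")
soup_raw.div.append("<b>raw</b>")
escaped = str(soup_raw)

# Case 2: parse the markup first, then append the resulting Tag.
soup_parsed = BeautifulSoup("<div>test1</div>", "html.parser")
fragment = BeautifulSoup("<b>parsed</b>", "html.parser")
soup_parsed.div.append(fragment.b)
parsed = str(soup_parsed)

print(escaped)
print(parsed)
```

In the first case `soup_raw.find("b")` finds nothing, because the inserted content is only text; in the second case the `<b>` element is a real node in the tree.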
Update 1
How about this? The idea is to use BeautifulSoup to generate the right AST node (the span tag) and insert that node rather than the whole soup. This appears to avoid the "None" problem.
from bs4 import BeautifulSoup
mainSoup = BeautifulSoup("""
<html>
<div class='first'></div>
<div class='second'></div>
</html>
""", 'html.parser')
extraSoup = BeautifulSoup('<span class="first-content"></span>', 'html.parser')
tag = mainSoup.find(class_='first')
# Insert the parsed <span> Tag itself, not the wrapping BeautifulSoup object
tag.insert(1, extraSoup.span)
print(mainSoup.find(class_='second'))
Output:
<div class="second"></div>
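Using `extraSoup.span` picks out a single tag, so an HTML string with several top-level elements would need each node inserted in turn. One possible way to wrap that up is sketched below. `insert_html` is a hypothetical helper written for this example, not part of bs4, and it assumes the `html.parser` backend.

```python
from bs4 import BeautifulSoup


def insert_html(parent, position, html):
    """Hypothetical helper: parse `html` and insert every top-level node
    into `parent`, starting at index `position`."""
    fragment = BeautifulSoup(html, "html.parser")
    # Iterate over a copy: inserting a node elsewhere removes it
    # from fragment.contents.
    for offset, node in enumerate(list(fragment.contents)):
        parent.insert(position + offset, node)


main_soup = BeautifulSoup(
    "<html><div class='first'></div><div class='second'></div></html>",
    "html.parser",
)
first = main_soup.find(class_="first")
insert_html(first, 0, '<span class="a"></span><span class="b"></span>')

print(first)
print(main_soup.find(class_="second"))
```

Because only Tag and text nodes are inserted, never the BeautifulSoup wrapper object, later `find` calls on `main_soup` should keep working as in the update above.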