Beautiful Soup: extract all data between tags
Question
<p>
<strong>
<em>
Insurtech
</em>
</strong>
</p>
<p> .....Some data </p>
<p>
<strong>
<em>
Biometrics
</em>
</strong>
</p>
I tried this:
html_tags = soup.find_all('em')
for i in range(len(html_tags) - 1):
    start_tag = html_tags[i]
    end_tag = html_tags[i + 1]
    between_tag = (soup_str.split(str(start_tag)))[1].split(str(end_tag))[0]
    soup1 = BeautifulSoup(between_tag, 'html.parser')
I want all the data from the first p->strong->em tag up to the next p->strong->em tag. This is my sample data. Thanks in advance.
Recommended answer
s = '''<p>
<strong>
<em>
Insurtech
</em>
</strong>
</p>
<p> .....Some data </p>
<p>
<strong>
<em>
Biometrics
</em>
</strong>
</p>'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(s, 'html.parser')
>>> list(soup.stripped_strings)
['Insurtech', '.....Some data', 'Biometrics']
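stripped_strings returns every text fragment in one flat list, so it does not show which data belongs to which heading. If the goal is to collect the data that follows each p->strong->em heading, one possible approach is to walk the <p> tags in order and start a new group whenever a heading is found. This is only a sketch: the function name group_by_heading is made up for illustration, and it assumes that every heading looks like the sample (an <em> inside a <strong> inside a <p>) and that the data sits in plain <p> tags.

```python
from bs4 import BeautifulSoup

html = '''<p>
<strong>
<em>
Insurtech
</em>
</strong>
</p>
<p> .....Some data </p>
<p>
<strong>
<em>
Biometrics
</em>
</strong>
</p>'''


def group_by_heading(markup):
    """Map each p->strong->em heading to the text of the <p> tags after it.

    Hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(markup, 'html.parser')
    sections = {}
    current = None
    for p in soup.find_all('p'):
        strong = p.find('strong')
        heading = strong.find('em') if strong is not None else None
        if heading is not None:
            # A heading paragraph starts a new group.
            current = heading.get_text(strip=True)
            sections[current] = []
        elif current is not None:
            # A plain paragraph belongs to the most recent heading.
            sections[current].append(p.get_text(strip=True))
    return sections


sections = group_by_heading(html)
print(sections)
# {'Insurtech': ['.....Some data'], 'Biometrics': []}
```

Paragraphs that appear before the first heading are skipped, and a heading text that occurs twice would overwrite the earlier group, so adjust the sketch if the real page behaves differently.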