BeautifulSoup getting content behind multiple <div> levels
How can I get the time data behind two "divs" with BeautifulSoup?
<div>
  <div>
    6:00.00
  </div>
</div>
I've tried the following code:
import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.energystorageexchange.org/projects/2")
soup = BeautifulSoup(page.content, 'lxml')
# Note: "div.div" matches <div> elements whose class is "div", not a <div> nested in a <div>
rows = soup.select("div.div")
for r in rows:
    print(r)
but it doesn't work that easily.
The full HTML sample:
<div class='row'>
  <hr class='border zeropadding zeromargin'>
    <div class='col-md-6 zeropadding'>
      <label class='new_font'>Duration at Rated Power (HH:MM)</label>
    </div>
    <div class='col-md-6 new_font'>
      <div></div>
      <div>
        <div>
          6:00.00
        </div>
      </div>
    </div>
  </hr>
</div>
<div class='row'>
  <hr class='border zeropadding zeromargin'>
    <div class='col-md-6 zeropadding new_font'>
      <label class='new_font'>Weblink1</label>
    </div>
    <div class='col-md-6 new_font'>
      <div>
        <div class='show_value'>
          <a href="http://www.gillsonions.com/node/192" target='_new' class='boldbluelink'>http://www.gillsonions.com/node/192</a>
        </div>
      </div>
It's from https://www.energystorageexchange.org/projects/2
Thanks for any help.
Second question:
I would also like to capture size in kW from
<input id='size_in_kw' type='hidden' value='1500'>
I've tried this, but it seems to be incomplete:
value = soup.find('input', {'id': 'size_in_kw'}).get('value')
Solution
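To your second question:
# `item` and `output` come from surrounding code that is not shown in this excerpt
if "kW" in item.text:
    # Step up to the label's parent, move to its next sibling, and read the stripped text
    itemval = item.find_parent().find_next_sibling().text.strip()
    output.append(itemval)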
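For the first question, one possible approach is the same parent-then-sibling walk, started from the label instead of from the nested divs. The sketch below is not part of the original answer. It assumes the HTML returned by requests matches the sample in the question; `value_for_label` and `hidden_input_value` are hypothetical helper names, the markup is a trimmed inline copy of the sample (the <hr> wrapper is left out), and `html.parser` is used in place of `lxml` only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Trimmed inline copy of the sample markup from the question (assumption:
# the live page returns the same structure), so no network access is needed.
SAMPLE_HTML = """
<div class='row'>
  <div class='col-md-6 zeropadding'>
    <label class='new_font'>Duration at Rated Power (HH:MM)</label>
  </div>
  <div class='col-md-6 new_font'>
    <div></div>
    <div>
      <div>
        6:00.00
      </div>
    </div>
  </div>
</div>
<input id='size_in_kw' type='hidden' value='1500'>
"""


def value_for_label(soup, label_text):
    """Hypothetical helper: return the value shown next to a label, or None."""
    # Find the <label> whose text contains label_text.
    label = soup.find("label", string=lambda s: s is not None and label_text in s)
    if label is None:
        return None
    # The label's parent <div> is the left column; the value lives in the
    # next sibling <div>, however deeply it is nested inside that column.
    value_column = label.find_parent("div").find_next_sibling("div")
    if value_column is None:
        return None
    # get_text(strip=True) drops the whitespace around the nested text.
    return value_column.get_text(strip=True)


def hidden_input_value(soup, input_id):
    """Hypothetical helper: return an <input>'s value attribute, or None."""
    tag = soup.find("input", id=input_id)
    return tag.get("value") if tag is not None else None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
duration = value_for_label(soup, "Duration at Rated Power")
size_kw = hidden_input_value(soup, "size_in_kw")
print(duration)  # 6:00.00
print(size_kw)   # 1500
```

Both helpers return None rather than raising when the element is missing, which may be what makes the one-line `soup.find(...).get('value')` attempt feel incomplete: it fails with an AttributeError if the input is absent from the downloaded HTML.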