How to extract the strong elements inside a div tag
Problem description
I am new to web scraping and am using Python to scrape data. Can someone help me extract the data from the following markup?
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
My output should be LENGTH: 15 credits
Here is my code:
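from urllib.request import urlopen
from bs4 import BeautifulSoup

# Fetch and parse the page (this step was missing from the posted snippet)
html = urlopen("http://www.mastersindatascience.org/specialties/business-analytics/")
bsObj = BeautifulSoup(html, "html.parser")

length = bsObj.findAll("strong")
for leng in length:
    print(leng.text, leng.next_sibling)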
Output:
DELIVERY: Campus
LENGTH: 2 years
OFFERED BY: Olin Business School
But I would like to have only LENGTH.
Website: http://www.mastersindatascience.org/specialties/business-analytics/
Recommended answer
You should improve your code a bit to locate the strong element by its text:
soup.find("strong", text="LENGTH:").next_sibling
Or, to collect multiple lengths:
for length in soup.find_all("strong", text="LENGTH:"):
    print(length.next_sibling.strip())
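The lookup above can be tried without hitting the network. What follows is a minimal sketch, not the answerer's exact code: the sample markup is modeled on the question (the DELIVERY and OFFERED BY rows are reconstructed from the posted output), extract_lengths is a hypothetical helper name, and string= is used because the Beautiful Soup documentation describes it as the newer name for the text= argument (available from 4.4.0).

```python
from bs4 import BeautifulSoup, NavigableString

# Sample markup modeled on the question; not fetched from the real site.
html = """
<div class="dept"><strong>DELIVERY:</strong> Campus</div>
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
<div class="dept"><strong>OFFERED BY:</strong> Olin Business School</div>
"""

soup = BeautifulSoup(html, "html.parser")


def extract_lengths(soup):
    """Hypothetical helper: return the text following each <strong>LENGTH:</strong>."""
    results = []
    for strong in soup.find_all("strong", string="LENGTH:"):
        sibling = strong.next_sibling
        # Skip labels that are not followed by a plain text node.
        if isinstance(sibling, NavigableString):
            results.append(str(sibling).strip())
    return results


lengths = extract_lengths(soup)
for value in lengths:
    print(value)
```

The isinstance check is a defensive choice: if a label happens to be followed by another tag, or by nothing at all, calling .strip() on next_sibling directly would fail.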
Demo:
>>> import requests
>>> from bs4 import BeautifulSoup
>>>
>>> url = "http://www.mastersindatascience.org/specialties/business-analytics/"
>>> response = requests.get(url)
>>> soup = BeautifulSoup(response.content, "html.parser")
>>> for length in soup.find_all("strong", text="LENGTH:"):
...     print(length.next_sibling.strip())
...
33 credit hours
15 months
48 Credits
...
12 months
1 year
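The demo prints only the values, while the question asked for output in the form "LENGTH: 15 credits". One possible way to get label and value together is to read the text of the parent div. This is a sketch under the assumption that each label sits directly inside its div, as in the markup from the question; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Markup taken from the question; a single row is enough to show the idea.
html = '<div class="dept"><strong>LENGTH:</strong> 15 credits</div>'
soup = BeautifulSoup(html, "html.parser")

# For each matching <strong>, take the whole parent <div> text,
# joining the stripped pieces with a single space.
lines = [
    strong.parent.get_text(" ", strip=True)
    for strong in soup.find_all("strong", string="LENGTH:")
]

for line in lines:
    print(line)
```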