How to extract the strong elements inside a div tag

Question

I am new to web scraping and am using Python to scrape data. Can someone help me extract the data from this element?

<div class="dept"><strong>LENGTH:</strong> 15 credits</div>

My output should be: LENGTH: 15 credits

Here is my code:

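from urllib.request import urlopen
from bs4 import BeautifulSoup

# Fetch and parse the page
html = urlopen("http://www.mastersindatascience.org/specialties/business-analytics/")
bsObj = BeautifulSoup(html, "html.parser")

# Print every <strong> label together with the text node that follows it
length = bsObj.findAll("strong")
for leng in length:
    print(leng.text, leng.next_sibling)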

Output:

DELIVERY:  Campus
LENGTH:  2 years
OFFERED BY:  Olin Business School

But I would like to get only the LENGTH entry.

Website: http://www.mastersindatascience.org/specialties/business-analytics/

Recommended answer

You should improve your code a bit to locate the strong element by its text:

soup.find("strong", text="LENGTH:").next_sibling

Or, if the page contains several LENGTH entries:

for length in soup.find_all("strong", text="LENGTH:"):
    print(length.next_sibling.strip())
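
As a minimal offline sketch of the same idea, the snippet below parses an HTML fragment instead of fetching the live page, then joins the label with the text node that follows it. The LENGTH line is copied from the question; the DELIVERY line is an assumed neighbor added only to show that it is skipped. The variable names are illustrative. Note that `string=` is the newer spelling of the `text=` argument in Beautiful Soup 4.4.0 and later.

```python
from bs4 import BeautifulSoup

# LENGTH line from the question, plus one assumed unrelated field
html = """
<div class="dept"><strong>DELIVERY:</strong> Campus</div>
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Match the <strong> tag whose text is exactly "LENGTH:"
label = soup.find("strong", string="LENGTH:")

# next_sibling is the text node right after </strong>: " 15 credits"
value = label.next_sibling.strip()

result = f"{label.get_text()} {value}"
print(result)  # LENGTH: 15 credits
```

Because the match is exact, a label written as "Length:" or "LENGTH" without the colon would not be found, and `find` would return None.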

Demo:

>>> import requests
>>> from bs4 import BeautifulSoup
>>>
>>> url = "http://www.mastersindatascience.org/specialties/business-analytics/"
>>> response = requests.get(url)
>>> soup = BeautifulSoup(response.content, "html.parser")
>>> for length in soup.find_all("strong", text="LENGTH:"):
...     print(length.next_sibling.strip())
... 
33 credit hours
15 months
48 Credits
...
12 months
1 year
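
If a page also uses strong tags outside these blocks, one option is to scope the search to the div elements first. The following is only a sketch under stated assumptions: the helper name `field_values` is hypothetical, the sample HTML is made up for illustration, and it assumes each field sits in a div with class "dept" as in the question's snippet.

```python
from bs4 import BeautifulSoup, NavigableString


def field_values(soup, label, css_class="dept"):
    """Return the text following <strong>label</strong> in each matching div."""
    values = []
    for div in soup.find_all("div", class_=css_class):
        strong = div.find("strong", string=label)
        if strong is None:
            continue  # this div holds a different field
        sibling = strong.next_sibling
        # Guard against a missing or non-text sibling (e.g. another tag)
        if isinstance(sibling, NavigableString):
            values.append(sibling.strip())
    return values


# Made-up sample markup, modeled on the snippet in the question
html = """
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
<div class="dept"><strong>OFFERED BY:</strong> Olin Business School</div>
<div class="dept"><strong>LENGTH:</strong> 2 years</div>
<p><strong>LENGTH:</strong> not inside a dept div</p>
"""

soup = BeautifulSoup(html, "html.parser")
lengths = field_values(soup, "LENGTH:")
print(lengths)  # ['15 credits', '2 years']
```

The isinstance check avoids an AttributeError when a strong tag is the last child of its div, in which case next_sibling is None.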
