Pulling the href from a link when web scraping using Python

Views: 33 Posted: 2021/9/24 19:06:06 Tags: python html web-scraping


Problem description

I am scraping from this page: https://www.pro-football-reference.com/years/2018/week_1.htm

It is a list of game scores for American football. I want to open the link to the stats for the first game. The text displayed for that link says "Final". My code so far:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup


#assigning url
my_url = "https://www.pro-football-reference.com/years/2018/week_1.htm"

# opening up connection, grabbing the page
raw_page = uReq(my_url)
page_html = raw_page.read()
raw_page.close()

# html parsing
page_soup = soup(page_html,"html.parser")

#find all games on page
games = page_soup.findAll("div",{"class":"game_summary expanded nohover"})

link = games[0].find("td",{"class":"right gamelink"})
print(link)

When I run this, I receive the following output:

<a href="/boxscores/201809060phi.htm">Final</a>

How do I assign only the link target (i.e. the href value "/boxscores/201809060phi.htm") to a variable?
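One possible approach, sketched under assumptions: a BeautifulSoup tag exposes its HTML attributes like a dictionary, so the href can be read from the `<a>` tag with `.get("href")` (or `tag["href"]`). The `sample_html` string below is a hypothetical stand-in modelled on the output shown in the question, so the example runs without fetching the page; the real page markup may differ. The names `cell`, `anchor`, `href` and `full_url` are illustrative only.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

my_url = "https://www.pro-football-reference.com/years/2018/week_1.htm"

# Hypothetical markup modelled on the question's output; the live page may differ.
sample_html = """
<div class="game_summary expanded nohover">
  <table><tr>
    <td class="right gamelink"><a href="/boxscores/201809060phi.htm">Final</a></td>
  </tr></table>
</div>
"""

page_soup = BeautifulSoup(sample_html, "html.parser")

# Same lookups as in the question, written with find_all.
games = page_soup.find_all("div", {"class": "game_summary expanded nohover"})
cell = games[0].find("td", {"class": "right gamelink"})

# Step down to the <a> tag, then read its href attribute.
# .get() returns None instead of raising if the attribute is missing.
anchor = cell.find("a") if cell is not None else None
href = anchor.get("href") if anchor is not None else None

# The href is relative, so resolve it against the page URL before opening it.
full_url = urljoin(my_url, href) if href else None

print(href)
print(full_url)
```

If the link is guaranteed to exist, `cell.a["href"]` is a shorter equivalent; the `None` checks above are only there so a missing cell or link does not raise an exception.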
