Web scraping using Python and Beautiful Soup: error "'page' is not defined"

Question

I want to collect the betting rates from a betting site. After inspecting the page, I noticed that these rates are held in elements with the eventprice class. Following the explanation from here, I wrote this Python code using the BeautifulSoup module:

from bs4 import BeautifulSoup
import urllib.request
import re

url = "http://sports.williamhill.com/bet/fr-fr"

try:
    page = urllib.request.urlopen(url)
except:
    print("An error occured.")

soup = BeautifulSoup(page, 'html.parser')

regex = re.compile('eventprice')
content_lis = soup.find_all('button', attrs={'class': regex})
print(content_lis)

However, I get the following error:

"(...) line 12, in soup = BeautifulSoup(page, 'html.parser') NameError: name 'page' is not defined"

Recommended answer

If you print the exception details, you will see what is happening:

try:
    page = urllib.request.urlopen(url)
except Exception as e:
    print(f"An error occurred: {e}")

Output

An error occurred: HTTP Error 403: Forbidden
Traceback (most recent call last):
  File ".../main.py", line 12, in <module>
    soup = BeautifulSoup(page, 'html.parser')
NameError: name 'page' is not defined

urlopen() raises an exception, so the assignment to page never completes. The except block only prints a message and lets the script continue, so the next line fails with a NameError because page was never defined. In this case the underlying error is a 403, which means the server is refusing the request; you may need to add authentication in order to access this URL.
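One way to keep the script from reaching the parsing step without a page is to return early when the request fails. The sketch below is a minimal illustration, not the asker's code: the function name fetch_html and its opener parameter are hypothetical, and opener exists only so the function can be exercised without network access.

```python
import urllib.error
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the request fails.

    `fetch_html` and `opener` are hypothetical names used for this sketch.
    """
    try:
        response = opener(url)
    except urllib.error.URLError as e:
        # HTTPError (for example a 403) is a subclass of URLError,
        # so it is caught here as well.
        print(f"An error occurred: {e}")
        return None
    else:
        # This branch runs only when no exception was raised,
        # so `response` is guaranteed to be bound here.
        with response:
            return response.read()
```

A caller would then check the result before parsing: if fetch_html(url) returns None, skip the BeautifulSoup(...) call; otherwise pass the returned bytes to BeautifulSoup(html, 'html.parser').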

Update:

A 403 response means this URL cannot be accessed in the way that you are trying to access it.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
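To see how the request itself could be changed, the following sketch builds a urllib.request.Request with an explicit User-Agent header and turns an HTTPError into a short hint. It rests on an assumption: some servers reject the default User-Agent that urllib sends, but whether that is the cause for this site is not confirmed by the question, and a different header may still produce a 403. The names build_request and explain_http_error and the user-agent string are hypothetical.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent="example-scraper/0.1"):
    """Build a Request that carries an explicit User-Agent header.

    `build_request` and the default `user_agent` value are hypothetical.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def explain_http_error(error):
    """Return a short, human-readable hint for an HTTPError."""
    if error.code == 403:
        return "403 Forbidden: the server understood the request but refuses to serve it."
    if error.code == 401:
        return "401 Unauthorized: the server expects credentials."
    return f"HTTP {error.code}: {error.reason}"


# Usage (performs a real network request, so it is left commented out):
# try:
#     page = urllib.request.urlopen(build_request(url))
# except urllib.error.HTTPError as e:
#     print(explain_http_error(e))
```

If the server still answers 403 with a changed header, the site may simply not allow automated access, and its terms of use are the place to check.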
