Parsing a site with BeautifulSoup


Problem description

I'm trying to learn how to parse HTML with Python, and I'm currently stuck: soup.findAll returns an empty array, even though there are elements that should be found. Here is my code:

import requests
from bs4 import BeautifulSoup

headers = {"User-Agent": 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36'}
url = 'https://www.oddsportal.com/matches/tennis/20191114/'

# Fetch the raw HTML and look for the odds rows
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')

info = soup.findAll('tr', {'class': 'odd deactivate'})
print(info)  # prints [] -- the rows are not in the raw HTML

Thanks for your help.

Recommended answer

Apparently, the page only loads the "odds" part once it is called in a browser. So you could use Selenium and the Chrome driver.

Note that you need to download the Chrome driver and place it in your .../python/ directory. Make sure you choose a matching driver version, i.e. a Chrome driver version that matches the version of the Chrome browser you have installed.
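If you would rather not copy the driver into that directory, Selenium can also be pointed at the binary explicitly. The following is a minimal sketch assuming Selenium 3.x (which matches the chrome_options keyword used below); the driver path is only a placeholder:

from selenium import webdriver

# Placeholder -- replace with the actual location of the matching chromedriver binary
driver_path = '/path/to/chromedriver'

options = webdriver.ChromeOptions()
options.add_argument('log-level=3')

# Selenium 3.x accepts executable_path; Selenium 4 uses a Service object instead
browser = webdriver.Chrome(executable_path=driver_path, options=options)

With the driver in place, the full approach looks like this: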

from bs4 import BeautifulSoup
from selenium import webdriver

url = 'https://www.oddsportal.com/matches/tennis/20191114/'

# Launch Chrome through Selenium so the JavaScript that injects the odds rows actually runs
options = webdriver.ChromeOptions()
options.add_argument('log-level=3')
browser = webdriver.Chrome(chrome_options=options)  # newer Selenium releases use options=options

browser.get(url)

# Parse the rendered page source rather than the raw HTTP response
soup = BeautifulSoup(browser.page_source, "html.parser")
info = soup.findAll('tr', {'class': 'odd deactivate'})
print(info)

browser.quit()
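Because the odds table is injected by JavaScript, grabbing page_source immediately after browser.get() can still race the page load. A common refinement is to wait explicitly for the rows before parsing; below is a minimal sketch using Selenium's WebDriverWait (the 15-second timeout is an arbitrary choice):

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = 'https://www.oddsportal.com/matches/tennis/20191114/'

options = webdriver.ChromeOptions()
options.add_argument('log-level=3')
browser = webdriver.Chrome(chrome_options=options)
browser.get(url)

try:
    # Block until at least one odds row is present in the DOM (or give up after 15 s)
    WebDriverWait(browser, 15).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, 'tr.odd.deactivate'))
    )
except TimeoutException:
    print('Odds rows did not appear within 15 seconds')

soup = BeautifulSoup(browser.page_source, 'html.parser')
rows = soup.findAll('tr', {'class': 'odd deactivate'})
print(len(rows))

browser.quit()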

