BeautifulSoup find_all() returns no data

Question

I am very new to Python. My current project is scraping data from a betting website. What I want to scrape is the odds information on the page.

Here is my code:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'http://bet.hkjc.com/default.aspx?url=football/odds/odds_allodds.aspx&lang=CH&tmatchid=120653'

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

page_soup.findAll("div",{"class":"oddsAll"})

But the result is [], an empty list.

What should I do to make my code work?

Recommended answer

The page at the original URL does not contain the odds itself. It uses JavaScript to load a second page, and that second page holds the data, so the HTML returned by urlopen has no matching elements. I changed the URL to that inner page and updated tmatchid to a current match, 120998. I also changed div to table and used the correct class, tOdds.

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

# Request the inner page directly; its HTML contains the odds tables.
my_url = 'http://bet.hkjc.com/football/odds/odds_allodds.aspx?lang=CH&tmatchid=120998'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")
# find_all is the current name for findAll; match <table class="tOdds">.
tables = page_soup.find_all("table", {"class": "tOdds"})
for table in tables:
    print(table.text)

Output:

燕豪芬青年隊(主隊勝) 和 烏德勒支青年隊(客隊勝)   1.53 4.00 4.60 
  燕豪芬青年隊(主隊勝) 和 烏德勒支青年隊(客隊勝)   1.97 2.45 4.70 
  燕豪芬青年隊[-1](主隊勝) 和 烏德勒支青年隊[+1](客隊勝)   2.45 3.60 2.26 
  球數 大 細  [3/3.5]2.021.70
  球數 大 細   [1.5]2.191.60
     1.44    18.00    2.65   
  0 1 2 3 4 5 6 7+   18.00 6.60 4.10 3.65 4.50 6.70 11.00 14.00 
  單 雙   1.90 1.80 
  主 主 主 和 和 和 客 客 客   主 和 客 主 和 客 主 和 客   2.30 14.00 34.00 4.70 6.50 10.50 19.00 14.00 7.50 
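
Comparing the two URLs in this thread, the inner page's path appears to be carried in the url query parameter of the default.aspx wrapper. The sketch below rebuilds the inner URL on that assumption. The helper name inner_page_url is hypothetical, and the pattern is inferred only from the two URLs shown here, so treat it as an illustration rather than documented site behaviour.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def inner_page_url(wrapper_url):
    """Hypothetical helper: turn .../default.aspx?url=<path>&k=v into .../<path>?k=v.

    Assumes the wrapper keeps the inner page's path in a 'url' query
    parameter, as the two URLs in this thread suggest.
    """
    parts = urlsplit(wrapper_url)
    inner_path = None
    remaining = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "url" and inner_path is None:
            inner_path = value                # path of the page holding the data
        else:
            remaining.append((key, value))    # lang, tmatchid, ...
    if inner_path is None:
        return wrapper_url                    # not a wrapper URL; leave unchanged
    path = "/" + inner_path.lstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(remaining), ""))


wrapper = ('http://bet.hkjc.com/default.aspx'
           '?url=football/odds/odds_allodds.aspx&lang=CH&tmatchid=120653')
print(inner_page_url(wrapper))
# http://bet.hkjc.com/football/odds/odds_allodds.aspx?lang=CH&tmatchid=120653
```

The result has the same shape as the URL used in the answer; only the tmatchid differs, because the answer substituted a match that was current at the time.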

Update in response to a comment:

In this case you need the URL of the frame that shows the data. You can do something like this:

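import requests
from bs4 import BeautifulSoup

# Fetch the frame's own URL rather than the page that embeds it.
response = requests.get('http://football.hkjc.com/football/iframe/statistics/head-to-head/summary-iframe.aspx?ci=en-US')
soup = BeautifulSoup(response.content, 'lxml')  # the 'lxml' parser requires the lxml package
# A list of class names matches elements that have any one of them.
divs = soup.find_all('div', {'class': ['win', 'draw', 'lose']})
for div in divs:
    print(div.get_text())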

Output:

18/03/2018 Italian Division 1 : Benevento 1-2 Cagliari
18/02/2018 Italian Division 1 : Benevento 3-2 Crotone
05/02/2018 Italian Division 1 : Benevento 0-2 Napoli
06/01/2018 Italian Division 1 : Benevento 3-2 Sampdoria
30/12/2017 Italian Division 1 : Benevento 1-0 Chievo
18/12/2017 Italian Division 1 : Benevento 1-2 SPAL
03/12/2017 Italian Division 1 : Benevento 2-2 AC Milan
19/11/2017 Italian Division 1 : Benevento 1-2 Sassuolo
29/10/2017 Italian Division 1 : Benevento 1-5 Lazio
22/10/2017 Italian Division 1 : Benevento 0-3 Fiorentina
31/03/2018 Italian Division 1 : Inter Milan 3-0 Verona
20/02/2018 Italian Division 1 : Lazio 2-0 Verona
11/02/2018 Italian Division 1 : Sampdoria 2-0 Verona
28/01/2018 Italian Division 1 : Fiorentina 1-4 Verona
06/01/2018 Italian Division 1 : Napoli 2-0 Verona
23/12/2017 Italian Division 1 : Udinese 4-0 Verona
14/12/2017 Italian Cup : AC Milan 3-0 Verona
10/12/2017 Italian Division 1 : SPAL 2-2 Verona
30/11/2017 Italian Cup : Chievo 1-1 Verona
26/11/2017 Italian Division 1 : Sassuolo 0-2 Verona
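
If the frame's URL is not already known, one way to look for it is to list the src of every iframe or frame tag in the outer page. The sketch below does this with the standard library's HTMLParser so that it runs without extra packages. The FrameFinder class, the sample markup, and the example.com addresses are all made up for illustration. Frames that JavaScript creates at run time will not show up this way; for those, the browser's network tab is the more reliable place to look.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameFinder(HTMLParser):
    """Collect the absolute src of every <iframe> and <frame> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in ("iframe", "frame"):
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the outer page's URL.
                self.sources.append(urljoin(self.base_url, src))


# Made-up markup standing in for a downloaded outer page.
sample_html = """
<html><body>
  <h1>Head to head</h1>
  <iframe src="/football/iframe/example-iframe.aspx?ci=en-US"></iframe>
</body></html>
"""

finder = FrameFinder("http://football.example.com/page.aspx")
finder.feed(sample_html)
print(finder.sources)
```

With BeautifulSoup, find_all(['iframe', 'frame']) on the outer page's soup would return the same tags, and each tag's src is then available through tag.get('src').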
