Why isn't find_all() returning complete results?

Question

I'm trying to retrieve the four stats boxes on a Sports Reference page. The four boxes (two teams, basic and advanced stats for each) can all be found under "tfoot" elements. However, the following code returns only the basic stats boxes for the page:

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.sports-reference.com/cbb/boxscores/2016-11-11-villanova.html")

c = r.content
soup = BeautifulSoup(c, "html.parser")

boxes = soup.find_all("tfoot")
print(len(boxes))

What do I need to specify in my code to retrieve all four boxes?

Recommended answer

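Two of the tables are hidden inside HTML comments, so find_all() on the parsed page never sees them: the parser treats a comment as a single text node, not as markup. All four boxes can be extracted by also parsing every comment that contains "tfoot":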

import requests
from bs4 import BeautifulSoup, Comment

r = requests.get("https://www.sports-reference.com/cbb/boxscores/2016-11-11-villanova.html")
soup = BeautifulSoup(r.content, 'html.parser')

# Footers present in the regular markup (the basic stats boxes)
boxes = list(soup.find_all("tfoot"))

# Footers hidden inside HTML comments: parse each such comment as its own document
for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
    if 'tfoot' in comment:
        hidden_soup = BeautifulSoup(comment, 'html.parser')
        boxes.extend(hidden_soup.find_all("tfoot"))

# Collect the cell text of every footer row
data = []

for box in boxes:
    for tr in box.find_all('tr'):
        data.append([td.text for td in tr.find_all('td')])

for row in data:
    print(row)

This gives you the following data (shown as originally printed under Python 2; Python 3 omits the u prefixes):

[u'200', u'19', u'65', u'.292', u'13', u'33', u'.394', u'6', u'32', u'.188', u'4', u'7', u'.571', u'4', u'22', u'26', u'12', u'3', u'0', u'13', u'15', u'48']
[u'200', u'33', u'67', u'.493', u'18', u'26', u'.692', u'15', u'41', u'.366', u'7', u'12', u'.583', u'9', u'41', u'50', u'15', u'8', u'4', u'8', u'14', u'88']
[u'200', u'.351', u'.338', u'.492', u'.108', u'8.9', u'71.0', u'34.2', u'63.2', u'4.0', u'0.0', u'16.0', u'100.0', u'64.0', u'117.3']
[u'200', u'.605', u'.604', u'.612', u'.179', u'29.0', u'91.1', u'65.8', u'45.5', u'10.7', u'12.1', u'10.0', u'100.0', u'117.3', u'64.0']
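
If you want to check the behaviour without hitting the site, the same idea can be reproduced offline. The sketch below is only an illustration: SAMPLE_HTML is a made-up stand-in for the real page (one visible table, one commented out), and find_all_including_comments is a hypothetical helper name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical stand-in for the real page: one visible table, one commented out.
SAMPLE_HTML = """
<table id="basic"><tfoot><tr><td>200</td><td>19</td></tr></tfoot></table>
<!--
<table id="advanced"><tfoot><tr><td>200</td><td>.351</td></tr></tfoot></table>
-->
"""


def find_all_including_comments(html, tag_name):
    """Return matching tags from the live markup and from inside HTML comments."""
    soup = BeautifulSoup(html, "html.parser")
    found = list(soup.find_all(tag_name))
    # A comment is a plain text node to the parser, so re-parse it separately.
    for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
        if tag_name in comment:
            hidden_soup = BeautifulSoup(comment, "html.parser")
            found.extend(hidden_soup.find_all(tag_name))
    return found


# Plain find_all() only sees the table outside the comment.
visible_only = BeautifulSoup(SAMPLE_HTML, "html.parser").find_all("tfoot")
# The helper also picks up the table inside the comment.
everything = find_all_including_comments(SAMPLE_HTML, "tfoot")
rows = [[td.get_text() for td in box.find_all("td")] for box in everything]

print(len(visible_only), len(everything))
print(rows)
```

On this sample, plain find_all() finds one "tfoot" while the helper finds two, which mirrors the two-versus-four result on the real page. The substring check on the comment is a cheap filter, so it may also re-parse comments that merely mention the tag name in text.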
