Beautiful Soup: 'ResultSet' object has no attribute 'find_all'?

Question

I am trying to scrape a simple table using Beautiful Soup. Here is my code:

import requests
from bs4 import BeautifulSoup

url = 'https://gist.githubusercontent.com/anonymous/c8eedd8bf41098a8940b/raw/c7e01a76d753f6e8700b54821e26ee5dde3199ab/gistfile1.txt'
r = requests.get(url)

soup = BeautifulSoup(r.text)
table = soup.find_all(class_='dataframe')

first_name = []
last_name = []
age = []
preTestScore = []
postTestScore = []

for row in table.find_all('tr'):
    col = table.find_all('td')

    column_1 = col[0].string.strip()
    first_name.append(column_1)

    column_2 = col[1].string.strip()
    last_name.append(column_2)

    column_3 = col[2].string.strip()
    age.append(column_3)

    column_4 = col[3].string.strip()
    preTestScore.append(column_4)

    column_5 = col[4].string.strip()
    postTestScore.append(column_5)

columns = {'first_name': first_name, 'last_name': last_name, 'age': age, 'preTestScore': preTestScore, 'postTestScore': postTestScore}
df = pd.DataFrame(columns)
df

However, whenever I run it, I get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-116-a900c2872793> in <module>()
     14 postTestScore = []
     15 
---> 16 for row in table.find_all('tr'):
     17     col = table.find_all('td')
     18 

AttributeError: 'ResultSet' object has no attribute 'find_all'

I have read around a dozen StackOverflow questions about this error, and I cannot figure out what I am doing wrong.

Solution

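The table variable contains a ResultSet, which is a list of Tag objects, because find_all returns every match rather than a single element. You need to call find_all on one of its members (even though you know the list has only one member), not on the list itself.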

>>> type(table)
<class 'bs4.element.ResultSet'>
>>> type(table[0])
<class 'bs4.element.Tag'>
>>> len(table[0].find_all('tr'))
6
>>>
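
Putting that together, a corrected version of the loop might look like the sketch below. It is only an illustration under a few assumptions: the inline `html` string is a hypothetical stand-in for `r.text` (the real gist is not reproduced here), and `keys` and `cells` are names introduced for this example. It uses `soup.find`, which returns a single Tag instead of a ResultSet, and it also calls `row.find_all('td')` rather than `table.find_all('td')`, since searching the whole table on every iteration would return the same first cells each time. The parser is passed explicitly as "html.parser" to avoid the warning Beautiful Soup emits when none is given.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page fetched in the question (r.text).
# It assumes a pandas-style table whose row index is stored in <th> cells.
html = """
<table border="1" class="dataframe">
  <thead>
    <tr><th></th><th>first_name</th><th>last_name</th><th>age</th>
        <th>preTestScore</th><th>postTestScore</th></tr>
  </thead>
  <tbody>
    <tr><th>0</th><td>Jason</td><td>Miller</td><td>42</td><td>4</td><td>25</td></tr>
    <tr><th>1</th><td>Molly</td><td>Jacobson</td><td>52</td><td>24</td><td>94</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns one Tag (or None); find_all() would return a ResultSet.
table = soup.find(class_="dataframe")

keys = ["first_name", "last_name", "age", "preTestScore", "postTestScore"]
columns = {key: [] for key in keys}

for row in table.find_all("tr"):
    # Search inside the current row, not the whole table.
    cells = row.find_all("td")
    if len(cells) < len(keys):
        continue  # the header row has only <th> cells, so skip it
    for key, cell in zip(keys, cells):
        columns[key].append(cell.get_text(strip=True))

print(columns)
```

The resulting dict can be passed to pd.DataFrame(columns) as in the question, provided pandas is imported first (import pandas as pd), which the original snippet leaves out. Keeping find_all and indexing with table[0], as in the session above, would work equally well.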
