ResultSet object has no attribute 'get'


Problem description

Hi, I'm currently trying to scrape this SEC link, https://www.sec.gov/ix?doc=/Archives/edgar/data/1090727/000109072720000003/form8-kq42019earningsr.htm, with BeautifulSoup to get the link containing "UPS":

pressting = soup3.find_all("a", string="UPS")
linkkm = pressting.get('href')
print(linkkm)

But when I do this, I get this error:

Traceback (most recent call last):
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python36\SEC.py", line 55, in <module>
    print('Price: ' + str(edgar()))
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python36\SEC.py", line 46, in edgar
    linkkm = pressting.get('href')
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python36\lib\site-packages\bs4\element.py", line 2081, in __getattr__
    "ResultSet object has no attribute '%s'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?" % key
AttributeError: ResultSet object has no attribute 'get'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?

My expected result is to extract the href and then print it. Any help would be appreciated.

Recommended answer
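
The error message in the traceback already names the immediate cause: find_all() returns a ResultSet (a list of tags), and only an individual tag has .get(). Here is a minimal sketch of the difference, assuming a small hand-written HTML snippet (the hrefs are made up for illustration and are not taken from the SEC page). Note that string="UPS" only matches link text that is exactly "UPS".

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not the real SEC page.
html = """
<p>
  <a href="/exhibit991.htm">UPS</a>
  <a href="/exhibit992.htm">UPS</a>
  <a href="/other.htm">Other</a>
</p>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet, so read the href from each tag in turn.
matches = soup.find_all("a", string="UPS")
hrefs = [a.get("href") for a in matches]

# find() returns a single tag, or None when nothing matches.
first = soup.find("a", string="UPS")
first_href = first.get("href") if first is not None else None

# Calling .get() on the ResultSet itself reproduces the AttributeError.
try:
    matches.get("href")
except AttributeError as exc:
    error_message = str(exc)

print(hrefs)
print(first_href)
```

Fixing this alone is not enough for the page in the question, for the reason given next.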

The page is rendered dynamically by JavaScript after it loads, so the links you are looking for are not in the HTML until the page has been rendered. The requests module does not execute JavaScript, so the HTML it returns will not contain them.

To render the page first, you can use Selenium. Alternatively, you can use HTMLSession from the requests_html module to render it on the fly.

from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from bs4 import BeautifulSoup
import re
from time import sleep

# Run Firefox without opening a window
options = Options()
options.add_argument('--headless')
driver = webdriver.Firefox(options=options)

driver.get("https://www.sec.gov/ix?doc=/Archives/edgar/data/1090727/000109072720000003/form8-kq42019earningsr.htm")

# Give the page's JavaScript a moment to render the document
sleep(1)
soup = BeautifulSoup(driver.page_source, 'html.parser')

# Print the href of every link whose style attribute starts with "text"
for item in soup.find_all("a", style=re.compile("^text")):
    print(item.get("href"))

driver.quit()

Output:

https://www.sec.gov/Archives/edgar/data/1090727/000109072720000003/exhibit991-q42019earni.htm
https://www.sec.gov/Archives/edgar/data/1090727/000109072720000003/exhibit992-q42019finan.htm

However, if you only want the first URL:

url = soup.find("a", style=re.compile("^text")).get("href")
print(url)

Output:

https://www.sec.gov/Archives/edgar/data/1090727/000109072720000003/exhibit991-q42019earni.htm
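
For the HTMLSession alternative mentioned above, a rough sketch might look like the following. It assumes the third-party requests-html package is installed; the function names render_and_collect_links and filter_links and the sleep_seconds parameter are hypothetical helpers introduced here, not part of the original answer. It has not been run against the SEC page, so treat it as a starting point rather than a verified solution.

```python
def filter_links(links, needle):
    """Return the links whose URL contains `needle`, in sorted order."""
    return sorted(link for link in links if needle in link)


def render_and_collect_links(url, sleep_seconds=1):
    """Fetch `url`, run its JavaScript, and return the absolute links found."""
    # Third-party package (pip install requests-html); imported here so that
    # filter_links stays usable without it.
    from requests_html import HTMLSession

    session = HTMLSession()
    try:
        response = session.get(url)
        # render() loads the page in headless Chromium, which requests_html
        # downloads the first time it is needed.
        response.html.render(sleep=sleep_seconds)
        return set(response.html.absolute_links)
    finally:
        session.close()
```

Calling filter_links(render_and_collect_links(url), "exhibit") with the URL from the question would then keep only the links whose address contains "exhibit"; whether that returns the same two URLs as the Selenium version depends on how the page renders under Chromium.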
