Using BeautifulSoup to Search Through Yahoo Finance


Question

I'm trying to pull information from the 'Key Statistics' page for a ticker on Yahoo Finance (since this isn't supported in the Pandas library).

Example for AAPL:

from bs4 import BeautifulSoup
import requests

url = 'http://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')

enterpriseValue = soup.findAll('$ENTERPRISE_VALUE', attrs={'class': 'yfnc_tablehead1'}) #HTML tag for where enterprise value is located

print(enterpriseValue)

Thanks, Andy!

Question: This prints an empty list. How do I change my findAll to return 598.56B?

Recommended answer

The list that find_all returns is empty because that data is generated by a separate call, which isn't completed by just sending a GET request to that URL. If you open the Network tab in Chrome or Firefox, filter by XHR, and examine the requests and responses of each network action, you can find which URL you ought to be sending the GET request to.

In this case, it's https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?formatted=true&crumb=8ldhetOu7RJ&lang=en-US&region=US&modules=defaultKeyStatistics%2CfinancialData%2CcalendarEvents&corsDomain=finance.yahoo.com.

So, how do we recreate this? Simple:

import requests

# Request the JSON endpoint directly instead of the HTML page
r = requests.get('https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?formatted=true&crumb=8ldhetOu7RJ&lang=en-US&region=US&modules=defaultKeyStatistics%2CfinancialData%2CcalendarEvents&corsDomain=finance.yahoo.com')
data = r.json()
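
The long query string in that request can also be assembled from its parts, which makes it easier to swap the ticker or the list of modules. The following is only a sketch: the helper name build_quote_summary_url is hypothetical, the parameter names are copied from the URL in this answer, and the crumb value is the one shown above, which I am assuming may not be valid for other sessions.

```python
from urllib.parse import urlencode

# Base endpoint copied from the URL in the answer
BASE_URL = 'https://query2.finance.yahoo.com/v10/finance/quoteSummary/'


def build_quote_summary_url(ticker, modules, crumb=None):
    """Assemble a quoteSummary URL (hypothetical helper, not part of any library)."""
    params = {
        'formatted': 'true',
        'lang': 'en-US',
        'region': 'US',
        'modules': ','.join(modules),  # urlencode escapes ',' as %2C
        'corsDomain': 'finance.yahoo.com',
    }
    if crumb is not None:
        # Assumption: the crumb is optional here and may be session-specific
        params['crumb'] = crumb
    return BASE_URL + ticker + '?' + urlencode(params)


url = build_quote_summary_url(
    'AAPL',
    ['defaultKeyStatistics', 'financialData', 'calendarEvents'],
    crumb='8ldhetOu7RJ',
)
print(url)
```

The resulting string can be passed to requests.get in place of the hard-coded URL. The parameters come out in a different order than in the original URL, which should not matter for a query string.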

r.json() parses the JSON response into a dict. From there, navigate through the dict until you find the data you're after:

financial_data = data['quoteSummary']['result'][0]['defaultKeyStatistics']
enterprise_value_dict = financial_data['enterpriseValue']
print(enterprise_value_dict)
# {'fmt': '598.56B', 'raw': 598563094528, 'longFmt': '598,563,094,528'}
print(enterprise_value_dict['fmt'])
# 598.56B
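
Chained indexing like the lookup above raises an error if any level is missing, for example when the request fails or a ticker has no such statistic. A more defensive lookup might look like the sketch below. The helper name get_key_statistic is hypothetical, and the sample dict only mimics the response shape shown in this answer, so the code runs without a network call.

```python
def get_key_statistic(data, name, field='fmt'):
    """Return one field of a key statistic, or None if any level is missing.

    Hypothetical helper; assumes the response shape shown in the answer.
    """
    results = (data.get('quoteSummary') or {}).get('result') or []
    if not results:
        return None
    stats = results[0].get('defaultKeyStatistics') or {}
    value = stats.get(name) or {}
    return value.get(field)


# Sample payload shaped like the response above (values copied from the answer)
sample = {
    'quoteSummary': {
        'result': [{
            'defaultKeyStatistics': {
                'enterpriseValue': {
                    'fmt': '598.56B',
                    'raw': 598563094528,
                    'longFmt': '598,563,094,528',
                },
            },
        }],
    },
}

print(get_key_statistic(sample, 'enterpriseValue'))         # 598.56B
print(get_key_statistic(sample, 'enterpriseValue', 'raw'))  # 598563094528
print(get_key_statistic({'quoteSummary': {'result': None}}, 'enterpriseValue'))  # None
```

With a live response, the same call would be get_key_statistic(data, 'enterpriseValue'), where data is the dict returned by r.json().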
