Using BeautifulSoup to Search Through Yahoo Finance
Problem description
I'm trying to pull information from the 'Key Statistics' page for a ticker on Yahoo Finance (since this isn't supported in the pandas library).
Example for AAPL:
from bs4 import BeautifulSoup
import requests
url = 'http://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
enterpriseValue = soup.findAll('$ENTERPRISE_VALUE', attrs={'class': 'yfnc_tablehead1'}) #HTML tag for where enterprise value is located
print(enterpriseValue)
Thanks, Andy!
Question: This is printing an empty array. How do I change my findAll call to return 598.56B?
Recommended answer
Well, the reason find_all returns an empty list is that the data is loaded by a separate request, which a plain GET request to that URL does not trigger. If you open the Network tab in Chrome or Firefox, filter by XHR, and examine the request and response of each entry, you can find the URL you ought to be sending the GET request to.
In this case, the URL is:
https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?formatted=true&crumb=8ldhetOu7RJ&lang=en-US&region=US&modules=defaultKeyStatistics%2CfinancialData%2CcalendarEvents&corsDomain=finance.yahoo.com
So, how do we recreate this? Simple:
import requests

# URL copied from the browser's Network tab (XHR filter)
r = requests.get('https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?formatted=true&crumb=8ldhetOu7RJ&lang=en-US&region=US&modules=defaultKeyStatistics%2CfinancialData%2CcalendarEvents&corsDomain=finance.yahoo.com')
data = r.json()  # parse the JSON body into a dict
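The long query string above can also be assembled from named parameters, which makes it easier to swap the ticker or the list of modules. This is only a minimal sketch using the standard library: the helper name build_quote_summary_url is hypothetical, and the parameter values are simply those visible in the URL above. The crumb in particular was copied from one browser session and may not be valid in another.

```python
from urllib.parse import urlencode

BASE_URL = 'https://query2.finance.yahoo.com/v10/finance/quoteSummary/'


def build_quote_summary_url(ticker, modules, crumb):
    """Assemble the quoteSummary URL from its parts (hypothetical helper)."""
    params = {
        'formatted': 'true',
        'crumb': crumb,  # assumption: value copied from the browser request
        'lang': 'en-US',
        'region': 'US',
        # urlencode percent-encodes the commas as %2C, as in the URL above
        'modules': ','.join(modules),
        'corsDomain': 'finance.yahoo.com',
    }
    return BASE_URL + ticker + '?' + urlencode(params)


url = build_quote_summary_url(
    'AAPL',
    ['defaultKeyStatistics', 'financialData', 'calendarEvents'],
    crumb='8ldhetOu7RJ',
)
print(url)
```

The resulting string can be passed to requests.get in the same way as the hard-coded URL.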
Calling r.json() returns the JSON response as a dict. From there, navigate through the dict until you find the data you're after:
financial_data = data['quoteSummary']['result'][0]['defaultKeyStatistics']
enterprise_value_dict = financial_data['enterpriseValue']
print(enterprise_value_dict)
# {'fmt': '598.56B', 'raw': 598563094528, 'longFmt': '598,563,094,528'}
print(enterprise_value_dict['fmt'])
# 598.56B
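Indexing straight into the response raises a KeyError or IndexError if any level is missing, for example when a ticker has no value for a field. A more defensive lookup might look like the sketch below. The helper get_stat is hypothetical, and sample is a hand-built dict containing only the enterpriseValue entry printed above, not a full response.

```python
def get_stat(data, module, field, key='fmt'):
    """Look up data['quoteSummary']['result'][0][module][field][key].

    Returns None instead of raising if any level is missing or empty.
    """
    results = (data.get('quoteSummary') or {}).get('result') or []
    if not results:
        return None
    entry = (results[0].get(module) or {}).get(field) or {}
    return entry.get(key)


# Hand-built sample mirroring the structure shown above (not a real response)
sample = {
    'quoteSummary': {
        'result': [{
            'defaultKeyStatistics': {
                'enterpriseValue': {
                    'fmt': '598.56B',
                    'raw': 598563094528,
                    'longFmt': '598,563,094,528',
                }
            }
        }]
    }
}

print(get_stat(sample, 'defaultKeyStatistics', 'enterpriseValue'))         # 598.56B
print(get_stat(sample, 'defaultKeyStatistics', 'enterpriseValue', 'raw'))  # 598563094528
print(get_stat(sample, 'defaultKeyStatistics', 'noSuchField'))             # None
```

With a live response, the same call would be get_stat(data, 'defaultKeyStatistics', 'enterpriseValue'), assuming the response keeps the structure shown in this answer.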