Iteratively calling pandas DataReader
Question
I have a Python dict of stocks. I want to use its keys (the stock symbols, see below) in a for loop to create one pandas DataFrame per stock, each named after its symbol and filled with price/volume data from the pandas DataReader. I think I have a basic Python issue in the code below, because the only DataFrame that gets created is "stockName". Thanks for your help.
print(stocks.keys())
['TSO', 'WDC', 'EBIX', 'AAPL', 'GTAT', 'MSFT', 'BKE', 'VFSTX', 'ORCL', 'UIS', 'HSII', 'PETS', 'BBBY', 'RPXC', 'TZOO', 'DLB', 'SPLS', 'CHE', 'INTC', 'CF', 'GTN', 'FFIV', 'ATML', 'BAH', 'DHX', 'HRB', 'VIAB', 'LMT', 'NOC', 'VWO', 'ROST']
for stockName in stocks.keys():
    stockName = DataReader(stockName, "yahoo", datetime(2013,1,1), datetime(2013,8,1))
Recommended answer
If you're only iterating over stocks, you can pass stocks to DataReader directly:
DataReader(stocks, 'yahoo', datetime(2013, 1, 1), datetime(2013, 8, 1))
You don't need to iterate, since get_data_yahoo does that for you. You'll get back a Panel, which you can use like a dict of DataFrames. You don't even need to call stocks.keys(), because iterating over a dict yields its keys:
for key in dict(a=1, b=2, c=3):
    print(key)
will print
a
b
c
The result of the DataReader call looks like this:
In [3]: p = DataReader(stocks, 'yahoo', datetime.datetime(2013, 1, 1), datetime.datetime(2013, 8, 1))
In [4]: p
Out[4]:
<class 'pandas.core.panel.Panel'>
Dimensions: 6 (items) x 147 (major_axis) x 31 (minor_axis)
Items axis: Open to Adj Close
Major_axis axis: 2013-01-02 00:00:00 to 2013-08-01 00:00:00
Minor_axis axis: AAPL to WDC
If you want to access the stock symbols via attribute access, do:
In [7]: p.swapaxes('items', 'minor').AAPL
Out[7]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 147 entries, 2013-01-02 00:00:00 to 2013-08-01 00:00:00
Data columns (total 6 columns):
Open 147 non-null values
High 147 non-null values
Low 147 non-null values
Close 147 non-null values
Volume 147 non-null values
Adj Close 147 non-null values
dtypes: float64(6)
It's going to be much easier to manipulate the resulting Panel than to fill a dict and do something with that.
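For comparison, here is a minimal sketch of the dict-filling approach. The loop in the question ends up with a single variable because `stockName = DataReader(...)` rebinds the loop variable on every pass; it never creates variables named after the tickers. Storing each result under its symbol in a dict keeps all of them. Note that `fetch_prices` is a hypothetical stand-in for the DataReader call, returning made-up data so the snippet runs without network access, and the `stocks` dict here is a shortened, invented example.

```python
def fetch_prices(symbol):
    # Hypothetical stand-in for DataReader(symbol, "yahoo", start, end);
    # returns made-up data so the example is self-contained.
    return {"symbol": symbol, "close": [1.0, 2.0, 3.0]}

# Shortened, invented version of the dict from the question.
stocks = {"AAPL": "Apple", "MSFT": "Microsoft", "WDC": "Western Digital"}

# Rebinding the loop variable, as in the question, keeps only the last result.
for stock_name in stocks:
    stock_name = fetch_prices(stock_name)

# Keying each result by its symbol keeps all of them.
frames = {symbol: fetch_prices(symbol) for symbol in stocks}
print(sorted(frames))
```

After the first loop, `stock_name` holds only the data for the last symbol, while `frames["AAPL"]`, `frames["MSFT"]`, and so on are all available from the dict.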
There's all sorts of fun stuff you could do with the Panel. Here's the percent change grouped by metric, stock, and date:
In [127]: df = p.to_frame(filter_observations=False)
In [128]: res = df.stack().reset_index()
In [129]: res.columns = ['date', 'metric', 'stock', 'value']
In [130]: res.set_index('date').groupby(['metric', 'stock']).apply(lambda x: x.value.pct_change()).stack()
Out[130]:
metric stock date
Adj Close AAPL 2013-01-03 -0.013
2013-01-04 -0.028
2013-01-07 -0.006
2013-01-08 0.003
2013-01-09 -0.016
2013-01-10 0.012
2013-01-11 -0.006
2013-01-14 -0.036
2013-01-15 -0.032
2013-01-16 0.042
2013-01-17 -0.007
2013-01-18 -0.005
2013-01-22 0.010
2013-01-23 0.018
2013-01-24 -0.124
...
Volume WDC 2013-07-12 -0.083
2013-07-15 -0.179
2013-07-16 -0.302
2013-07-17 -0.168
2013-07-18 0.589
2013-07-19 0.003
2013-07-22 0.049
2013-07-23 0.526
2013-07-24 0.176
2013-07-25 0.616
2013-07-26 -0.363
2013-07-29 -0.357
2013-07-30 0.554
2013-07-31 -0.252
2013-08-01 -0.158
Length: 27010, dtype: float64
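Panel was removed in pandas 0.25, so on recent versions the session above will not run as written. A rough equivalent, sketched here under the assumption that you already have one DataFrame per symbol, is to concatenate them into a single DataFrame with two column levels and compute the percent change per column. The prices and volumes below are synthetic values invented for illustration, and the names `frames`, `wide`, and `long` are not from the original answer.

```python
import pandas as pd

# Synthetic data standing in for downloaded prices (values are made up).
dates = pd.date_range("2013-01-02", periods=4, freq="D", name="date")
frames = {
    "AAPL": pd.DataFrame({"Close": [100.0, 110.0, 99.0, 99.0],
                          "Volume": [10.0, 20.0, 10.0, 15.0]}, index=dates),
    "WDC": pd.DataFrame({"Close": [50.0, 55.0, 55.0, 66.0],
                         "Volume": [4.0, 2.0, 3.0, 6.0]}, index=dates),
}

# Concatenate along columns; the dict keys become the outer column level.
wide = pd.concat(frames, axis=1)
wide.columns.names = ["stock", "metric"]

# Percent change down each (stock, metric) column.
pct = wide.pct_change()

# Move the column levels into the index to get one long Series
# indexed by (stock, metric, date); drop the undefined first day.
long = pct.unstack().dropna()
print(long)
```

The result has the same long shape as the output above, except that the index levels come out as stock, metric, date rather than metric, stock, date.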
The sky's the limit with pandas!