Create rolling covariance matrix in pandas
Question
I am trying to create a set of rolling covariance matrices on financial data (window size = 60). Here returns is a 125x3 DataFrame.
import pandas as pd
roll_rets = returns.rolling(window=60)
Omega = roll_rets.cov()
Omega is a 375x3 data frame with what looks like a multi-index - i.e. there are 3 values for each timestamp.
What I actually want this to return is a set of 66 3x3 covariance matrices (i.e. one for each period), but I can't work out how to iterate over returns correctly to do this. I think I'm missing something obvious. Thanks.
Recommended answer
Firstly: a MultiIndex DataFrame is an iterable object. (Try bool(pd.DataFrame.__iter__).) There are several StackOverflow questions on iterating through the sub-frames of a MultiIndex DataFrame, if you are interested.
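One caveat: iterating over a DataFrame directly yields its column labels, not sub-frames. To walk the per-date 3x3 blocks, one option is to group on the first index level. The following is a minimal sketch, assuming synthetic random returns; the seed and the names returns, omega and matrices are illustrative, not from the question.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the 125x3 returns frame (assumption: random data).
rng = np.random.default_rng(0)
returns = pd.DataFrame(rng.standard_normal((125, 3)),
                       index=pd.date_range('2010-01-01', periods=125),
                       columns=list('abc'))

# Rolling covariance: a (date, column) MultiIndex frame; dropna() removes
# the first 59 end dates, whose windows are not yet full.
omega = returns.rolling(60).cov().dropna()

# Group on index level 0 (the window end date); each group is one 3x3 block.
matrices = {date: block.to_numpy() for date, block in omega.groupby(level=0)}
```

Each key of matrices is a window end date, so matrices[returns.index[-1]] is the covariance matrix of the last 60 rows.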
But to your question directly, here is a dict: the keys are the (end) dates, and each value is a 3x3 NumPy array.
import pandas as pd
import numpy as np

# Synthetic 125x3 returns; rolling(60).cov() gives one 3x3 block per end date
Omega = (pd.DataFrame(np.random.randn(125, 3),
                      index=pd.date_range('1/1/2010', periods=125),
                      columns=list('abc'))
         .rolling(60)
         .cov()
         .dropna())  # this will get you to 66 windows instead of 125 with NaNs

# Level 0 repeats each end date once per column, so take the unique dates
# (equivalently, the index of your base returns from the 60th row onward)
dates = Omega.index.get_level_values(0).unique()
d = {date: Omega.loc[date].values for date in dates}
Is this efficient? No, not very. You are creating a separate NumPy array for each value of the dict, and each array carries its own overhead (dtype, etc.). The DataFrame as it is now is arguably well-suited for your purpose. But one other solution is to create a single NumPy array by expanding the ndim of Omega.values:
Omega.values.reshape(66, 3, 3)
Here each element is a matrix (again, easily iterable, but loses the date indexing that you had in your DataFrame).
Omega.values.reshape(66, 3, 3)[-1] # last matrix/final date
Out[29]:
array([[ 0.80865977, -0.06134767,  0.04522074],
       [-0.06134767,  0.67492558, -0.12337773],
       [ 0.04522074, -0.12337773,  0.72340524]])
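To avoid hard-coding 66 and to keep a way back from the array to the dates, one possible sketch is to reshape with -1 and hold the unique end dates next to the array. It assumes synthetic random returns again; the names cube, dates and pos are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the 125x3 returns frame (assumption: random data).
rng = np.random.default_rng(0)
returns = pd.DataFrame(rng.standard_normal((125, 3)),
                       index=pd.date_range('2010-01-01', periods=125),
                       columns=list('abc'))

omega = returns.rolling(60).cov().dropna()

# Let NumPy infer the number of windows instead of hard-coding 66.
n_assets = returns.shape[1]
cube = omega.to_numpy().reshape(-1, n_assets, n_assets)

# One end date per matrix, in the same order as the first axis of cube.
dates = omega.index.get_level_values(0).unique()

# Look up a matrix by date: map the date to its position along axis 0.
pos = dates.get_loc(returns.index[-1])
last = cube[pos]
```

Here cube[i] is the covariance matrix for the window ending on dates[i], so the date lookup costs one extra index operation while the data stays in a single contiguous array.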