How to iterate over columns of pandas dataframe to run regression

Question

I'm sure this is simple, but as a complete newbie to python, I'm having trouble figuring out how to iterate over variables in a pandas dataframe and run a regression with each.

Here is what I'm doing:

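import pandas_datareader.data as web  # assumed source of `web`; the original post does not show its imports
import statsmodels.api as sm
from pandas import DataFrame

all_data = {}
for ticker in ['FIUIX', 'FSAIX', 'FSAVX', 'FSTMX']:
    all_data[ticker] = web.get_data_yahoo(ticker, '1/1/2010', '1/1/2015')

# One column of adjusted closing prices per ticker, then period-over-period returns
prices = DataFrame({tic: data['Adj Close'] for tic, data in all_data.items()})
returns = prices.pct_change()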

I know I can run a regression like this:

regs = sm.OLS(returns.FIUIX,returns.FSTMX).fit()

But suppose I want to do this for each column in the dataframe. In particular, I want to regress FIUIX on FSTMX, then FSAIX on FSTMX, and then FSAVX on FSTMX. After each regression I want to store the residuals.

I've tried various versions of the following, but I must be getting the syntax wrong:

resids = {}
for k in returns.keys():
    reg = sm.OLS(returns[k],returns.FSTMX).fit()
    resids[k] = reg.resid

I think the problem is I don't know how to refer to the returns column by key, so returns[k] is probably wrong.

Any guidance on the best way to do this would be much appreciated. Perhaps there's a common pandas approach I'm missing.

Recommended answer

for column in df:
    print(df[column])
