How to apply OLS from statsmodels to groupby


Problem description

I am running OLS on products by month. This works fine for a single product, but my dataframe contains many products, and if I create a groupby object, OLS gives an error.

linear_regression_df:
  product_desc  period_num  TOTALS
0    product_a           1      53
3    product_a           2      52
6    product_a           3      50
1    product_b           1      44
4    product_b           2      43
7    product_b           3      41
2    product_c           1      36
5    product_c           2      35
8    product_c           3      34


from pandas import DataFrame, Series
import statsmodels.api as sm    

linear_regression_grouped = linear_regression_df.groupby(['product_desc'])
X = linear_regression_grouped['period_num'] 
y = linear_regression_grouped['TOTALS']

model = sm.OLS(y, X)
results = model.fit()

I get this error on the sm.OLS() line:

ValueError: unrecognized data structures: <class 'pandas.core.groupby.SeriesGroupBy'>

So how can I go through my dataframe and apply sm.OLS() for each product_desc?

Recommended answer

You could do something like this:

import pandas as pd
import statsmodels.api as sm

# Fit a separate model for each product.
for products in linear_regression_df.product_desc.unique():
    tempdf = linear_regression_df[linear_regression_df.product_desc == products]
    X = tempdf['period_num']
    y = tempdf['TOTALS']

    # Note: no constant is added to X, so this fits a line through the origin.
    model = sm.OLS(y, X)
    results = model.fit()

    print(results.params)  # Or whatever summary info you want
