Using data.table to create a column of regression coefficients


Question

I'm struggling with what seems like it should be a simple extension of a previous question I asked here.

I'm trying to aggregate over (a) a range of dates and (b) a factor variable. Sample data might be:

Brand  Day  Rev       RVP
A      1    2535.00   195.00
B      1    1785.45   43.55
C      1    1730.87   32.66
A      2    920.00    230.00
B      2    248.22    48.99
C      3    16466.00  189.00
A      1    2535.00   195.00
B      3    1785.45   43.55
C      3    1730.87   32.66
A      4    920.00    230.00
B      5    248.22    48.99
C      4    16466.00  189.00

Thanks to helpful advice, I've figured out how to find the mean revenue for each brand on each day using data.table:

new_df<-df[,(mean(Rev)), by=list(Brand,Day)]

Now I'd like to create a new table with a column listing the coefficient estimate from an OLS regression of Rev on Day for each brand. I tried the following:

new_df2<-df[,(lm(Rev~Day)), by=list(Brand)]

That doesn't seem quite right. Thoughts? I'm sure I've missed something obvious.

Recommended answer

I think this is what you want:

new_df2<-df[,(lm(Rev~Day)$coefficients[["Day"]]), by=list(Brand)]

lm returns a full model object, so you need to drill down into it to get a single value from each group that can be turned into a column.
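As a rough illustration of that point, here is a minimal sketch that rebuilds the sample data from the question and extracts the coefficients per brand. The column name slope and the object names slopes and coefs are my own choices, not part of the original answer, and coef() is used in place of $coefficients as an equivalent accessor. Without a name, the result column from the one-liner above would come back as V1.

```r
library(data.table)

# Sample data from the question, rebuilt as a data.table
df <- data.table(
  Brand = rep(c("A", "B", "C"), times = 4),
  Day   = c(1, 1, 1, 2, 2, 3, 1, 3, 3, 4, 5, 4),
  Rev   = c(2535.00, 1785.45, 1730.87, 920.00, 248.22, 16466.00,
            2535.00, 1785.45, 1730.87, 920.00, 248.22, 16466.00),
  RVP   = c(195.00, 43.55, 32.66, 230.00, 48.99, 189.00,
            195.00, 43.55, 32.66, 230.00, 48.99, 189.00)
)

# One row per brand: the slope of Rev on Day, stored in a named column
# ("slope" is an illustrative name)
slopes <- df[, .(slope = coef(lm(Rev ~ Day))[["Day"]]), by = Brand]

# Intercept and slope together: as.list() turns the named coefficient
# vector into one column per coefficient
coefs <- df[, as.list(coef(lm(Rev ~ Day))), by = Brand]

print(slopes)
print(coefs)
```

If more than the point estimates are needed (standard errors, for example), the same pattern should apply to other single values pulled out of the fitted model, as long as each group returns a list of the same shape.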
