Using data.table to create a column of regression coefficients
Question
I'm struggling with what seems like it should be a simple extension of a previous question I'd asked here.
I'm trying to aggregate over (a) a range of dates and (b) a factor variable. Sample data might be:
Brand Day Rev RVP
A 1 2535.00 195.00
B 1 1785.45 43.55
C 1 1730.87 32.66
A 2 920.00 230.00
B 2 248.22 48.99
C 3 16466.00 189.00
A 1 2535.00 195.00
B 3 1785.45 43.55
C 3 1730.87 32.66
A 4 920.00 230.00
B 5 248.22 48.99
C 4 16466.00 189.00
Thanks to helpful advice, I've figured out how to find the mean revenue for brands over a period of days using data.table:
new_df <- df[, mean(Rev), by = list(Brand, Day)]
Now, I'd like to create a new table where there is a column listing the coefficient estimate from an OLS regression of Rev by Day for each brand. I tried to do this as follows:
new_df2<-df[,(lm(Rev~Day)), by=list(Brand)]
That doesn't seem quite right. Thoughts? I'm sure it's something obvious I've missed.
Recommended answer
I think this is what you want:
new_df2 <- df[, lm(Rev ~ Day)$coefficients[["Day"]], by = list(Brand)]
lm returns a full model object, so you need to drill down into it to get a single value from each group that can be turned into a column.
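As a minimal sketch of the idea above, the following rebuilds the sample data from the question as a data.table and extracts one slope per brand. It uses coef(), the standard extractor for fitted-model coefficients, in place of $coefficients; the two should give the same values here. The column name slope and the object names slopes and coefs are illustrative choices, not part of the original answer.

```r
library(data.table)

# Sample data from the question (Brand, Day, Rev), in the same row order
df <- data.table(
  Brand = rep(c("A", "B", "C"), times = 4),
  Day   = c(1, 1, 1, 2, 2, 3, 1, 3, 3, 4, 5, 4),
  Rev   = c(2535.00, 1785.45, 1730.87, 920.00, 248.22, 16466.00,
            2535.00, 1785.45, 1730.87, 920.00, 248.22, 16466.00)
)

# One slope per brand: fit the model within each group and keep
# only the coefficient on Day, giving the column an explicit name
slopes <- df[, .(slope = coef(lm(Rev ~ Day))[["Day"]]), by = Brand]

# Intercept and slope together: returning a named list from j
# makes each coefficient its own column
coefs <- df[, as.list(coef(lm(Rev ~ Day))), by = Brand]

print(slopes)
print(coefs)
```

Because j returns a single number per group in the first call, data.table can stack the results into one column. In the second call, as.list() turns the named coefficient vector into a list, so the columns are named (Intercept) and Day.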