data.table vs plyr regression output

Question

The data.table package is very helpful in terms of speed. But I am having trouble actually using the output from a linear regression. Is there an easy way to get the data.table output to be as pretty/useful as that from the plyr package? Below is an example. Thank you!

library('data.table');
library('plyr');

REG <- data.table(ID=c(rep('Frank',5),rep('Tony',5),rep('Ed',5)), y=rnorm(15), x=rnorm(15), z=rnorm(15));
REG;

ddply(REG, .(ID), function(x) coef(lm(y ~ x + z, data=x)));

REG[, coef(lm(y ~ x + z)), by=ID];

The data.table coefficient estimates are output in a single column whereas the plyr/ddply coefficient estimates are output in multiple and nicely labeled columns.
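To make the single-column behavior concrete, here is a minimal sketch that keeps the long layout but carries the coefficient names along in a second column. The seed and the column names `term` and `estimate` are assumptions chosen for illustration; they are not part of the original question.

```r
library(data.table)

set.seed(42)  # hypothetical seed, only so the sketch is reproducible
REG <- data.table(ID = rep(c("Frank", "Tony", "Ed"), each = 5),
                  y = rnorm(15), x = rnorm(15), z = rnorm(15))

# Fit once per group; return the names and the values as two columns,
# so each ID contributes one row per coefficient.
long <- REG[, {
  cf <- coef(lm(y ~ x + z))
  list(term = names(cf), estimate = unname(cf))  # illustrative column names
}, by = ID]

long
```

This still fits each model only once per group; the labels are kept, but the layout stays long rather than wide.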

I know I can run the regression three times with data.table but that seems really inefficient. I could be wrong, though.

REG[, list(Intercept=coef(lm(y ~ x + z))[1],
           x        =coef(lm(y ~ x + z))[2],
           z        =coef(lm(y ~ x + z))[3]), by=ID];


Recommended answer

Try:

> REG[, as.list(coef(lm(y ~ x + z))), by=ID];
        ID (Intercept)           x         z
[1,] Frank  -0.2928611  0.07215896  1.835106
[2,]  Tony   0.9120795 -1.11153056  2.041260
[3,]    Ed   1.0498359  5.77131778 -1.253741

I have the nagging feeling that this question was asked less than a week ago, but I don't think I arrived at this approach when I tried it, and I don't remember that any answer was this compact.

Oh, there it is, on r-help. Matthew can comment on the rightfulness of this if he wants. I guess the message is that functions returning lists will not have dimensions dropped. The interesting thing was that using list(coef(lm(...))) did not succeed in the manner we hoped.
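The difference between the two calls can be seen in base R before data.table is involved. The following is a small sketch, assuming the same simulated data as in the question (the seed is an addition, used only for reproducibility): `as.list()` splits the named coefficient vector into one list element per coefficient, while `list()` wraps the whole vector in a single element.

```r
library(data.table)

set.seed(1)  # hypothetical seed, only so the sketch is reproducible
REG <- data.table(ID = rep(c("Frank", "Tony", "Ed"), each = 5),
                  y = rnorm(15), x = rnorm(15), z = rnorm(15))

# Coefficients for one group, as lm() returns them: a named numeric vector.
cf <- coef(lm(y ~ x + z, data = REG[ID == "Frank"]))

wide <- as.list(cf)  # three named elements, one per coefficient
one  <- list(cf)     # a single unnamed element holding the whole vector

length(wide)  # 3
length(one)   # 1

# In j, each list element becomes a column, so as.list() yields one row
# per ID with one labeled column per coefficient.
res <- REG[, as.list(coef(lm(y ~ x + z))), by = ID]
names(res)
```

Because `list(cf)` has only one element, it can describe only one column, which is consistent with the observation that it did not produce the wide layout.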
