Add Column of Predicted Values to Data Frame with dplyr

Problem description

I have a data frame with a column of models, and I am trying to add a column of predicted values to it. A minimal example:

library(dplyr)

exampleTable <- data.frame(x = c(1:5, 1:5),
                           y = c((1:5) + rnorm(5), 2*(5:1)),
                           groups = rep(LETTERS[1:2], each = 5))

models <- exampleTable %>% group_by(groups) %>% do(model = lm(y ~ x, data = .))
exampleTable <- left_join(tbl_df(exampleTable), models)

estimates <- exampleTable %>% rowwise() %>% do(Est = predict(.$model, newdata = .["x"]))

How can I add a column of numeric predictions to exampleTable? I tried using mutate to add the column directly to the table, without success.

> exampleTable <- exampleTable %>% rowwise() %>% mutate(data.frame(Pred = predict(.$model, newdata = .["x"])))
Error: no applicable method for 'predict' applied to an object of class "list"

For now I use bind_cols to add the estimates to exampleTable, but I am looking for a better solution.

estimates <- exampleTable %>% rowwise() %>% do(data.frame(Pred = predict(.$model, newdata = .["x"])))
exampleTable <- bind_cols(exampleTable, estimates)

How can this be done in a single step?

Solution

For the record, this is painless in data.table:

library(data.table)
setDT(exampleTable)


exampleTable[ , estimates :=
               predict(lm(y ~ x), data.frame(x)),
             by = groups]

> exampleTable
    x          y groups  estimates
 1: 1  0.3123549      A  0.6826629
 2: 2  2.7636593      A  1.8297796
 3: 3  1.7771181      A  2.9768963
 4: 4  5.2031623      A  4.1240130
 5: 5  4.8281869      A  5.2711297
 6: 1 10.0000000      B 10.0000000
 7: 2  8.0000000      B  8.0000000
 8: 3  6.0000000      B  6.0000000
 9: 4  4.0000000      B  4.0000000
10: 5  2.0000000      B  2.0000000

If you're sold on data.table's clarity as I was, check out the intro vignettes!

Also, you don't really need to group by groups. Just include groups as an interaction term in a single model. If I recall correctly, that's the proper approach for getting correct standard errors anyway:

exampleTable[ , estimates2 :=
               predict(lm(y ~ x * factor(groups)),
                       data.frame(x, groups))]
> exampleTable[ , all.equal(estimates, estimates2)]
[1] TRUE
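
Since the question asks about dplyr specifically, the same per-group fit can probably be written as a single grouped mutate. This is only a minimal sketch and not part of the original answer: it assumes a reasonably recent dplyr, refits the model inside mutate rather than using the stored model column, and the set.seed call and the name result are added here purely for illustration.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the noisy y values are reproducible
exampleTable <- data.frame(x = c(1:5, 1:5),
                           y = c((1:5) + rnorm(5), 2 * (5:1)),
                           groups = rep(LETTERS[1:2], each = 5))

# Fit one lm per group and predict on that group's own x values.
# Inside a grouped mutate(), x and y refer to the current group's rows only.
result <- exampleTable %>%
  group_by(groups) %>%
  mutate(Pred = as.numeric(predict(lm(y ~ x), data.frame(x = x)))) %>%
  ungroup()

print(result)
```

The original mutate attempt most likely failed because, inside mutate, the `.` placeholder is bound by the pipe to the whole table rather than to the current row, so `.$model` is the entire list column. Referring to columns by bare name, as above, avoids that.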
