How to merge two linear regression prediction models (one per data frame subset) into one column of the data frame
I would like to build two linear regression models, each based on a subset of the dataset, and then have one column that contains the predicted values for each subset. Here is my example data frame:
dat <- read.table(text = " cats birds wolfs snakes
0 3 8 7
1 3 8 7
1 1 2 3
0 1 2 3
0 1 2 3
1 6 1 1
0 6 1 1
1 6 1 1 ",header = TRUE)
First I built two models:
# the first model is wolfs ~ snakes where cats == 0
f0 <- lm(wolfs ~ snakes, data = dat, subset = dat$cats == 0)
# the second model is wolfs ~ snakes where cats == 1
f1 <- lm(wolfs ~ snakes, data = dat, subset = dat$cats == 1)
I then generated predictions from each model:
f0_predict<-predict(f0,data=dat,subset=dat$cats==1,type='response')
f1_predict<-predict(f1,data=dat,subset=dat$cats==0,type='response')
This works fine, but I can't find a way to insert the predictions back into the original data frame so that rows where cats==0 get the prediction from the cats==0 model and rows where cats==1 get the prediction from the cats==1 model, all in a single column named full_prediction. For example, the output should look like this (with pseudo prediction values):
cats birds wolfs snakes full_prediction
0 3 8 7 0.6
1 3 8 7 0.5
1 1 2 3 0.4
0 1 2 3 0.3
0 1 2 3 0.3
1 6 1 1 0.7
0 6 1 1 0.1
1 6 1 1 0.7
If you look at rows 6-8, you can see that full_prediction is 0.7 where cats==1 and 0.1 where cats==0. Any idea how to do this?
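Use split and unsplit:
dat.l <- split(dat, dat$cats)
dat.l <- lapply(dat.l, function(x) {
  # fit a separate model on each subset
  mod <- lm(wolfs ~ snakes, data = x)
  # predict() for lm objects takes `newdata`; a `data` argument is silently ignored
  x$full_prediction <- predict(mod, newdata = x, type = "response")
  return(x)
})
# reassemble the pieces in the original row order
unsplit(dat.l, dat$cats)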
Output:
cats birds wolfs snakes full_prediction
1 0 3 8 7 7.5789474
2 1 3 8 7 7.6666667
3 1 1 2 3 3.0000000
4 0 1 2 3 2.6315789
5 0 1 2 3 2.6315789
6 1 6 1 1 0.6666667
7 0 6 1 1 0.1578947
8 1 6 1 1 0.6666667
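If the two models from the question (f0 and f1) are already fitted, the column can also be filled by logical indexing. This is a minimal sketch rather than part of the original answer; the helper name is0 is introduced here for illustration. Note that predict() for lm objects takes its data through newdata; data and subset arguments are swallowed by ... and ignored, so the calls in the question return the fitted values of each model's own training rows.

```r
dat <- read.table(text = "cats birds wolfs snakes
0 3 8 7
1 3 8 7
1 1 2 3
0 1 2 3
0 1 2 3
1 6 1 1
0 6 1 1
1 6 1 1", header = TRUE)

# one model per subset, as in the question
f0 <- lm(wolfs ~ snakes, data = dat, subset = cats == 0)
f1 <- lm(wolfs ~ snakes, data = dat, subset = cats == 1)

# is0 is a hypothetical helper: TRUE for rows belonging to the cats == 0 subset
is0 <- dat$cats == 0

# start with an empty column, then fill each subset from its own model
dat$full_prediction <- NA_real_
dat$full_prediction[is0]  <- predict(f0, newdata = dat[is0, ])
dat$full_prediction[!is0] <- predict(f1, newdata = dat[!is0, ])

dat
```

Because the assignment goes through the logical index, the original row order is preserved and no merge step is needed.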
A dplyr solution would be:
library(dplyr)
dat %>%
  group_by(cats) %>%
  do({
    # `.` is the data for the current group
    mod <- lm(wolfs ~ snakes, data = .)
    pred <- predict(mod)
    data.frame(., pred)
  })
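In dplyr 1.0.0 and later, do() is marked as superseded, so the same idea can be written with group_modify(). The following is a sketch under that assumption, not the original answer's code; the names result, x and key are chosen here for illustration. As with do(), the rows come back ordered by group rather than in the original order.

```r
library(dplyr)

dat <- read.table(text = "cats birds wolfs snakes
0 3 8 7
1 3 8 7
1 1 2 3
0 1 2 3
0 1 2 3
1 6 1 1
0 6 1 1
1 6 1 1", header = TRUE)

result <- dat %>%
  group_by(cats) %>%
  # x holds the rows of one group (without the cats column), key holds the group value
  group_modify(function(x, key) {
    mod <- lm(wolfs ~ snakes, data = x)
    x$full_prediction <- unname(predict(mod, newdata = x))
    x
  }) %>%
  ungroup()

result
```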