R, automated loop of linear regressions using same IVs on different DVs to store coefficients


Problem description

Mtcars has 32 observations of 11 variables. Assume that "mpg", "drat", "qsec" are the dependent variables of interest. Assume that "cyl" and "hp" are the independent variables for model type 1 and "disp" is the independent variable for model type 2. I want to automate some regressions but can't do Step (2) below.

What do I want to do?

On my actual dataframe, I have many more dependent variables of interest than independent variables.

  1. I want to run "lm" or "glm" for each of the following:

  • mpg ~ cyl + hp,
  • drat ~ cyl + hp,
  • qsec ~ cyl + hp,
  • mpg ~ disp,
  • drat ~ disp, and
  • qsec ~ disp

  2. This seems to be my biggest current problem. I want to build a new data frame (estimates) that holds each coefficient's Estimate and Pr(>|t|), e.g. (assuming it were filled out):

IV \ DV   mpg.Est   mpg.Pr   drat.Est   drat.Pr   qsec.Est   qsec.Pr
cyl       -2.26     0.00     ..         ..        ..         ..
hp        -0.02     0.21     ..         ..        ..         ..
disp      -0.04     0.00     ..         ..        ..         ..

  3. Then I want to append columns to "estimates" that summarize the Pr values of each IV (cyl, hp, disp), e.g. (assuming it were filled out):

IV \ Stat   mean.Pr   median.Pr   min.Pr   max.Pr
cyl         0.03      0.02        0.00     0.18
hp          ..        ..          ..       ..
disp        ..        ..          ..       ..

Attempt

##### Step (1)
## Make the formulae
##   For scale, it would be great to use varlists here:
##     dvvarlist <- c("mpg", "drat", "qsec")
##     ivvarlist <- c("cyl + hp", "disp")
models <- lapply(paste(c("mpg", "mpg", "drat", "drat", "qsec", "qsec"),
    c("cyl + hp", "disp"), sep = "~"), formula)

## Run the regressions
res.models <- lapply(models, FUN = function(x) 
    {summary(lm(formula = x, data = mtcars))})

##### Step (2)
## Spot the coefficients
coefficients(res.models[[1]])

## How to automate grab coefficients from all models?

## How to automate place coefficients in proper location in new dataframe?

##### Step (3)
## Append columns to "estimates"
##   For scale, could again use dvvarlist <- c("mpg", "drat", "qsec")
estimates$mean.Pr <- rowMeans(estimates[ , c("mpg.Est", "drat.Est", "qsec.Est")])

Recommended answer

Using base R:

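 data("mtcars")
 y=c("mpg","drat","qsec")   # dependent variables
 x=c("cyl+hp","disp")       # right-hand sides, one per model type

 # Fit every DV/IV combination; keep Estimate (col 1) and Pr(>|t|) (col 4)
 A=Map(function(i,j)
   summary(lm(as.formula(paste0(i,"~",j)),data=mtcars))$coef[,c(1,4)],
   rep(y,each=length(x)),x)

 # Per DV: stack both models' coefficient rows, drop the intercepts,
 # then bind the DVs side by side
 B=do.call(cbind.data.frame,
      tapply(A,rep(y,each=length(x)),
       function(s){a=do.call(rbind,s);a[row.names(a)!="(Intercept)",]}))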
 B
        drat.Estimate drat.Pr(>|t|) mpg.Estimate mpg.Pr(>|t|) qsec.Estimate qsec.Pr(>|t|)
   cyl   -0.318242238  5.528430e-05  -2.26469360 4.803752e-04  -0.005485698   0.981671077
   hp     0.003401029  6.262861e-02  -0.01912170 2.125285e-01  -0.018339365   0.005865329
   disp  -0.003063904  5.282022e-06  -0.04121512 9.380327e-10  -0.006253039   0.013144036
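
For readers who find the Map/tapply pipeline dense, here is a more explicit sketch of the same idea. It is not the code that produced the output above: the names `dvs`, `ivs`, `grid`, `long` and `estimates` are chosen for illustration, and the resulting columns are named `Est.mpg`, `Pr.mpg`, and so on, rather than `mpg.Estimate`.

```r
data("mtcars")
dvs <- c("mpg", "drat", "qsec")    # dependent variables
ivs <- c("cyl + hp", "disp")       # right-hand sides, one per model type

# Every DV/IV combination, one per row
grid <- expand.grid(iv = ivs, dv = dvs, stringsAsFactors = FALSE)

# Fit one model per row; keep Estimate and Pr(>|t|) of the non-intercept terms
fits <- lapply(seq_len(nrow(grid)), function(k) {
  f  <- as.formula(paste(grid$dv[k], "~", grid$iv[k]))
  cf <- coef(summary(lm(f, data = mtcars)))
  # drop = FALSE keeps a matrix even when only one term is left (the disp model)
  cf <- cf[rownames(cf) != "(Intercept)", c("Estimate", "Pr(>|t|)"), drop = FALSE]
  data.frame(dv = grid$dv[k], term = rownames(cf),
             Est = cf[, "Estimate"], Pr = cf[, "Pr(>|t|)"],
             row.names = NULL, stringsAsFactors = FALSE)
})
long <- do.call(rbind, fits)       # one row per (DV, term) pair

# One row per IV, one Est/Pr column pair per DV
estimates <- reshape(long, idvar = "term", timevar = "dv", direction = "wide")
estimates
```

The long format in `long` may be the more convenient one to keep when there are many DVs, since it can be filtered or summarized by `term` before reshaping.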

It is still not clear to me what the third step needs, so I hope you can elaborate. Your code suggests you want the mean, median, etc. of the coefficients. I do not know whether you also want the mean, max, etc. of the p-values, but I computed them in case you need them:

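  # Separate the Estimate rows from the Pr(>|t|) rows of t(B)
  C=split(data.frame(t(B)),rep(c("Estimate","Pr(>|t|)"),length(y)))

  # For each group, apply mean/median/min/max to every IV column
  D=lapply(C,function(f)
         matrix(mapply(function(i,j) i(j),
                          rep(c(mean,median,min,max),each=length(f)),f),length(f)))

  # Label the summary columns and append them to B
  cbind(B,do.call(cbind.data.frame,lapply(D,`colnames<-`,c("mean","median","min","max"))))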


     drat.Estimate drat.Pr(>|t|) mpg.Estimate mpg.Pr(>|t|) qsec.Estimate qsec.Pr(>|t|) Estimate.mean
cyl   -0.318242238  5.528430e-05  -2.26469360 4.803752e-04  -0.005485698   0.981671077   -0.86280718
hp     0.003401029  6.262861e-02  -0.01912170 2.125285e-01  -0.018339365   0.005865329   -0.01135334
disp  -0.003063904  5.282022e-06  -0.04121512 9.380327e-10  -0.006253039   0.013144036   -0.01684402
       Estimate.median Estimate.min Estimate.max Pr(>|t|).mean Pr(>|t|).median Pr(>|t|).min Pr(>|t|).max
  cyl     -0.318242238  -2.26469360 -0.005485698   0.327402245    4.803752e-04 5.528430e-05   0.98167108
  hp      -0.018339365  -0.01912170  0.003401029   0.093674136    6.262861e-02 5.865329e-03   0.21252847
  disp    -0.006253039  -0.04121512 -0.003063904   0.004383106    5.282022e-06 9.380327e-10   0.01314404
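
If only the p-value summaries from Step (3) are needed, the following is a shorter sketch. It assumes the column names `mean.Pr`, `median.Pr`, `min.Pr` and `max.Pr` from the question; the helper `pr_for` and the objects `pvals` and `pr_stats` are hypothetical names introduced here. Note that it summarizes the `Pr(>|t|)` values, whereas the attempt in the question averaged the `.Est` columns.

```r
data("mtcars")
y <- c("mpg", "drat", "qsec")
x <- c("cyl + hp", "disp")

# Hypothetical helper: named vector of p-values for the non-intercept terms
pr_for <- function(iv, dv) {
  cf   <- coef(summary(lm(as.formula(paste(dv, "~", iv)), data = mtcars)))
  keep <- rownames(cf) != "(Intercept)"
  setNames(cf[keep, "Pr(>|t|)"], rownames(cf)[keep])
}

# Rows are IVs (cyl, hp, disp), columns are DVs (mpg, drat, qsec)
pvals <- sapply(y, function(dv) unlist(lapply(x, pr_for, dv = dv)))

# Row-wise summaries of the p-values for each IV
pr_stats <- data.frame(
  mean.Pr   = rowMeans(pvals),
  median.Pr = apply(pvals, 1, median),
  min.Pr    = apply(pvals, 1, min),
  max.Pr    = apply(pvals, 1, max)
)
pr_stats
```

Because `pr_stats` keeps the IV names as row names, it can be appended to the coefficient table with `cbind()` as long as the rows are in the same order.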

I believe you can transpose this to see it on one screen instead of scrolling left and right. If this helps, let us know. Thank you.
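
As a small illustration of the transpose idea, the sketch below uses a hand-built data frame (`wide`, a name chosen here) holding only the three Estimate columns from the output above. With the real result you would transpose the full table instead.

```r
# Estimate columns copied from the output above, rows named by IV
wide <- data.frame(
  mpg.Estimate  = c(-2.26469360, -0.01912170, -0.04121512),
  drat.Estimate = c(-0.318242238, 0.003401029, -0.003063904),
  qsec.Estimate = c(-0.005485698, -0.018339365, -0.006253039),
  row.names = c("cyl", "hp", "disp")
)

# t() on a data frame returns a matrix: statistics become rows, IVs columns
tall <- t(wide)

# Rounding to a few significant digits keeps the printout narrow
print(signif(tall, 3))
```

Since `t()` returns a matrix, wrap it in `as.data.frame()` if later steps expect a data frame.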
