How can I do 6084 regressions using the lapply function


Problem description

Hi, I am starting to use R and am stuck on analyzing my data. I have a data frame with 157 columns. Column 1 is the dependent variable and columns 2 to 157 are the independent variables: columns 2 to 79 are one type of independent variable (n = 78) and columns 80 to 157 are another type (n = 78). I want to run 78 x 78 = 6084 multiple linear regressions, holding the first independent variable of the model (from columns 2 to 79) fixed one at a time while the second varies over columns 80 to 157. I can fix the first independent variable and run the regressions separately like this:

lm(Grassland$column1 ~ Grassland$column2 +  x)
lm(Grassland$column1 ~ Grassland$column3 +  x)

lm(Grassland$column1 ~ Grassland$column79 +  x)

My question is: how can I run all 6084 regressions with a single piece of code and keep only the regressions whose p-value is < 0.05, discarding the non-significant ones?

Here is my code:

library(data.table)

Regressions <- 
data.table(Grassland)[, 
                      .(Lm = lapply(.SD, function(x) summary(lm(Grassland$column1 ~ Grassland$column2 + x)))), .SDcols = 80:157]

Regressions[, lapply(Lm, function(x) coef(x)[, "Pr(>|t|)"])] [2:3] < 0.05       

Recommended answer

We can also use reformulate to create the formula and then apply lm:

lapply(setdiff(names(mtcars), "mpg"), function(x) 
        lm(reformulate(x, "mpg"), data = mtcars))
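The same idea might be extended to the two-predictor case from the question. The following is only a sketch under stated assumptions: `dat` is a small synthetic stand-in for `Grassland` (1 response, 3 + 3 predictors instead of 78 + 78), the column names are made up, and "significant" is taken to mean that both predictor coefficients have p < 0.05, which is one possible reading of the question.

```r
# Toy stand-in for Grassland (hypothetical data): 1 response, 3 + 3 predictors.
set.seed(1)
n <- 50
dat <- as.data.frame(matrix(rnorm(n * 7), nrow = n))
names(dat) <- c("y", paste0("a", 1:3), paste0("b", 1:3))
dat$y <- 2 * dat$a1 + 1.5 * dat$b2 + rnorm(n)  # make a1 and b2 truly matter

response   <- names(dat)[1]
first_set  <- names(dat)[2:4]   # would be columns 2:79 in the question's data
second_set <- names(dat)[5:7]   # would be columns 80:157

# Every (first, second) pairing: 3 x 3 here, 78 x 78 = 6084 in the question.
pairs <- expand.grid(first = first_set, second = second_set,
                     stringsAsFactors = FALSE)

# Fit one model per pairing and keep only its coefficient table.
fits <- lapply(seq_len(nrow(pairs)), function(i) {
  f <- reformulate(c(pairs$first[i], pairs$second[i]), response = response)
  coef(summary(lm(f, data = dat)))
})

# p-values of the two predictors (rows 2 and 3; row 1 is the intercept).
pairs$p_first  <- vapply(fits, function(m) m[2, "Pr(>|t|)"], numeric(1))
pairs$p_second <- vapply(fits, function(m) m[3, "Pr(>|t|)"], numeric(1))

# Assumption: keep a model only if both predictors are significant at 0.05.
significant <- pairs[pairs$p_first < 0.05 & pairs$p_second < 0.05, ]
significant
```

Passing `data = dat` together with a formula built from column names avoids the `Grassland$column` references inside `lm`. With thousands of models, some pairings will pass a 0.05 cutoff by chance alone, so adjusting the p-values (for example with `p.adjust`) may be worth considering before filtering.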
