How can I do 6084 regressions using the lapply function
Problem description
Hi, I am starting to use R and am stuck on analyzing my data. I have a data frame with 157 columns. Column 1 is the dependent variable and columns 2 to 157 are the independent variables: columns 2 to 79 are one type of independent variable (n = 78) and columns 80 to 157 are another type (n = 78). I want to perform 78 x 78 = 6084 multiple linear regressions, holding the first independent variable of the model fixed, one at a time, for each of columns 2 to 79. I can fix the first independent variable and do the regressions separately like this (x stands for one of columns 80 to 157):
lm(Grassland$column1 ~ Grassland$column2 + x)
lm(Grassland$column1 ~ Grassland$column3 + x)
lm(Grassland$column1 ~ Grassland$column79 + x)
My question is: how can I run all 6084 regressions with a single piece of code and keep only the regressions whose p-value is below 0.05, discarding the non-significant ones?
This is my code
library(data.table)
Regressions <-
data.table(Grassland)[,
.(Lm = lapply(.SD, function(x) summary(lm(Grassland$column1 ~ Grassland$column2 + x)))), .SDcols = 80:157]
Regressions[, lapply(Lm, function(x) coef(x)[, "Pr(>|t|)"])] [2:3] < 0.05
Recommended answer
We can also use reformulate to create a formula and then apply lm:
lapply(setdiff(names(mtcars), "mpg"), function(x)
lm(reformulate(x, "mpg"), data = mtcars))
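The same idea can be extended to the two-predictor case in the question. What follows is only a minimal sketch under stated assumptions: the Grassland data is not available, so a small simulated data frame with the same layout stands in for it (3 columns per predictor type instead of 78), and "significant" is assumed to mean that both predictor coefficients have p < 0.05, since the question's own code inspects rows 2:3 of the coefficient table. The names type_a, type_b, pairs, results and significant are made up for illustration.

```r
# Simulated stand-in for the Grassland data (hypothetical values):
# column1 = response, column2-4 = first predictor type, column5-7 = second type.
set.seed(1)
Grassland <- as.data.frame(matrix(rnorm(50 * 7), nrow = 50))
names(Grassland) <- paste0("column", 1:7)
# Let the response depend on one predictor of each type so that
# at least one pair comes out significant.
Grassland$column1 <- 2 * Grassland$column2 + 1.5 * Grassland$column5 + rnorm(50)

response <- names(Grassland)[1]
type_a <- names(Grassland)[2:4]  # would be 2:79 in the real data
type_b <- names(Grassland)[5:7]  # would be 80:157 in the real data

# All (type A, type B) pairs: 3 x 3 = 9 here, 78 x 78 = 6084 in the real data.
pairs <- expand.grid(a = type_a, b = type_b, stringsAsFactors = FALSE)

# Fit one model per pair and record the p-value of each predictor.
results <- do.call(rbind, lapply(seq_len(nrow(pairs)), function(i) {
  a <- pairs$a[i]
  b <- pairs$b[i]
  fit <- lm(reformulate(c(a, b), response = response), data = Grassland)
  p <- coef(summary(fit))[c(a, b), "Pr(>|t|)"]
  data.frame(a = a, b = b, p_a = p[[1]], p_b = p[[2]])
}))

# Assumed criterion: keep pairs where both predictors have p < 0.05.
significant <- results[results$p_a < 0.05 & results$p_b < 0.05, ]
print(significant)
```

For the real data, replacing the simulated data frame and the column ranges should be enough. If the overall model p-value is wanted instead, it can be computed from summary(fit)$fstatistic with pf().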