Novice needs to loop lm in R
Question
I'm a PhD student in genetics and I am trying to do an association analysis of some genetic data using linear regression. In the table below I'm regressing each 'trait' against each 'SNP'. There is also an interaction term, included as 'var'.
I've only used R for 2 weeks and I don't have any programming background, so please explain any help you provide, as I want to understand it.
Here is a sample of my data:
Sample ID  var  trait 1  trait 2  trait 3  SNP1  SNP2  SNP3
77856517     2      188        3        2     1     0     0
375689755    8       17       -1       -1     1    -1    -1
392513415    8       28       14        4     1     1     1
393612038    8       85       14        6     1     1     0
401623551    8      152       11       -1     1     0     0
348466144    7      -74       11        6     1     0     0
77852806     4       81       16        6     1     1     0
440614343    8      -93        8        0     0     1     0
77853193     5        3        6        5     1     1     1
This is the code I've been using for a single regression:
result1 <-lm(trait1~SNP1+var+SNP1*var, na.action=na.exclude)
I want to run a loop in which every trait is tested against each SNP.
I've been trying to modify code I found online, but I always run into some error that I don't know how to solve.
Thanks for any help.
Recommended answer
Personally, I don't find this problem so easy, especially for an R novice.
Here is a solution based on creating the regression formula dynamically. The idea is to use the paste function to build the formula terms as a string, y ~ x + var + x * var, and then to coerce the resulting string to a formula using as.formula. Here y and x are the dynamic terms of the formula: y is one of c(trait1, trait2, ...) and x is one of c(SNP1, SNP2, ...). I use lapply to loop.
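# dat is assumed to be the data frame holding var, trait1..trait3 and SNP1..SNP3
lapply(1:3, function(i) {
  y <- paste0('trait', i)                    # response name, e.g. "trait1"
  x <- paste0('SNP', i)                      # predictor name, e.g. "SNP1"
  factor1 <- x
  factor2 <- 'var'
  factor3 <- paste(x, 'var', sep = '*')      # interaction term, e.g. "SNP1*var"
  listfactor <- c(factor1, factor2, factor3)
  form <- as.formula(paste(y, "~", paste(listfactor, collapse = "+")))
  lm(formula = form, data = dat)             # one fitted model per i
})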
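For readers new to R, it may help to see the intermediate values for a single iteration (i = 1). This is only an illustrative sketch: the names rhs and form_string are made up for the example, and no data is needed because nothing is fitted here.

```r
# Build the pieces for i = 1, exactly as the loop body does.
y <- paste0("trait", 1)                        # "trait1"
x <- paste0("SNP", 1)                          # "SNP1"

# Right-hand side: the three terms joined with "+".
rhs <- paste(c(x, "var", paste(x, "var", sep = "*")), collapse = "+")

# Full formula as plain text, then coerced to a formula object.
form_string <- paste(y, "~", rhs)              # "trait1 ~ SNP1+var+SNP1*var"
form <- as.formula(form_string)

class(form)      # "formula"
all.vars(form)   # "trait1" "SNP1" "var"
```

Until as.formula is called, everything is ordinary character data, which is why paste can be used to assemble it.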
I hope someone comes up with an easier, or more R-ish, solution :)
EDIT
Thanks to @DWin's comment, we can simplify the formula to just y ~ x*var, since in R's formula syntax this means y is modeled by x, var and the x-by-var interaction (x*var expands to x + var + x:var).
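One way to check that the two formulas are equivalent is to compare the term labels R derives from each. This is a small sketch using the trait1/SNP1 names from the question; f_long, f_short and the two labels_ variables are names chosen just for this example.

```r
# The original three-term formula and the simplified one.
f_long  <- trait1 ~ SNP1 + var + SNP1*var
f_short <- trait1 ~ SNP1*var

# terms() expands the formula; "term.labels" lists the resulting model terms.
labels_long  <- attr(terms(f_long),  "term.labels")
labels_short <- attr(terms(f_short), "term.labels")

labels_short                          # "SNP1" "var" "SNP1:var"
identical(labels_long, labels_short)  # TRUE
```

Both formulas therefore describe the same model: two main effects plus their interaction, which R writes as SNP1:var.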
So the code above simplifies to:
lapply(1:3, function(i) {
  y <- paste0('trait', i)
  x <- paste0('SNP', i)
  RHS <- paste(x, 'var', sep = '*')          # right-hand side, e.g. "SNP1*var"
  form <- as.formula(paste(y, "~", RHS))
  lm(formula = form, data = dat)
})
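Note that the loop above pairs trait i with SNP i (trait1 with SNP1, trait2 with SNP2, and so on). If every trait should be tested against every SNP, as the question asks, one possible extension is to enumerate all pairs with expand.grid and fit one model per pair. The sketch below is an assumption-laden illustration, not part of the original answer: it rebuilds dat from the sample rows in the question, assumes the columns are named trait1..trait3 (the posted header writes "trait 1"), and uses made-up names combos and fits.

```r
# Sample rows from the question; column names trait1..trait3 are assumed.
dat <- data.frame(
  var    = c(2, 8, 8, 8, 8, 7, 4, 8, 5),
  trait1 = c(188, 17, 28, 85, 152, -74, 81, -93, 3),
  trait2 = c(3, -1, 14, 14, 11, 11, 16, 8, 6),
  trait3 = c(2, -1, 4, 6, -1, 6, 6, 0, 5),
  SNP1   = c(1, 1, 1, 1, 1, 1, 1, 0, 1),
  SNP2   = c(0, -1, 1, 1, 0, 0, 1, 1, 1),
  SNP3   = c(0, -1, 1, 0, 0, 0, 0, 0, 1)
)

# All 3 x 3 trait/SNP pairs, one row per pair.
combos <- expand.grid(trait = paste0("trait", 1:3),
                      snp   = paste0("SNP", 1:3),
                      stringsAsFactors = FALSE)

# Fit trait ~ SNP*var for every pair; Map walks the two columns in parallel.
fits <- Map(function(y, x) {
  form <- as.formula(paste(y, "~", paste(x, "var", sep = "*")))
  lm(form, data = dat, na.action = na.exclude)
}, combos$trait, combos$snp)

# Label each model by its pair, e.g. "trait2_SNP3".
names(fits) <- paste(combos$trait, combos$snp, sep = "_")
```

A single model can then be inspected with, for example, summary(fits[["trait2_SNP3"]]). With only nine sample rows, some coefficients may come back as NA, so this is meant to show the looping pattern rather than to produce meaningful estimates.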