R: how to write a loop to get a matrix?


Problem description





Thanks to diliop for the wonderful solution to my previous question, "How to get pair-wise "sequence similarity score" for ~1000 proteins?"

To build upon that answer, I tried to write a loop to get all the pair-wise sequence similarity scores for 1000 proteins with the following code.

for (i in 1:1000) {
  score <- score(pairwiseAlignment(seqs[[i]]$seq, seqs[[i+1]]$seq,
                                   substitutionMatrix=BLOSUM100,
                                   gapOpening=0, gapExtension=-5))
}

However, I am struggling to collect the scores into a data.frame that lists all of them automatically, like this:

seq1 seq2 score
seq1 seq3 score
seq1 seq4 score
....
seq1000 seq1000 score

Could an expert give me some hints on how to get the scores for all 1000 x 1000 protein pairs?

Solution

This appears to be a task that you can do with expand.grid and apply:

seqs <- c("seq1", "seq2", "seq3")
dat <- expand.grid(seqs, seqs, stringsAsFactors=FALSE)
dat
apply(dat, 1, function(seq) paste(seq[1], seq[2], sep="--") )
#[1] "seq1--seq1" "seq2--seq1" "seq3--seq1" "seq1--seq2" "seq2--seq2" "seq3--seq2" "seq1--seq3"
#[8] "seq2--seq3" "seq3--seq3"

Admittedly there is duplication of effort if the function returns the same value for f(seq1,seq2) as for f(seq2,seq1), but if you wanted to gain efficiency you could restrict the rows of the data frame passed as the first argument to apply:

 datr <- dat[dat[,1] > dat[,2] , ]
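
As an alternative to filtering the expand.grid result, each unordered pair can also be generated once with base R's combn. This is only a sketch of that idea, not part of the original answer; the identifiers and the name pairs_df are made up for illustration.

```r
# Hypothetical sequence identifiers, in the style of the toy example above
seqs <- c("seq1", "seq2", "seq3", "seq4")

# combn() returns a 2 x choose(n, 2) matrix; transpose so each row is one pair
pairs <- t(combn(seqs, 2))

# Long-format table with one row per unordered pair (no self-pairs)
pairs_df <- data.frame(seq1 = pairs[, 1], seq2 = pairs[, 2],
                       stringsAsFactors = FALSE)

nrow(pairs_df)  # equals choose(length(seqs), 2)
```

For 1000 sequences this gives choose(1000, 2) = 499500 rows instead of the 1,000,000 rows of the full grid.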

So if you made such a restricted-row-dataframe, datr, then perhaps:

datr$score <- apply(datr, 1, function(seq) {
                     score(pairwiseAlignment(seq[1], seq[2],
                     substitutionMatrix=BLOSUM100, gapOpening=0, gapExtension=-5)) })

(I don't know anything about the arguments in the last line. You really should learn to put some real data in your examples and to list the required packages with library or require calls.)
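
Since the question asks for a matrix, here is a minimal end-to-end sketch of the expand.grid approach on made-up data. It assumes the sequences are stored in a named character vector and uses a stand-in scorer, toy_score (hypothetical, not from Biostrings), so that it runs with base R alone. With the alignment package loaded, the body of the scorer could presumably be swapped for the score(pairwiseAlignment(...)) call from the answer.

```r
# Toy protein sequences (made-up data, for illustration only)
seqs <- c(seq1 = "MKTAYIAK", seq2 = "MKTAYLAK", seq3 = "MQTAYIAR")

# Stand-in scorer (hypothetical): counts identical positions in two strings
toy_score <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- min(length(x), length(y))
  sum(x[seq_len(n)] == y[seq_len(n)])
}

# All ordered pairs of sequence names
dat <- expand.grid(seq1 = names(seqs), seq2 = names(seqs),
                   stringsAsFactors = FALSE)

# Look up each sequence by name and score the pair
dat$score <- mapply(function(a, b) toy_score(seqs[[a]], seqs[[b]]),
                    dat$seq1, dat$seq2, USE.NAMES = FALSE)

# expand.grid varies its first column fastest, which matches matrix()'s
# column-major fill, so the long table reshapes directly into a square matrix
score_mat <- matrix(dat$score, nrow = length(seqs),
                    dimnames = list(names(seqs), names(seqs)))
score_mat
```

Here dat is the long "seq1 seq2 score" table from the question and score_mat is the square form. Looking sequences up by name also avoids passing the identifiers themselves (such as "seq1") to the scoring function.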
