R: if statements in loop


Question

Basically a follow-up on this question.

I'm still trying to get a grasp of R's vectorisation while trying to speed up a coworker's code. I've read The R Inferno and "Speed up the loop operation in R".

My aim is to speed up the following code. The complete dataset contains ~1000 columns by 10,000-1,000,000 rows:

df3 <- structure(c("X", "X", "X", "X", "O", "O", "O", "O", "O", "O", 
"O", "O", "O", "O", "O", "O"), .Dim = c(2L, 8L), .Dimnames = list(
    c("1", "2"), c("pig_id", "code", "DSFASD32", "SDFSD56", 
    "SDFASD12", "SDFSD56342", "SDFASD12231", "SDFASD45442"
    )))

score_1 <- structure(c(0, 0, 0, 0, 0, 0), .Dim = 2:3)


for (i in 1:nrow(df3)) {
  a<-matrix(table(df3[i,3:ncol(df3)]))

  if (nrow(a)==1) {
    score_1[i,1]<-0    #count number of X (error), N (not compared) and O (ok)
    score_1[i,2]<-a[1,1]
  }
  if (nrow(a)==2) {
    score_1[i,1]<-a[1,1]
    score_1[i,2]<-a[2,1]
  }
  if (nrow(a)==3) {
    score_1[i,1]<-a[1,1]
    score_1[i,2]<-a[2,1]
    score_1[i,3]<-a[3,1]
  }                        
}
colnames(score_1) <- c("N", "O", "X")

I have been trying myself but can't figure it out yet. Here is what I've tried. It shows the same output as the code above, but I'm not sure whether it actually does the same thing. I'm missing that bit of insight into R and my data set.

I can't seem to get my code to produce the same output as the for loop.

Edit: In response to Heroka's response I've updated my reproducible example:

Output of the for loop:

     [,1] [,2] [,3]
[1,]    0    6    0
[2,]    0    6    0

Output of the apply function:

     1 2
[1,] 6 6

Recommended answer

This gives you the desired table because each row is converted to a factor with fixed levels, so table() reports a count for every level and letters that do not occur in a row get a count of zero. It is less computationally efficient than just using apply and table.

res <- t(apply(df3[, -c(1:2)], 1, function(x) {
  # fixed levels: absent letters are still counted, as zero
  x_f <- factor(x, levels = c("N", "O", "X"))
  table(x_f)
}))

> res
  N O X
1 0 6 0
2 0 6 0
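
To make the role of the factor conversion concrete, here is a minimal sketch on a single row. The vector x is a made-up stand-in for one row of the marker columns, not part of the original data; it compares plain table() with table() on a factor with fixed levels.

```r
# Hypothetical single row containing only "O", like the rows in the example data.
x <- c("O", "O", "O", "O", "O", "O")

# Plain table() only reports values that actually occur in x,
# so the result has one entry and its position depends on the data.
plain <- table(x)
names(plain)

# With fixed levels, every level gets an entry, and absent letters count as zero.
fixed <- table(factor(x, levels = c("N", "O", "X")))
names(fixed)
as.vector(fixed)
```

Because every row then yields a vector of the same length and order, apply can bind the per-row results into a matrix with one column per letter.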

For a smaller dataset, melting the data first might be an option, but with 1e6 rows and 100 columns you'd need a lot of memory.
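
As a rough illustration of the long-format idea, the sketch below uses base R only: row() and as.vector() stand in for a melt step, which is an assumption about what "melting" would look like here and not necessarily the answerer's intended code. The names vals, long_row, long_code, counts and score are made up for this example. It builds one long vector of row indices and one of codes, then tabulates them in a single call.

```r
# Example data from the question: 2 rows, 8 columns, the first two are identifiers.
df3 <- structure(c(rep("X", 4), rep("O", 12)), .Dim = c(2L, 8L),
                 .Dimnames = list(c("1", "2"),
                                  c("pig_id", "code", "DSFASD32", "SDFSD56",
                                    "SDFASD12", "SDFSD56342", "SDFASD12231",
                                    "SDFASD45442")))

# Keep only the marker columns.
vals <- df3[, -(1:2), drop = FALSE]

# Long format: one row index and one code per cell of the matrix.
long_row  <- as.vector(row(vals))
long_code <- factor(as.vector(vals), levels = c("N", "O", "X"))

# One two-way table over all cells, in place of one table() call per row.
counts <- table(long_row, long_code)
score  <- matrix(counts, nrow = nrow(vals),
                 dimnames = list(rownames(vals), levels(long_code)))
score
```

Both long vectors hold one element per cell of the data, which is where the memory cost mentioned above comes from.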
