How to use an apply function instead of a for loop when multiple if conditions have to be executed


Question

First DF:

t.d
  V1 V2 V3 V4
1  1  6 11 16
2  2  7 12 17
3  3  8 13 18
4  4  9 14 19
5  5 10 15 20


names(t.d) <- c("ID","A","B","C")

t.d$FinalTime <- c("7/30/2009 08:18:35","9/30/2009 19:18:35","11/30/2009 21:18:35","13/30/2009 20:18:35","15/30/2009 04:18:35")

t.d$InitTime <- c("6/30/2009 9:18:35","6/30/2009 9:18:35","6/30/2009 9:18:35","6/30/2009 9:18:35","6/30/2009 9:18:35")

>t.d
  ID  A  B  C           FinalTime          InitTime
1  1  6 11 16  7/30/2009 08:18:35 6/30/2009 9:18:35
2  2  7 12 17  9/30/2009 19:18:35 6/30/2009 9:18:35
3  3  8 13 18 11/30/2009 21:18:35 6/30/2009 9:18:35
4  4  9 14 19 13/30/2009 20:18:35 6/30/2009 9:18:35
5  5 10 15 20 15/30/2009 04:18:35 6/30/2009 9:18:35

Second DF:

> s.d
   F  D  E                Time
1  10 19 28  6/30/2009 08:18:35
2  11 20 29  8/30/2009 19:18:35
3  12 21 30  9/30/2009 21:18:35
4  13 22 31 01/30/2009 20:18:35
5  14 23 32 10/30/2009 04:18:35
6  15 24 33 11/30/2009 04:18:35
7  16 25 34 12/30/2009 04:18:35
8  17 26 35 13/30/2009 04:18:35
9  18 27 36 15/30/2009 04:18:35

Expected output:

For each row of DF "t.d" I have to calculate the time interval between "InitTime" and "FinalTime" (InitTime is always earlier than FinalTime).

A second DF, "temp", has to be built from "s.d" containing only the rows whose "Time" falls within that interval. The most recent values of "F", "D" and "E" in "temp" then have to be attached to the i-th row of "t.d", the row from which the interval was calculated.

We also have to check whether the following condition holds in the newly formed DF "temp", where 'j' is the row index:

# TRUE when at least one of the two comparisons holds for row j
if ((temp$F[j] < 35.5) + (temp$D[j] >= 100) >= 1) {
  temp$Flag <- 1
} else {
  temp$Flag <- 0
}
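The row-by-row test above can also be written without any loop, because R's comparison operators work on whole columns. The following is a minimal sketch, not the asker's code: the small `temp` data frame is a hypothetical stand-in, and it assumes the columns contain no missing values (with `NA` present, `|` and the `+ ... >= 1` form can give different results).

```r
# Hypothetical stand-in for the `temp` data frame described above
temp <- data.frame(F = c(10, 40, 50), D = c(19, 120, 27))

# `|` is the element-wise OR, so all rows are tested in one call;
# as.integer() turns TRUE/FALSE into the 1/0 flag
temp$Flag <- as.integer(temp$F < 35.5 | temp$D >= 100)
```

Further conditions can be added as extra columns in the same way, one vectorized expression per column.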

The real data has 3 million rows, and each DF has 20 columns.

I have solved the problem above with a "for loop", but with that many rows it takes 2 to 3 days to run.

(Also, what if I have to add new columns to the resulting DF when multiple conditions are satisfied on a row?)

Can anybody suggest a different technique, such as using the apply functions?

Recommended answer

My suggestions:

  • use lapply over the row indices
  • handle your if branches inside the function passed to lapply
  • return either your data frame or NULL
  • combine everything with rbind
  • replace lapply with mclapply from the 'parallel' package to have the code executed in parallel

resultList <- lapply(seq_len(nrow(t.d)), function(i) {
  # do stuff: build `df` for row i and evaluate `condition`
  if (condition) {
    return(df)
  } else {
    return(NULL)
  }
})
# rbind skips the NULL entries
resultDF <- do.call(rbind, resultList)
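To make the template above concrete, here is a minimal sketch of how it might be filled in for the problem in the question. The data frames are small hypothetical stand-ins with valid timestamps (the sample data in the question contains month values such as 13 and 15 that `as.POSIXct` cannot parse), and the names `fmt`, `inWindow` and `latest` are introduced only for illustration.

```r
# Hypothetical, simplified stand-ins for t.d and s.d
fmt <- "%m/%d/%Y %H:%M:%S"
t.d <- data.frame(
  ID = 1:3,
  InitTime = as.POSIXct(rep("6/30/2009 09:18:35", 3), format = fmt, tz = "UTC"),
  FinalTime = as.POSIXct(
    c("7/30/2009 08:18:35", "9/30/2009 19:18:35", "6/30/2009 10:00:00"),
    format = fmt, tz = "UTC"
  )
)
s.d <- data.frame(
  F = c(10, 11, 40),
  D = c(19, 120, 21),
  E = c(28, 29, 30),
  Time = as.POSIXct(
    c("7/1/2009 08:18:35", "8/30/2009 19:18:35", "9/30/2009 21:18:35"),
    format = fmt, tz = "UTC"
  )
)

resultList <- lapply(seq_len(nrow(t.d)), function(i) {
  # Rows of s.d whose Time lies inside the interval of row i
  inWindow <- s.d$Time >= t.d$InitTime[i] & s.d$Time <= t.d$FinalTime[i]
  temp <- s.d[inWindow, , drop = FALSE]
  if (nrow(temp) == 0) {
    return(NULL)  # nothing in the interval: rbind skips this row
  }
  # Most recent F, D, E within the interval
  latest <- temp[which.max(as.numeric(temp$Time)), c("F", "D", "E")]
  latest$Flag <- as.integer(latest$F < 35.5 | latest$D >= 100)
  cbind(t.d[i, , drop = FALSE], latest)
})
resultDF <- do.call(rbind, resultList)
```

If the anonymous function is stored in a variable, say `f`, the parallel variant would be `parallel::mclapply(seq_len(nrow(t.d)), f, mc.cores = 2)`. Note that `mclapply` relies on forking, so on Windows it only runs with `mc.cores = 1`.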
