How to use apply functions instead of a for loop when multiple if conditions have to be executed

Problem description

First DF:

t.d
  V1 V2 V3 V4
1  1  6 11 16
2  2  7 12 17
3  3  8 13 18
4  4  9 14 19
5  5 10 15 20


names(t.d) <- c("ID","A","B","C")

t.d$FinalTime <- c("7/30/2009 08:18:35","9/30/2009 19:18:35","11/30/2009 21:18:35","13/30/2009 20:18:35","15/30/2009 04:18:35")

t.d$InitTime <- c("6/30/2009 9:18:35","6/30/2009 9:18:35","6/30/2009 9:18:35","6/30/2009 9:18:35","6/30/2009 9:18:35")

> t.d
  ID  A  B  C           FinalTime          InitTime
1  1  6 11 16  7/30/2009 08:18:35 6/30/2009 9:18:35
2  2  7 12 17  9/30/2009 19:18:35 6/30/2009 9:18:35
3  3  8 13 18 11/30/2009 21:18:35 6/30/2009 9:18:35
4  4  9 14 19 13/30/2009 20:18:35 6/30/2009 9:18:35
5  5 10 15 20 15/30/2009 04:18:35 6/30/2009 9:18:35

Second DF:

> s.d
   F  D  E                Time
1  10 19 28  6/30/2009 08:18:35
2  11 20 29  8/30/2009 19:18:35
3  12 21 30  9/30/2009 21:18:35
4  13 22 31 01/30/2009 20:18:35
5  14 23 32 10/30/2009 04:18:35
6  15 24 33 11/30/2009 04:18:35
7  16 25 34 12/30/2009 04:18:35
8  17 26 35 13/30/2009 04:18:35
9  18 27 36 15/30/2009 04:18:35
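
The time columns above are character strings, so they would need to be converted before they can be compared. A minimal sketch, assuming the layout is month/day/year and assuming UTC (the question states neither), could look like this. Note that strings such as "13/30/2009" have no valid month and come back as NA.

```r
# Sample strings in the same layout as the question's time columns.
final_chr <- c("7/30/2009 08:18:35", "9/30/2009 19:18:35", "13/30/2009 20:18:35")
init_chr  <- rep("6/30/2009 9:18:35", 3)

# tz = "UTC" is an assumption; the question does not state a time zone.
final_time <- as.POSIXct(final_chr, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
init_time  <- as.POSIXct(init_chr,  format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Month 13 does not exist, so the third string cannot be parsed.
is.na(final_time)

# Length of each interval; NA wherever a timestamp failed to parse.
interval_days <- difftime(final_time, init_time, units = "days")
```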

Desired output:

From DF "t.d" I have to calculate, for each row, the time interval between "InitTime" and "FinalTime" (InitTime is always earlier than FinalTime).

Another DF "temp" has to be formed from "s.d", containing only the rows that fall within the above time interval. The most recent values of "F", "D" and "E" then have to be taken and attached to the i-th row of "t.d", the row from which the time interval was calculated.
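
For a single row, the filtering step described above might be sketched as follows. The timestamps are hypothetical valid dates chosen for illustration, the helper `ts` is not from the question, and treating both ends of the interval as inclusive is an assumption.

```r
# Hypothetical helper for parsing; tz = "UTC" is an assumption.
ts <- function(x) as.POSIXct(x, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Toy stand-in for s.d with already-parsed times.
s.d <- data.frame(
  F = c(10, 11, 12), D = c(19, 20, 21), E = c(28, 29, 30),
  Time = ts(c("6/30/2009 08:18:35", "7/15/2009 19:18:35", "7/20/2009 21:18:35"))
)

# Interval taken from one row of t.d.
init_time  <- ts("6/30/2009 09:18:35")
final_time <- ts("7/30/2009 08:18:35")

# Keep only the rows of s.d that fall inside [init_time, final_time].
temp <- s.d[s.d$Time >= init_time & s.d$Time <= final_time, ]

# The most recent row is the one with the largest Time.
latest <- temp[which.max(as.numeric(temp$Time)), c("F", "D", "E")]
```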

We also have to check whether the following condition holds for the newly formed DF "temp":

Here 'j' is the index of each row:

# Flag row j when at least one of the two conditions holds
if ((temp$F[j] < 35.5) + (temp$D[j] >= 100) >= 1) {
  temp$Flag[j] <- 1
} else {
  temp$Flag[j] <- 0
}
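
Because comparisons in R are vectorized, a flag like this can usually be computed for all rows at once, without a loop over 'j'. A minimal sketch on hypothetical values (not the question's data):

```r
# Hypothetical values, chosen so that both outcomes occur.
temp <- data.frame(F = c(10, 40, 50), D = c(19, 120, 30))

# TRUE when at least one condition holds; as.integer() maps TRUE/FALSE to 1/0.
temp$Flag <- as.integer(temp$F < 35.5 | temp$D >= 100)
```

Further conditions could be added as further columns in the same way, one vectorized expression per column.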

Originally I have 3 million rows in the data frame and 20 columns in each DF.

I have solved the above problem using a "for loop", but it takes 2 to 3 days because there are so many rows.

(Also, what if I have to add new columns to the resulting DF when multiple conditions are satisfied for a row?)

Can anybody suggest a different technique, such as using the apply functions?

Recommended answer

My suggestions are:

  • use lapply over the row indices
  • handle your if branches inside the function
  • return either your data frame or NULL
  • combine everything with rbind
  • by replacing lapply with mclapply from the 'parallel' package, your code gets executed in parallel.

resultList <- lapply(seq_len(nrow(t.d)), function(i) {
  # do stuff for row i of t.d: build df and evaluate condition
  if (condition) {
    return(df)
  } else {
    return(NULL)
  }
})
# rbind skips the NULL entries
resultDF <- do.call(rbind, resultList)
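
To make the skeleton above concrete, here is one way it could be filled in on toy data. This is only a sketch: the dates are hypothetical valid ones, the helpers `ts` and `process_row` are illustrative names, and returning NULL when no row of s.d falls inside the interval is an assumption about the intended behaviour.

```r
# Hypothetical parsing helper; tz = "UTC" is an assumption.
ts <- function(x) as.POSIXct(x, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

t.d <- data.frame(
  ID = 1:3,
  InitTime  = ts(rep("6/30/2009 09:18:35", 3)),
  FinalTime = ts(c("7/30/2009 08:18:35", "9/30/2009 19:18:35", "6/30/2009 10:00:00"))
)
s.d <- data.frame(
  F = c(10, 40, 12), D = c(19, 20, 120), E = c(28, 29, 30),
  Time = ts(c("7/1/2009 08:18:35", "8/30/2009 19:18:35", "10/30/2009 04:18:35"))
)

process_row <- function(i) {
  # Rows of s.d inside the interval of row i (both ends inclusive, by assumption).
  in_window <- s.d$Time >= t.d$InitTime[i] & s.d$Time <= t.d$FinalTime[i]
  if (!any(in_window)) return(NULL)  # nothing to attach for this row
  temp   <- s.d[in_window, ]
  latest <- temp[which.max(as.numeric(temp$Time)), c("F", "D", "E")]
  latest$Flag <- as.integer(latest$F < 35.5 | latest$D >= 100)
  cbind(t.d[i, ], latest)
}

resultList <- lapply(seq_len(nrow(t.d)), process_row)
resultDF   <- do.call(rbind, resultList)  # NULL entries are dropped
```

On Unix-like systems the lapply call could be swapped for parallel::mclapply(seq_len(nrow(t.d)), process_row, mc.cores = 2), as the answer suggests; forking is not available on Windows, so there it runs serially.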
