Ifelse statement with dataframe subset using date

Problem description

I am trying to create a function to apply to a variable in a dataframe that, for a window of 2 days forward from the current observation, changes the value of VarD if it always takes the value 1 in that date window.

The dataframe looks like this:

VarA     VarB     Date         Diff   VarD
 1         1      2007-04-09    NA     0
 1         1      2007-04-10    0      0
 1         1      2007-04-11   -2      1 
 1         1      2007-04-12    0      1  
 1         1      2007-04-13    2      0  
 1         1      2007-04-14    0      0  
 1         1      2007-04-15   -2      1  
 1         1      2007-04-16    1      0  
 1         1      2007-04-17   -4      1  
 1         1      2007-04-18    0      1  
 1         1      2007-04-19    0      1  
 1         1      2007-04-20    0      1  

The new dataframe should look like the following:

VarA     VarB     Date         Diff   VarD  VarC
 1         1      2007-04-09    NA     0      0
 1         1      2007-04-10    0      0      0
 1         1      2007-04-11   -2      1      1 
 1         1      2007-04-12    0      1      1  
 1         1      2007-04-13    2      0      0  
 1         1      2007-04-14    0      0      0  
 1         1      2007-04-15   -2      1      1  
 1         1      2007-04-16    1      0      0  
 1         1      2007-04-17   -4      1      0  
 1         1      2007-04-18    0      1      0  
 1         1      2007-04-19    0      1      0  
 1         1      2007-04-20    0      1      0  

I have tried the following code:

db$VarC <- 0

for (i in unique(db$VarA)) {
 for (j in unique(db$VarB)) {
  for (n in 1:length(db$Date)) {
   if (db$VarD[n] == 0) {
    db$VarC[n] <- 0
   } else {
    db$VarC[n] <- ifelse(0 %in% db[db$Date >= n & db$Date < n + 3, ]$VarC, 1, 0)
   }
  }
 }
}

But I obtain only zeroes in VarC. I have checked the code without the else branch and it works fine. R reports no error when the complete code is run. I have no clue where the problem could be.
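A possible reason for the all-zero result, offered as a sketch rather than a diagnosis: it assumes db$Date has class Date, which the question does not state. In that case the condition compares dates against the row counter n, so no row falls inside the window, the subset is empty, and ifelse() returns 0. The vector dates below is made up for illustration.

```r
# Assumption: the Date column has class Date (e.g. created with as.Date()).
dates <- as.Date("2007-04-09") + 0:11
n <- 3

# A Date is stored as the number of days since 1970-01-01,
# so it is far larger than the row counter n.
as.numeric(dates[n]) > 10000

# The condition from the question compares dates with n: no row matches.
wrong <- dates >= n & dates < n + 3
any(wrong)

# With an empty selection, 0 %in% ... is FALSE and ifelse() yields 0.
ifelse(0 %in% numeric(0), 1, 0)

# Comparing against the date of row n selects the intended 3-day window.
right <- dates >= dates[n] & dates < dates[n] + 3
which(right)
```

Note also that the attempted code tests VarC, which was initialised to 0, rather than VarD.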

Solution

Here are some alternatives. The first one avoids some messy indexing but the last two do not require any packages.

1) rollapply This applies the VarC function in a rolling fashion to each window of 3 consecutive elements of db$VarD. align = "left" means that when x is passed to VarC, x[1] is the current element, x[2] is the next one and x[3] the one after that, i.e. the current element is the leftmost. partial = TRUE means that if fewer than 3 elements are available (which is the case for the last and next-to-last elements), however many remain are passed.

library(zoo)

VarC <- function(x) if (all(x[-1] == 1)) 0 else x[1]
db$VarC <- rollapply(db$VarD, 3, VarC, partial = TRUE, align = "left")

giving:

> db
   VarA VarB       Date Diff VarD VarC
1     1    1 2007-04-09   NA    0    0
2     1    1 2007-04-10    0    0    0
3     1    1 2007-04-11   -2    1    1
4     1    1 2007-04-12    0    1    1
5     1    1 2007-04-13    2    0    0
6     1    1 2007-04-14    0    0    0
7     1    1 2007-04-15   -2    1    1
8     1    1 2007-04-16    1    0    0
9     1    1 2007-04-17   -4    1    0
10    1    1 2007-04-18    0    1    0
11    1    1 2007-04-19    0    1    0
12    1    1 2007-04-20    0    1    0
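To see why the final rows come out as 0, it can help to call the VarC function from the answer on a few windows by hand. The input vectors below are made up for illustration; only the function itself comes from the answer.

```r
# The helper from the answer: 0 if every following value is 1,
# otherwise the current value x[1] is kept.
VarC <- function(x) if (all(x[-1] == 1)) 0 else x[1]

VarC(c(1, 1, 1))  # both following values are 1, so the result is 0
VarC(c(1, 0, 1))  # one following value is 0, so the current value 1 is kept
VarC(c(1, 1))     # partial window: the single following value is 1, result 0
VarC(1)           # last row: x[-1] is empty and all() of nothing is TRUE, result 0
```

The last call shows that a window with no following values counts as "all 1", which is why the final row is set to 0.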

2) sapply Alternatively, using sapply with VarC from above:

n <- nrow(db)
db$VarC <- sapply(1:n, function(i) VarC(db$VarD[i:min(i+2, n)]))

3) for Or, using a for loop with n and VarC from above:

db$VarC <- NA
for(i in 1:n)  db$VarC[i] <- VarC(db$VarD[i:min(i+2, n)])
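The three alternatives above define the window by row position, which matches a date window only when there is exactly one row per consecutive day. If that does not hold, a date-based window might look like the sketch below. It assumes that Date has class Date and that rows are sorted by date with no duplicates, so the current row comes first in each window; the helper name date_window is hypothetical and not part of the answer.

```r
# VarC as defined in the answer.
VarC <- function(x) if (all(x[-1] == 1)) 0 else x[1]

# Small stand-in for db with the same Date and VarD values as the question.
db <- data.frame(
  Date = as.Date("2007-04-09") + 0:11,
  VarD = c(0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1)
)

# Hypothetical helper: apply VarC to the rows dated from the current
# date up to (but not including) three days later.
date_window <- function(i) {
  in_window <- db$Date >= db$Date[i] & db$Date < db$Date[i] + 3
  VarC(db$VarD[in_window])
}

db$VarC <- sapply(seq_len(nrow(db)), date_window)
db$VarC
```

On this data, where every day is present, the result is the same as with the position-based versions.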

Note: The input db in reproducible form is:

Lines <- "VarA     VarB     Date         Diff   VarD  VarC
 1         1      2007-04-09    NA     0      0
 1         1      2007-04-10    0      0      0
 1         1      2007-04-11   -2      1      1 
 1         1      2007-04-12    0      1      1  
 1         1      2007-04-13    2      0      0  
 1         1      2007-04-14    0      0      0  
 1         1      2007-04-15   -2      1      1  
 1         1      2007-04-16    1      0      0  
 1         1      2007-04-17   -4      1      0  
 1         1      2007-04-18    0      1      0  
 1         1      2007-04-19    0      1      0  
 1         1      2007-04-20    0      1      0  "
db <- read.table(text = Lines, header = TRUE)
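The loops over VarA and VarB in the question suggest that the rule may be meant to restart within each VarA/VarB combination, although the sample data contains only one group. Under that assumption, one way to sketch it in base R is with ave(). The helper roll_forward and the data frame toy are hypothetical and serve only as an illustration.

```r
# VarC as defined in the answer.
VarC <- function(x) if (all(x[-1] == 1)) 0 else x[1]

# Hypothetical helper: apply VarC over forward windows of up to 3 values.
roll_forward <- function(v) {
  n <- length(v)
  sapply(seq_len(n), function(i) VarC(v[i:min(i + 2, n)]))
}

# Made-up data with two groups, so windows must not cross the group boundary.
toy <- data.frame(
  VarA = c(1, 1, 1, 1, 2, 2, 2, 2),
  VarB = 1,
  VarD = c(1, 1, 1, 0, 1, 0, 1, 1)
)

# ave() splits VarD by the grouping variables, applies roll_forward to
# each piece, and returns the results in the original row order.
toy$VarC <- ave(toy$VarD, toy$VarA, toy$VarB, FUN = roll_forward)
toy
```

Because each group is processed separately, the last rows of one group never look ahead into the first rows of the next.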
