R add new column depending on values in a range in different columns


Problem description

I have a data.table with two boolean columns A and B. I'd like to add a new boolean column C that depends on A and B, but I'm having trouble 'looking' at previous and upcoming rows.

I'd like to define C as follows. If there is a row with A=1 and at least one B=1 within a range of three rows around it (three rows before and three rows after), then C should become C=1 on the row where A=1 and C=0 on all the other rows in the range. Otherwise C should be C=B.

If two ranges overlap and both contain a B=1, then C should become C=1 on both rows where A=1 and C=0 on the others. To clarify:

library(data.table)

df <- data.table(A=c(0,0,0,1,0,0,0,0,0,0,0,1,1,0,0), 
                 B=c(0,1,0,0,0,1,0,1,1,0,0,0,0,0,1))

    A B                                        A B C
1:  0 0 #                                  1:  0 0 0
2:  0 1 #                                  2:  0 1 0
3:  0 0 #                                  3:  0 0 0
4:  1 0 # range of three                   4:  1 0 1
5:  0 0 #                                  5:  0 0 0
6:  0 1 #                                  6:  0 1 0
7:  0 0 #                                  7:  0 0 0
8:  0 1                                    8:  0 1 1 # C = B
9:  0 1 #                                  9:  0 1 0
10: 0 0 ##                                 10: 0 0 0
11: 0 0 ##                                 11: 0 0 0
12: 1 0 ## overlapping range of three      12: 1 0 1
13: 1 0 ##                                 13: 1 0 1
14: 0 0 ##                                 14: 0 0 0
15: 0 1 ##                                 15: 0 1 0

How do I go about doing this? I'm kind of clueless on this one.

Recommended answer

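# For each row with A == 1, collect the row indices within 3 rows of it,
# dropping indices that fall outside the table
ind <- lapply(which(df$A == 1),
              function(i) { s <- i + -3:3; s[s %in% seq_len(nrow(df))] })
# Keep only the ranges that contain at least one B == 1
good <- sapply(ind, function(i) df[i, any(B == 1)])
ind  <- unique(unlist(ind[good]))
# Default to C = B, then overwrite rows inside a kept range:
# C = 1 where A == 1, C = 0 elsewhere in the range
df[, C := B]
df[ind, C := as.numeric(A == 1)]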
df
#     A B C
#  1: 0 0 0
#  2: 0 1 0
#  3: 0 0 0
#  4: 1 0 1
#  5: 0 0 0
#  6: 0 1 0
#  7: 0 0 0
#  8: 0 1 1
#  9: 0 1 0
# 10: 0 0 0
# 11: 0 0 0
# 12: 1 0 1
# 13: 1 0 1
# 14: 0 0 0
# 15: 0 1 0
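
The same steps can also be wrapped in a small function that works on plain vectors, which makes the window size easy to change. This is only a sketch under a few assumptions: `flag_in_window` and its parameter `k` are hypothetical names that do not come from the answer, A and B are taken to be numeric 0/1 vectors of equal length, and the "range of three" is read as three rows before and three rows after, as in the code above.

```r
# Hypothetical helper (not part of the original answer).
# A, B: numeric 0/1 vectors of equal length; k: rows to look before and after.
flag_in_window <- function(A, B, k = 3) {
  n <- length(A)
  stopifnot(length(B) == n)
  C <- B                                  # default rule: C = B
  in_range <- logical(n)                  # rows covered by a qualifying range
  for (i in which(A == 1)) {
    rows <- max(1, i - k):min(n, i + k)   # clip the window to valid rows
    if (any(B[rows] == 1)) in_range[rows] <- TRUE
  }
  # Inside a qualifying range: C = 1 where A == 1, C = 0 elsewhere
  C[in_range] <- as.numeric(A[in_range] == 1)
  C
}

A <- c(0,0,0,1,0,0,0,0,0,0,0,1,1,0,0)
B <- c(0,1,0,0,0,1,0,1,1,0,0,0,0,0,1)
C <- flag_in_window(A, B)
print(C)
# C is 0 0 0 1 0 0 0 1 0 0 0 1 1 0 0, matching the C column shown above
```

With the data.table from the question, this could be applied as df[, C := flag_in_window(A, B)].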

The data used above is shown below. I changed your df definition to match the df displayed.

df <- data.table(A=c(0,0,0,1,0,0,0,0,0,0,0,0,1,0,0), 
                 B=c(0,1,0,0,0,1,0,1,1,0,0,0,0,0,0))

df[12, A := 1]
df[15, B := 1]

df

#     A B
#  1: 0 0
#  2: 0 1
#  3: 0 0
#  4: 1 0
#  5: 0 0
#  6: 0 1
#  7: 0 0
#  8: 0 1
#  9: 0 1
# 10: 0 0
# 11: 0 0
# 12: 1 0
# 13: 1 0
# 14: 0 0
# 15: 0 1
