Subset a data frame based on value pairs stored in independent ordered vectors
Question
I have an R data frame from which I need to extract a subset of rows. The subset is defined by two columns of the data frame. For example:
A <- c(1,2,3,3,5,1)
B <- c(6,7,8,9,8,8)
Value <- c(9,5,2,1,2,2)
DATA <- data.frame(A,B,Value)
This is what DATA looks like:
A B Value
1 6 9
2 7 5
3 8 2
3 9 1
5 8 2
1 8 2
I want the rows whose (A,B) combination is either (1,6) or (3,8). These pairs are stored as separate (ordered) vectors, one for A and one for B:
AList <- c(1,3)
BList <- c(6,8)
So far, I have tried to subset the data by checking whether the value in column A is present in AList AND the value in column B is present in BList:
DATA[(DATA$A %in% AList & DATA$B %in% BList),]
The result of this subsetting is shown below. In addition to the value pairs (1,6) and (3,8), I also get (1,8). In effect, this filter returns every combination of a value in AList with a value in BList. How do I restrict it to just (1,6) and (3,8)?
A B Value
1 6 9
3 8 2
1 8 2
This is my desired result:
A B Value
1 6 9
3 8 2
Recommended answer
You could try match with an appropriate nomatch argument. The two nomatch values must differ so that rows in which neither column matches do not compare as equal:
sub <- match(DATA$A, AList, nomatch=-1) == match(DATA$B, BList, nomatch=-2)
sub
# [1] TRUE FALSE TRUE FALSE FALSE FALSE
DATA[sub,]
# A B Value
#1 1 6 9
#3 3 8 2
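One caveat is worth illustrating: match returns only the first position at which a value occurs, so the comparison above relies on each value appearing once in AList and once in BList. The sketch below uses a hypothetical pair list (AList2, BList2, not part of the question) in which 1 occurs twice, and shows a slower pairwise check with mapply as one possible fallback:

```r
A <- c(1, 2, 3, 3, 5, 1)
B <- c(6, 7, 8, 9, 8, 8)
Value <- c(9, 5, 2, 1, 2, 2)
DATA <- data.frame(A, B, Value)

# Hypothetical pair list (an assumption for illustration):
# the wanted pairs are (1,6) and (1,8), so 1 is repeated in AList2.
AList2 <- c(1, 1)
BList2 <- c(6, 8)

# match() reports only the first position of 1 in AList2, so the
# pair (1,8) in row 6 is compared as position 1 vs. position 2.
sub_match <- match(DATA$A, AList2, nomatch = -1) ==
  match(DATA$B, BList2, nomatch = -2)

# Pairwise check: for each row, test whether any (AList2[i], BList2[i])
# equals the row's (A, B).
sub_pairs <- mapply(function(a, b) any(AList2 == a & BList2 == b),
                    DATA$A, DATA$B)

DATA[sub_match, ]  # misses the (1,8) row
DATA[sub_pairs, ]  # contains both the (1,6) and the (1,8) rows
```

For the pair lists in the question, where no value is repeated, the match comparison and the pairwise check select the same rows.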
A paste-based approach would also be possible:
sub <- paste(DATA$A, DATA$B, sep=":") %in% paste(AList, BList, sep=":")
sub
# [1] TRUE FALSE TRUE FALSE FALSE FALSE
DATA[sub,]
# A B Value
#1 1 6 9
#3 3 8 2
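As a further option that the answer above does not mention, the pairs can be put into their own data frame and joined to DATA with base R's merge, which avoids building string keys. This is only a minimal sketch; the name pairs is chosen here for illustration:

```r
A <- c(1, 2, 3, 3, 5, 1)
B <- c(6, 7, 8, 9, 8, 8)
Value <- c(9, 5, 2, 1, 2, 2)
DATA <- data.frame(A, B, Value)

AList <- c(1, 3)
BList <- c(6, 8)

# Hypothetical helper data frame: one row per wanted (A, B) pair.
pairs <- data.frame(A = AList, B = BList)

# merge() joins on the shared columns A and B by default and keeps
# only the rows of DATA whose (A, B) pair occurs in pairs.
res <- merge(DATA, pairs)
res
```

Note that merge sorts the result by the key columns and renumbers the rows, so the original row names of DATA are not kept.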