Subset rows in a data frame based on a vector of values
Question
I have two data sets that are supposed to be the same size but aren't. I need to trim the values from A that are not in B, and vice versa, in order to eliminate noise from a graph that's going into a report. (Don't worry, this data isn't being permanently deleted!)
I have read up on related questions, but I'm still not able to get this to work right. Here's my code:
bg2011missingFromBeg <- setdiff(x=eg2011$ID, y=bg2011$ID)
#attempt 1
eg2011cleaned <- subset(eg2011, ID != bg2011missingFromBeg)
#attempt 2
eg2011cleaned <- eg2011[!eg2011$ID %in% bg2011missingFromBeg]
The first try eliminates only the first value in the resulting setdiff vector. The second try yields an unwieldy error:
Error in `[.data.frame`(eg2012, !eg2012$ID %in% bg2012missingFromBeg)
: undefined columns selected
Recommended answer
This will give you what you want:
eg2011cleaned <- eg2011[!eg2011$ID %in% bg2011missingFromBeg, ]
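To try this out end to end, here is a minimal sketch on made-up data. The contents of eg2011 and bg2011 below are hypothetical (the question does not show the real data), and the score column and the bg2011cleaned name are assumptions added for illustration.

```r
# Hypothetical toy data; the real eg2011 and bg2011 are not shown in the question.
eg2011 <- data.frame(ID = c(1, 2, 3, 4, 5), score = c(10, 20, 30, 40, 50))
bg2011 <- data.frame(ID = c(2, 3, 5, 6), score = c(21, 31, 51, 61))

# IDs that appear in eg2011 but not in bg2011
bg2011missingFromBeg <- setdiff(x = eg2011$ID, y = bg2011$ID)

# Keep only the rows whose ID is not in that vector; note the trailing comma
eg2011cleaned <- eg2011[!eg2011$ID %in% bg2011missingFromBeg, ]

# The "vice versa" direction, written without the intermediate setdiff()
bg2011cleaned <- bg2011[bg2011$ID %in% eg2011$ID, ]

print(eg2011cleaned)
print(bg2011cleaned)
```

After both steps the two cleaned data frames contain the same set of IDs, which is what the graph in the question needs.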
The error in your second attempt occurs because you forgot the , after the row index.
In general, for convenience, the specification object[index] subsets the columns of a data frame. If you want to subset rows and keep all columns, you have to use the specification object[index_rows, index_columns], where index_columns can be left blank, which selects all columns by default.
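The difference between the two forms can be seen on a small example. This is only a sketch: the data frame df and the names drop_ids, keep, msg, rows_kept and first_col are hypothetical and do not come from the question.

```r
# Hypothetical toy data frame for illustration only.
df <- data.frame(ID = c(1, 2, 3, 4), value = c("a", "b", "c", "d"))
drop_ids <- c(2, 4)
keep <- !df$ID %in% drop_ids   # one TRUE/FALSE per row

# Without the comma, keep is read as a column selector. It has more
# entries than df has columns, so R signals an error; capture its message.
msg <- tryCatch(df[keep], error = function(e) conditionMessage(e))

# With the comma, keep selects rows and the empty second slot means "all columns".
rows_kept <- df[keep, ]

# A single index that does fit the columns picks columns, not rows.
first_col <- df["ID"]

print(msg)
print(rows_kept)
print(first_col)
```

The captured message should be the same kind of error the question reports, since a row-length logical vector is being used to pick columns.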
Even when the column index is left blank, you still need to include the , to indicate that you want a subset of rows rather than a subset of columns.
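The first attempt fails for a different reason, which is worth a short illustration. The != operator compares two vectors element by element and recycles the shorter one; it does not test membership the way %in% does. The sketch below uses made-up vectors (ids, drop_ids, by_neq and by_in are hypothetical names) chosen so that only the first ID happens to line up.

```r
# Hypothetical vectors for illustration only.
ids <- c(1, 2, 3, 4, 5, 6)
drop_ids <- c(1, 3)

# != pairs ids with drop_ids recycled to 1, 3, 1, 3, 1, 3,
# so an ID is dropped only if it sits opposite an equal value.
by_neq <- ids[ids != drop_ids]

# %in% asks, for each ID, whether it appears anywhere in drop_ids.
by_in <- ids[!ids %in% drop_ids]

print(by_neq)
print(by_in)
```

Here the != version removes only 1 and leaves 3 in place, while the %in% version removes both, which matches the behaviour described in the question.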