Subset rows in a data frame based on a vector of values


Question

I have two data sets that are supposed to be the same size but aren't. I need to trim the values from A that are not in B and vice versa in order to eliminate noise from a graph that's going into a report. (Don't worry, this data isn't being permanently deleted!)

I have read the following:

But I'm still not able to get this to work right. Here's my code:

bg2011missingFromBeg <- setdiff(x=eg2011$ID, y=bg2011$ID)
#attempt 1
eg2011cleaned <- subset(eg2011, ID != bg2011missingFromBeg)
#attempt 2
eg2011cleaned <- eg2011[!eg2011$ID %in% bg2011missingFromBeg]

The first try just eliminates the first value in the resulting setdiff vector. The second try yields an unwieldy error:

Error in `[.data.frame`(eg2012, !eg2012$ID %in% bg2012missingFromBeg) 
:  undefined columns selected

Recommended answer

This will give you what you want:

eg2011cleaned <- eg2011[!eg2011$ID %in% bg2011missingFromBeg, ]

The error in your second attempt is because you forgot the comma (,) after the row index.
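As an aside, the first attempt most likely misbehaves for a different reason: != compares two vectors element by element and recycles the shorter one, whereas %in% tests each ID for membership in the whole vector. A minimal sketch on made-up data (eg, missing_ids, keep_ne and keep_in are hypothetical names, not the asker's objects):

```r
# Hypothetical toy data, not the asker's real data sets
eg <- data.frame(ID = c(1, 2, 3, 4), value = c(10, 20, 30, 40))
missing_ids <- c(2, 4)  # IDs that should be dropped

# != works position by position; missing_ids is recycled as 2, 4, 2, 4,
# so a row is dropped only where its ID lines up with the recycled value
keep_ne <- eg$ID != missing_ids

# %in% checks every ID against the full vector of IDs to drop
keep_in <- !eg$ID %in% missing_ids

eg[keep_ne, ]  # keeps IDs 1, 2, 3: ID 2 slips through
eg[keep_in, ]  # keeps IDs 1, 3: both unwanted IDs are gone
```

With real data the exact rows that slip through depend on how the IDs happen to align, so the != version can look almost right while still being wrong.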

In general, a single index, object[index], subsets the columns of a data frame, because a data frame is stored as a list of columns. (For a matrix, a single index selects individual elements instead.) If you want to subset rows and keep all columns, you have to use the form object[index_rows, index_columns]; index_columns can be left blank, in which case all columns are used.

However, you still need to include the , to indicate that you want to get a subset of rows instead of a subset of columns.
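The difference between the two forms can be seen on a small made-up data frame. This is only a sketch: eg, keep, rows_kept and result are hypothetical names, and tryCatch is used here just to capture the error message instead of stopping the script.

```r
# Hypothetical toy data standing in for eg2011
eg <- data.frame(ID = c(1, 2, 3), value = c(10, 20, 30))
keep <- !eg$ID %in% c(2)  # TRUE FALSE TRUE

# With the comma, the logical vector selects rows and all columns are kept
rows_kept <- eg[keep, ]
nrow(rows_kept)  # 2
ncol(rows_kept)  # 2

# Without the comma, the logical vector is applied to the columns.
# It has 3 entries but the data frame has only 2 columns, so R raises
# an error, which is captured here as a character string.
result <- tryCatch(eg[keep], error = function(e) conditionMessage(e))
result
```

The row index usually has as many entries as the data frame has rows, which is typically more than the number of columns; that is why dropping the comma tends to end in an error about undefined columns rather than a silently wrong result.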
