Deleting rows from a data frame that are not present in another data frame in R

Question

I'm new to R, and from what I've been reading this one is a bit hard for me. I have two data frames, say DF1 and DF2, both of which have a variable of interest, say idFriends. I want to create a new data frame from DF1 in which every row whose idFriends value does not appear in DF2 has been deleted.

The thing is that in DF2 each value appears only once, while DF1 has thousands of values, many of them repeated. BUT I don't want R to delete the repetitions. I just want it to check, for EACH row of DF1, whether its idFriends value exists in DF2: if it doesn't, delete that row; if it does, leave the row as is.

I hope this is clear.

Recommended answer

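dplyr has a semi_join function that does this. It keeps every row of DF1 that has a match in DF2, without adding columns from DF2 and without removing repeated rows of DF1. anti_join does the opposite and keeps only the rows of DF1 that have no match in DF2.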

library(dplyr)
DF1 %>% semi_join(DF2, by = "idFriends") # keep DF1 rows whose idFriends appears in DF2
DF1 %>% anti_join(DF2, by = "idFriends") # keep DF1 rows whose idFriends does not appear in DF2
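
If you would rather not load dplyr, the same filtering can be sketched in base R with %in%. This is only an illustration: the sample data and the names keep, DF1_matched and DF1_unmatched are made up for the example, and only the column name idFriends comes from the question.

```r
# Hypothetical sample data; only the column name idFriends comes from the question.
DF1 <- data.frame(
  idFriends = c(1, 2, 2, 3, 4, 4),
  value     = c("a", "b", "c", "d", "e", "f")
)
DF2 <- data.frame(idFriends = c(2, 4))

# Logical vector: TRUE where the DF1 id also occurs in DF2.
keep <- DF1$idFriends %in% DF2$idFriends

# Rows of DF1 whose id is present in DF2 (repeated ids are preserved).
DF1_matched <- DF1[keep, , drop = FALSE]

# Rows of DF1 whose id is absent from DF2.
DF1_unmatched <- DF1[!keep, , drop = FALSE]

print(DF1_matched)
```

With this sample data, DF1_matched keeps the four rows with idFriends 2, 2, 4 and 4, so the repeated values in DF1 survive, which is the behaviour asked for in the question.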
