使用dplyr查找重复的元素 [英] Find duplicated elements with dplyr

查看:274
本文介绍了使用dplyr查找重复的元素的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我尝试使用提供的代码此处使用dplyr查找所有重复的元素,例如:

I tried using the code presented here to find ALL duplicated elements with dplyr like this:

library(dplyr)

mtcars %>%
mutate(cyl.dup = cyl[duplicated(cyl) | duplicated(cyl, from.last = TRUE)])

如何转换在这里查找dplyr的所有重复元素?我上面的代码只是抛出错误?甚至更好的是,还有另一个函数比卷积 x [duplicated(x)|

How can I convert code presented here to find ALL duplicated elements with dplyr? My code above just throws an error? Or even better, is there another function that will achieve this more succinctly than the convoluted x[duplicated(x) | duplicated(x, from.last = TRUE)]) approach?

推荐答案

我猜想你可能会重复(x,from.last = TRUE)])方法?为此,请使用过滤器

I guess you could use filter for this purpose:

mtcars %>% 
  group_by(carb) %>% 
  filter(n()>1)

小示例(请注意,我添加了 summarize()以证明结果数据集不包含重复的'carb'行。我使用了'carb'而不是' cyl,因为 carb具有唯一值,而 cyl则没有):

Small example (note that I added summarize() to prove that the resulting data set does not contain rows with duplicate 'carb'. I used 'carb' instead of 'cyl' because 'carb' has unique values whereas 'cyl' does not):

mtcars %>% group_by(carb) %>% summarize(n=n())
#Source: local data frame [6 x 2]
#
#  carb  n
#1    1  7
#2    2 10
#3    3  3
#4    4 10
#5    6  1
#6    8  1

mtcars %>% group_by(carb) %>% filter(n()>1) %>% summarize(n=n())
#Source: local data frame [4 x 2]
#
#  carb  n
#1    1  7
#2    2 10
#3    3  3
#4    4 10

这篇关于使用dplyr查找重复的元素的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆