Finding ALL duplicate rows, including "elements with smaller subscripts"


Question

R's duplicated returns a logical vector showing whether each element of a vector (or each row of a data frame) is a duplicate of an element with a smaller subscript. So if rows 3, 4, and 5 of a 5-row data frame are the same, duplicated will give me the vector

FALSE, FALSE, FALSE, TRUE, TRUE

But in this case I actually want to get

FALSE, FALSE, TRUE, TRUE, TRUE

That is, I also want to know whether a row is duplicated by a row with a larger subscript.

Recommended answer

duplicated has a fromLast argument. The "Examples" section of ?duplicated shows you how to use it. Just call duplicated twice, once with fromLast=FALSE and once with fromLast=TRUE, and take the rows where either result is TRUE.

A late edit: you didn't provide a reproducible example, so here's an illustration kindly contributed by @jbaums:

vec <- c("a", "b", "c","c","c") 
vec[duplicated(vec) | duplicated(vec, fromLast=TRUE)]
## [1] "c" "c" "c"


And an example for the case of a data frame:

df <- data.frame(rbind(c("a","a"),c("b","b"),c("c","c"),c("c","c")))
df[duplicated(df) | duplicated(df, fromLast=TRUE), ]
##   X1 X2
## 3  c  c
## 4  c  c
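
The question asked for the logical vector itself rather than the matching rows. As a minimal sketch of that, the two calls can be wrapped in a small helper. The name all_duplicated and the data frame df5 are hypothetical and introduced only for illustration; they are not part of base R or of the original answer.

```r
# Hypothetical helper (not in base R): TRUE for every member of a
# duplicate group, whichever direction the duplicate lies in.
all_duplicated <- function(x) {
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

# Illustrative five-row data frame in which rows 3, 4 and 5 are identical.
df5 <- data.frame(X1 = c("a", "b", "c", "c", "c"),
                  X2 = c("a", "b", "c", "c", "c"))

duplicated(df5)                   # marks rows 4 and 5 only
duplicated(df5, fromLast = TRUE)  # marks rows 3 and 4 only
all_duplicated(df5)               # the element-wise OR of the two
## [1] FALSE FALSE  TRUE  TRUE  TRUE
```

The forward pass misses the first member of each duplicate group and the backward pass misses the last, so combining the two with | covers every member.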
