Remove any rows with duplicates

Question

Suppose I have a data frame (let's call it df) that looks like the one below. I am trying to remove ALL duplicates from this data frame based on a given column (df$car).

options(stringsAsFactors = FALSE)
car <- c('car1', 'car2', 'car2', 'car3', 'car4', 'car4', 'car4', 'car5', 'car6', 'car6')
location <- c(111,345,345,123,678,678,678,432,232,232)
value <- c(1,1,1,1,2,2,2,2,4,4)
a <- c('AT','ATC','TAT','C','TT','TGGGG','GGC','CC','AA','AT')
b <- c('A', 'TAG','TAG','G','AA','AA','AA','GG','TT','TT')

df <- data.frame(car,location,value,a,b)


> df
    car    location value   a    b
 1  car1      111     1    AT    A
 2  car2      345     1   ATC  TAG
 3  car2      345     1   TAT  TAG
 4  car3      123     1     C    G
 5  car4      678     2    TT   AA
 6  car4      678     2 TGGGG   AA
 7  car4      678     2   GGC   AA
 8  car5      432     2    CC   GG
 9  car6      232     4    AA   TT
 10 car6      232     4    AT   TT

My desired output is shown below. I want to remove ALL rows whose car value is duplicated, rather than keep one unique row per value.

    car    location value   a    b
 1  car1      111     1    AT    A
 4  car3      123     1     C    G
 8  car5      432     2    CC   GG

Please note: I believe this question differs from others posted in the past. Most of them ask for the unique rows based on a given column, whereas I want even those rows removed. If this is a duplicate post, I'm happy to close it; I just haven't found what I'm looking for yet. Thanks for your help!

Recommended answer

You could try the following:

  df[!(duplicated(df$car) | duplicated(df$car, fromLast = TRUE)), ]
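To see why the one-liner works, it may help to split it into steps. The sketch below rebuilds the data from the question and shows the two duplicated() passes separately. The intermediate names (dup_forward, dup_backward, result, counts, result_alt) are introduced here for illustration only, and the ave() variant is an alternative approach, not part of the original answer.

```r
# Rebuild the example data from the question.
df <- data.frame(
  car      = c('car1', 'car2', 'car2', 'car3', 'car4', 'car4', 'car4',
               'car5', 'car6', 'car6'),
  location = c(111, 345, 345, 123, 678, 678, 678, 432, 232, 232),
  value    = c(1, 1, 1, 1, 2, 2, 2, 2, 4, 4),
  a        = c('AT', 'ATC', 'TAT', 'C', 'TT', 'TGGGG', 'GGC', 'CC', 'AA', 'AT'),
  b        = c('A', 'TAG', 'TAG', 'G', 'AA', 'AA', 'AA', 'GG', 'TT', 'TT'),
  stringsAsFactors = FALSE
)

# duplicated() flags every occurrence of a value except the first one.
dup_forward <- duplicated(df$car)

# With fromLast = TRUE it scans from the end, so it flags every
# occurrence except the last one.
dup_backward <- duplicated(df$car, fromLast = TRUE)

# A row belongs to a repeated group if either pass flagged it;
# negating the combined flag keeps only values that occur exactly once.
result <- df[!(dup_forward | dup_backward), ]

# Alternative sketch (an assumption, not from the answer): count the
# rows per car with ave() and keep the groups of size 1.
counts <- ave(seq_along(df$car), df$car, FUN = length)
result_alt <- df[counts == 1, ]

print(result)
```

Both versions keep the original row names, so the surviving rows are still labelled 1, 4 and 8, as in the desired output.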
