Remove duplicate column combinations from a data frame in R

Question

I want to remove duplicate combinations of sessionid, qf, and qn from the following data:

               sessionid             qf        qn         city
1  9cf571c8faa67cad2aa9ff41f3a26e38     cat   biddix          fresno
2  e30f853d4e54604fd62858badb68113a   caleb     amos                
3  2ad41134cc285bcc06892fd68a471cd7  daniel  folkers                
4  2ad41134cc285bcc06892fd68a471cd7  daniel  folkers                
5  63a5e839510a647c1ff3b8aed684c2a5 charles   pierce           flint
6  691df47f2df12f14f000f9a17d1cc40e       j    franz prescott+valley
7  691df47f2df12f14f000f9a17d1cc40e       j    franz prescott+valley
8  b3a1476aa37ae4b799495256324a8d3d  carrie mascorro            brea
9  bd9f1404b313415e7e7b8769376d2705    fred  morales       las+vegas
10 b50a610292803dc302f24ae507ea853a  aurora      lee                
11 fb74940e6feb0dc61a1b4d09fcbbcb37  andrew    price       yorkville 

I read the data in as a data.frame and call it mydata. Here is the code I have so far, but I need to know how to do three things: first, sort the data.frame correctly; second, remove the duplicate combinations of sessionid, qf, and qn; and last, plot a histogram of the number of characters in the column qf.

sortDATA <- function(name)
{
  # sort the code by session Id, first name, then last name
  sort1.name <- name[order("sessionid", "qf", "qn"), ]
  # create a vector of length of first names
  sname <- nchar(sort1.name$qf)
  hist(sname)
}

Thanks!

Recommended answer

duplicated() has a method for data.frames, which is designed for just this sort of task:

df <- data.frame(a = c(1:4, 1:4), 
                 b = c(4:1, 4:1), 
                 d = LETTERS[1:8])

df[!duplicated(df[c("a", "b")]),]
#   a b d
# 1 1 4 A
# 2 2 3 B
# 3 3 2 C
# 4 4 1 D
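
Applied to the question's data, the same idea might look like the sketch below. It is not part of the original answer. The small mydata built here is a stand-in that copies a few rows from the question, and the names sorted, key_cols, deduped, and first_name_lengths are made up for illustration. Note that order() is given the column vectors themselves; passing the column names as strings, as in the question's code, orders those strings and not the rows.

```r
# Stand-in for the asker's mydata: a few rows copied from the question (assumption).
mydata <- data.frame(
  sessionid = c("9cf571c8faa67cad2aa9ff41f3a26e38",
                "2ad41134cc285bcc06892fd68a471cd7",
                "2ad41134cc285bcc06892fd68a471cd7",
                "691df47f2df12f14f000f9a17d1cc40e",
                "691df47f2df12f14f000f9a17d1cc40e"),
  qf   = c("cat", "daniel", "daniel", "j", "j"),
  qn   = c("biddix", "folkers", "folkers", "franz", "franz"),
  city = c("fresno", "", "", "prescott+valley", "prescott+valley"),
  stringsAsFactors = FALSE
)

# Sort by session id, then first name, then last name.
sorted <- mydata[order(mydata$sessionid, mydata$qf, mydata$qn), ]

# Keep only the first row of each (sessionid, qf, qn) combination.
key_cols <- c("sessionid", "qf", "qn")
deduped <- sorted[!duplicated(sorted[key_cols]), ]

# Number of characters in each first name, then a histogram of those lengths.
first_name_lengths <- nchar(deduped$qf)
hist(first_name_lengths, main = "Length of qf", xlab = "Characters")
```

Because duplicated() only looks at the columns in key_cols, rows that share those three values but differ in city would still be collapsed to the first one encountered, which is why sorting before deduplicating can matter.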
