Remove consecutive duplicates from dataframe
Question
I have a data frame from which I want to remove rows that are consecutive duplicates, using base R only. I know rle may be helpful here, but I can't think of how to use it. The example output should clarify what I'm asking for.
Code to generate the sample data:
set.seed(12)
samps <- sample(1:5, 20, replace = TRUE)
dat <- data.frame(v1=LETTERS[samps], v2=month.abb[samps])
dat[10, 2] <- "Mar"
Sample data:
v1 v2
1 A Jan
2 E May
3 E May
4 B Feb
5 A Jan
6 A Jan
7 A Jan
8 D Apr
9 A Jan
10 A Mar
11 B Feb
12 E May
13 B Feb
14 B Feb
15 B Feb
16 C Mar
17 C Mar
18 C Mar
19 D Apr
20 A Jan
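The draws that sample() returns for a given seed changed in R 3.6.0, and data.frame() stopped converting strings to factors by default in R 4.0.0, so the generating code may not reproduce this exact table on a newer R. A minimal sketch that types the table in directly, assuming factor columns are wanted:

```r
# Rebuild the sample data from the printed table instead of relying on set.seed().
# stringsAsFactors = TRUE is an assumption made here so the columns are factors.
dat <- read.table(header = TRUE, stringsAsFactors = TRUE, text = "
v1 v2
A Jan
E May
E May
B Feb
A Jan
A Jan
A Jan
D Apr
A Jan
A Mar
B Feb
E May
B Feb
B Feb
B Feb
C Mar
C Mar
C Mar
D Apr
A Jan
")
```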
Desired outcome:
v1 v2
1 A Jan
3 E May
4 B Feb
7 A Jan
8 D Apr
10 A Mar
11 B Feb
12 E May
15 B Feb
18 C Mar
19 D Apr
20 A Jan
Recommended answer
Here's a way, not with rle, but a way nonetheless:
dat[with(dat, c(TRUE, diff(as.numeric(interaction(v1, v2))) != 0)), ]
This assumes you're using factor columns, as your sample data implies.
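Since the question asks about rle, here is a sketch of how it could be used for the same task. It is not part of the answer above: the row key built with paste() and the names key, runs, first_idx and last_idx are illustrative choices, and the separator assumes no value contains a carriage return. The one-liner above keeps the first row of each run, while the row numbers in the desired outcome (3, 7, 15, 18) look like the last row of each run, so both variants are shown.

```r
# Sample data typed in from the question's table, with factor columns.
samps <- c(1, 5, 5, 2, 1, 1, 1, 4, 1, 1, 2, 5, 2, 2, 2, 3, 3, 3, 4, 1)
dat <- data.frame(v1 = LETTERS[samps], v2 = month.abb[samps],
                  stringsAsFactors = TRUE)
dat[10, 2] <- "Mar"

# One string per row; two rows are duplicates exactly when their keys match.
# Assumption: "\r" never occurs inside a value, so keys cannot collide.
key <- do.call(paste, c(dat, sep = "\r"))

# rle() collapses the keys into runs of identical consecutive values.
runs <- rle(key)
last_idx  <- cumsum(runs$lengths)          # position of the last row in each run
first_idx <- last_idx - runs$lengths + 1   # position of the first row in each run

keep_first <- dat[first_idx, ]  # first row of every run
keep_last  <- dat[last_idx, ]   # last row of every run
```

Note that rows 9 (A Jan) and 10 (A Mar) differ in v2, so they form separate runs and both are kept by either variant.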