Remove duplicated 2 columns permutations
Question
I can't find a good title for this question, so please feel free to edit it.
I have this data.frame:
section time to from
1 a 9 1 2
2 a 9 2 1
3 a 12 2 3
4 a 12 2 4
5 a 12 3 2
6 a 12 3 4
7 a 12 4 2
8 a 12 4 3
I want to remove duplicated rows that have the same to and from simultaneously, ignoring the order of the two columns: e.g. (1,2) and (2,1) count as duplicates.
So the final output is:
section time to from
1 a 9 1 2
3 a 12 2 3
4 a 12 2 4
6 a 12 3 4
I have a solution that constructs a new key column, e.g.
key <- paste(pmin(to, from), pmax(to, from))
and removes the duplicated keys using duplicated, but I think this is a dirty solution.
Here is my data input:
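For reference, the key-based approach described above could be sketched as follows. This is only an illustration: the data.frame name s and the result name res_key are assumptions, and the sketch uses the element-wise pmin and pmax so that each row gets its own key.

```r
# Sample data from the question; the name s is an assumption.
s <- data.frame(
  section = "a",
  time = c(9L, 9L, 12L, 12L, 12L, 12L, 12L, 12L),
  to   = c(1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L),
  from = c(2L, 1L, 3L, 4L, 2L, 4L, 2L, 3L)
)

# pmin/pmax work element-wise, so (1,2) and (2,1) both give the key "1 2".
key <- paste(pmin(s$to, s$from), pmax(s$to, s$from))

# Keep only the first row seen for each key.
res_key <- s[!duplicated(key), ]
res_key
```

The key is never stored in the data.frame, so no extra column has to be removed afterwards.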
structure(list(section = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), .Label = "a", class = "factor"), time = c(9L, 9L, 12L,
12L, 12L, 12L, 12L, 12L), to = c(1L, 2L, 2L, 2L, 3L, 3L, 4L,
4L), from = c(2L, 1L, 3L, 4L, 2L, 4L, 2L, 3L)), .Names = c("section",
"time", "to", "from"), row.names = c(NA, -8L), class = "data.frame")
Recommended answer
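# s is the data.frame from the question
mn <- pmin(s$to, s$from)                # smaller value of each (to, from) pair
mx <- pmax(s$to, s$from)                # larger value of each (to, from) pair
int <- as.numeric(interaction(mn, mx))  # one numeric code per unordered pair
s[match(unique(int), int),]             # first row for each distinct code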
section time to from
1 a 9 1 2
3 a 12 2 3
4 a 12 2 4
6 a 12 3 4
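A possible variant, not part of the original answer, is to skip both the pasted key and interaction and call duplicated directly on a two-column matrix of the ordered pairs, since duplicated compares matrix rows by default. The names s, ends, res_dup and res_int below are assumptions made for this sketch.

```r
# Sample data from the question; the name s is an assumption.
s <- data.frame(
  section = "a",
  time = c(9L, 9L, 12L, 12L, 12L, 12L, 12L, 12L),
  to   = c(1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L),
  from = c(2L, 1L, 3L, 4L, 2L, 4L, 2L, 3L)
)

# Put the smaller value first in every row, so (2,1) becomes (1,2).
ends <- cbind(pmin(s$to, s$from), pmax(s$to, s$from))

# duplicated() on a matrix flags rows that repeat an earlier row.
res_dup <- s[!duplicated(ends), ]

# The interaction/match approach from the answer, for comparison.
int <- as.numeric(interaction(ends[, 1], ends[, 2]))
res_int <- s[match(unique(int), int), ]

res_dup
```

Both approaches keep the first occurrence of each unordered pair, so they should select the same rows here.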
Credit for the idea goes to this question: Remove consecutive duplicates from dataframe, and specifically to @MatthewPlourde's answer.