Find duplicates, compare a condition, erase one row in R

Views: 120 | Published: 2020/8/1 19:56:01 | Tags: r, if-statement, duplicates


Problem description

Using the following reproducible example:

ID1 <- c("a1", "a4", "a6", "a6", "a5", "a1")
ID2 <- c("b8", "b99", "b5", "b5", "b2", "b8")
Value1 <- c(2, 5, 6, 6, 2, 7)
Value2 <- c(23, 51, 63, 64, 23, 23)
Year <- c(2004, 2004, 2004, 2004, 2005, 2004)
df <- data.frame(ID1, ID2, Value1, Value2, Year)

I want to select the rows that share the same values of ID1, ID2 and Year. For these duplicate rows, I want to compare Value1 and Value2 and, if the values are not the same, erase the row with the smaller value.

Expected result:

  ID1 ID2 Value1 Value2 Year         new
2  a4 b99      5     51 2004 a4_b99_2004
4  a6  b5      6     64 2004  a6_b5_2004
5  a5  b2      2     23 2005  a5_b2_2005
6  a1  b8      7     23 2004  a1_b8_2004

I tried the following. First, I build a unique identifier from the columns I am interested in:

df$new<-paste(df$ID1,df$ID2, df$Year, sep="_")

I can use the unique identifier to find the rows of the data frame that contain duplicates:

IND<-which(duplicated(df$new) | duplicated(df$new, fromLast = TRUE))
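
As a side note, the pasted key is probably not required for this step. The sketch below assumes the same example data and relies on `duplicated()` accepting a data frame, so the three key columns can be compared directly; the name `IND2` is only illustrative.

```r
# Rebuild the example data so the snippet runs on its own.
ID1 <- c("a1", "a4", "a6", "a6", "a5", "a1")
ID2 <- c("b8", "b99", "b5", "b5", "b2", "b8")
Value1 <- c(2, 5, 6, 6, 2, 7)
Value2 <- c(23, 51, 63, 64, 23, 23)
Year <- c(2004, 2004, 2004, 2004, 2005, 2004)
df <- data.frame(ID1, ID2, Value1, Value2, Year)

# Compare whole rows of the key columns instead of a pasted string.
keys <- df[c("ID1", "ID2", "Year")]

# Flag every member of a duplicated group, scanning from both ends.
IND2 <- which(duplicated(keys) | duplicated(keys, fromLast = TRUE))
print(IND2)  # rows 1, 3, 4 and 6 belong to duplicated groups
```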

Then, in a for loop, whenever the unique identifier has a duplicate, I try to compare the values and erase the rows with the smaller ones, but the loop has become too complicated and I cannot get it to work.

for (i in df$new) {
  if (sum(df$new == i) > 1) {
    ind <- which(df$new == i)
    m <- min(df$Value1[ind])
    df <- df[-which.min(df$Value1[ind]), ]
    m <- min(df$Value2[ind])
    df <- df[-which.min(df$Value2[ind]), ]
  }
}
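
One likely reason the loop misbehaves is that `which.min(df$Value1[ind])` returns a position inside the subset `ind`, not a row number of `df`, and that `df` shrinks while the loop is still iterating over the old keys. A loop-free alternative is sketched below. It is not taken from an answer to this question, and it rests on an assumption the question does not spell out: within each group exactly one row is kept, namely the one with the largest Value1, with ties broken by the largest Value2. Rows that are identical in both values would be collapsed to a single row.

```r
# Rebuild the example data and the key so the snippet runs on its own.
ID1 <- c("a1", "a4", "a6", "a6", "a5", "a1")
ID2 <- c("b8", "b99", "b5", "b5", "b2", "b8")
Value1 <- c(2, 5, 6, 6, 2, 7)
Value2 <- c(23, 51, 63, 64, 23, 23)
Year <- c(2004, 2004, 2004, 2004, 2005, 2004)
df <- data.frame(ID1, ID2, Value1, Value2, Year)
df$new <- paste(df$ID1, df$ID2, df$Year, sep = "_")

# Order rows by key, then by descending Value1 and descending Value2,
# so the row to keep comes first within each key.
ord <- order(df$new, -df$Value1, -df$Value2)

# Keep the first row of every key; later rows of the same key are dropped.
keep_idx <- ord[!duplicated(df$new[ord])]

# Subset in the original row order.
result <- df[sort(keep_idx), ]
print(result)
```

On the example data this keeps rows 2, 4, 5 and 6, which matches the expected result. If a group could contain one row with the larger Value1 and another with the larger Value2, the tie-breaking rule would need to be decided explicitly.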
