R, conditionally remove duplicate rows

Posted: 2017/7/20 23:35:16 · Tags: r, conditional, duplicates


Question

I have a data frame in R containing the columns ID.A, ID.B and DISTANCE, where DISTANCE is the distance between ID.A and ID.B. For each value of ID.A (1 to n) there may be multiple values of ID.B and DISTANCE; that is, the same ID.A may appear in several rows (e.g. every row with ID.A = 4), each with a different ID.B and DISTANCE.

I would like to remove the rows where ID.A is duplicated, conditional on the distance value, so that I am left with only the row with the smallest DISTANCE for each ID.A.

Hopefully that makes sense?

Many thanks in advance

EDIT

Hopefully an example will prove more useful than my text. Here I would like to remove the second and third rows where ID.A = 3:

myDF <- read.table(text="ID.A ID.B DISTANCE
  1 3 1
  2 6 8
  3 2 0.4
  3 3 1
  3 8 5
  4 8  7
  5 2 11", header = TRUE)

Solution

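You can also do it easily in base R. If dat is your data frame (myDF in the question),

# Split dat by ID.A, keep the smallest-DISTANCE row of each group,
# then bind the one-row pieces back into a single data frame.
do.call(rbind, 
        by(dat, INDICES=list(dat$ID.A), 
           FUN=function(x) head(x[order(x$DISTANCE), ], 1)))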
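
For comparison, here is a minimal sketch of another base R approach, run on the example data from the question. It is not part of the original answer: it sorts by ID.A and DISTANCE and then uses duplicated() to keep the first row of each ID.A. The names sorted and result are only illustrative.

```r
# Example data from the question
myDF <- read.table(text = "ID.A ID.B DISTANCE
1 3 1
2 6 8
3 2 0.4
3 3 1
3 8 5
4 8 7
5 2 11", header = TRUE)

# Sort by ID.A, and by DISTANCE within each ID.A
sorted <- myDF[order(myDF$ID.A, myDF$DISTANCE), ]

# duplicated() flags every occurrence of ID.A after the first,
# so negating it keeps the smallest-DISTANCE row per ID.A
result <- sorted[!duplicated(sorted$ID.A), ]
result
```

On this data, only the row with ID.B = 2 and DISTANCE = 0.4 should remain for ID.A = 3. If two rows of the same ID.A tie on the smallest DISTANCE, this sketch keeps whichever comes first after sorting.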
