R, conditionally remove duplicate rows
Problem description
I have a dataframe in R containing the columns ID.A, ID.B and DISTANCE, where DISTANCE represents the distance between ID.A and ID.B. For each value (1->n) of ID.A there may be multiple rows, each with a different ID.B and DISTANCE (e.g. ID.A = 3 appears three times below).
I would like to be able to remove rows where ID.A is duplicated, but conditional upon the distance value such that I am left with the smallest distance values for each ID.A record.
Hopefully that makes sense?
Many thanks in advance
EDIT
Hopefully an example will prove more useful than my text. Here I would like to remove the second and third rows where ID.A = 3:
myDF <- read.table(text="ID.A ID.B DISTANCE
1 3 1
2 6 8
3 2 0.4
3 3 1
3 8 5
4 8 7
5 2 11", header = TRUE)
You can also do this easily in base R. If dat is your data frame:
# Split dat into groups by ID.A, sort each group by DISTANCE,
# keep only the first (smallest-distance) row, then recombine
do.call(rbind,
        by(dat, INDICES = list(dat$ID.A),
           FUN = function(x) head(x[order(x$DISTANCE), ], 1)))
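Applying this to the question's myDF (substituting myDF for dat) keeps exactly one row per ID.A value, the one with the minimal DISTANCE, so the second and third ID.A = 3 rows are dropped:

```r
# Example data from the question
myDF <- read.table(text = "ID.A ID.B DISTANCE
1 3 1
2 6 8
3 2 0.4
3 3 1
3 8 5
4 8 7
5 2 11", header = TRUE)

# Keep the minimum-DISTANCE row within each ID.A group
res <- do.call(rbind,
               by(myDF, INDICES = list(myDF$ID.A),
                  FUN = function(x) head(x[order(x$DISTANCE), ], 1)))

res
# One row each for ID.A = 1..5; for ID.A = 3 only the DISTANCE = 0.4 row remains
```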