Subset data based on Minimum Value


Question

This might be an easy one. Here's the data:

dat <- read.table(header = TRUE, text = "
Seg     ID   Distance
Seg46   V21  160.37672
Seg72   V85  191.24400
Seg373  V85  167.38930
Seg159  V147  14.74852
Seg233  V171 193.01636
Seg234  V171 200.21458
")
dat
Seg  ID  Distance
Seg46      V21 160.37672
Seg72      V85 191.24400
Seg373      V85 167.38930
Seg159     V147  14.74852
Seg233     V171 193.01636
Seg234     V171 200.21458

I want a table like the following, which gives the Seg with the minimum Distance for each ID (some IDs appear more than once).

Seg Crash_ID  Distance
Seg46      V21 160.37672
Seg373      V85 167.38930
Seg159     V147  14.74852
Seg233     V171 193.01636

I tried to solve it with ddply, but it does not get me there:

library(plyr)
ddply(dat, "Seg", summarize, min = min(Distance))
Seg       min
Seg159  14.74852
Seg233 193.01636
Seg234 200.21458
Seg373 167.38930
Seg46 160.37672
Seg72 191.24400

Solution

We can subset the rows with which.min, which returns the position of the first minimum in a vector. After grouping by 'ID', we slice each group at the position of its minimum 'Distance'.

library(dplyr)
dat %>% 
   group_by(ID) %>% 
   slice(which.min(Distance))
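
Because which.min returns only the first minimum, a tie within an ID silently resolves to whichever tied row comes first. The following is a minimal sketch, assuming dplyr 1.0.0 or later (where slice_min is available), that makes the tie handling explicit. The name nearest is only illustrative and does not appear in the original answer.

```r
library(dplyr)

# Same sample data as in the question.
dat <- read.table(header = TRUE, text = "
Seg     ID   Distance
Seg46   V21  160.37672
Seg72   V85  191.24400
Seg373  V85  167.38930
Seg159  V147  14.74852
Seg233  V171 193.01636
Seg234  V171 200.21458
")

# Keep one row per ID: the row with the smallest Distance.
# with_ties = FALSE returns a single row even when the minimum is tied;
# setting it to TRUE would keep every tied row instead.
nearest <- dat %>%
  group_by(ID) %>%
  slice_min(Distance, n = 1, with_ties = FALSE) %>%
  ungroup()

print(nearest)
```

For this data no ID has a tied minimum, so both approaches should select the same four rows.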


A similar option using data.table would be:

library(data.table)
setDT(dat)[, .SD[which.min(Distance)], by = ID]
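
For reference, the ddply attempt in the question groups by Seg, and every Seg value in the sample is unique, so each group holds a single row and nothing is filtered out. A possible adaptation, sketched here rather than taken from the original answer, is to group by ID and return the whole minimum row from each group. A base R equivalent is included for comparison. The names nearest_plyr, nearest_base, and pieces are illustrative only.

```r
library(plyr)

# Same sample data as in the question.
dat <- read.table(header = TRUE, text = "
Seg     ID   Distance
Seg46   V21  160.37672
Seg72   V85  191.24400
Seg373  V85  167.38930
Seg159  V147  14.74852
Seg233  V171 193.01636
Seg234  V171 200.21458
")

# Group by ID (not Seg) and return the full row with the smallest Distance.
nearest_plyr <- ddply(dat, "ID", function(d) d[which.min(d$Distance), ])

# Base R equivalent: split by ID, pick the minimum row of each piece,
# then bind the pieces back into one data frame.
pieces <- split(dat, dat$ID)
nearest_base <- do.call(rbind, lapply(pieces, function(d) d[which.min(d$Distance), ]))
rownames(nearest_base) <- NULL

print(nearest_plyr)
print(nearest_base)
```

Both results contain one row per ID, ordered by ID rather than by the original row order.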
