Subset data based on Minimum Value

Views: 124 Posted: 2017/7/13 22:15:58 Tags: r subset dplyr plyr

Problem description

This might be an easy one. Here's the data:

dat <- read.table(header = TRUE, text = "
Seg     ID    Distance
Seg46   V21   160.37672
Seg72   V85   191.24400
Seg373  V85   167.38930
Seg159  V147   14.74852
Seg233  V171  193.01636
Seg234  V171  200.21458
")
dat
Seg  ID  Distance
Seg46      V21 160.37672
Seg72      V85 191.24400
Seg373      V85 167.38930
Seg159     V147  14.74852
Seg233     V171 193.01636
Seg234     V171 200.21458

I want to get a table like the following, which keeps, for each ID, the Seg with the minimum Distance (some ID values are duplicated).

Seg  ID  Distance
Seg46      V21 160.37672
Seg373      V85 167.38930
Seg159     V147  14.74852
Seg233     V171 193.01636

I tried to use ddply to solve it, but it does not get me there:

ddply(dat, "Seg", summarize, min = min(Distance))
Seg       min
Seg159  14.74852
Seg233 193.01636
Seg234 200.21458
Seg373 167.38930
Seg46 160.37672
Seg72 191.24400

Solution

We can subset the rows with which.min. After grouping by 'ID', we slice each group at the position of its minimum 'Distance', which keeps one row per ID.

library(dplyr)
dat %>% 
   group_by(ID) %>% 
   slice(which.min(Distance))
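
For anyone who wants to run this end to end, here is a minimal self-contained sketch using the sample data from the question. The names `res` and `res2` are only illustrative, and the `slice_min()` variant assumes dplyr 1.0.0 or later; treat it as one possible alternative rather than the approach given in the answer.

```r
library(dplyr)

# Sample data from the question; strings kept as character for clarity
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Seg     ID    Distance
Seg46   V21   160.37672
Seg72   V85   191.24400
Seg373  V85   167.38930
Seg159  V147   14.74852
Seg233  V171  193.01636
Seg234  V171  200.21458
")

# One row per ID: the row at the position of the smallest Distance.
# which.min() returns only the first minimum, so a tie still yields one row.
res <- dat %>%
  group_by(ID) %>%
  slice(which.min(Distance)) %>%
  ungroup()

# Possible alternative (assumes dplyr >= 1.0.0)
res2 <- dat %>%
  group_by(ID) %>%
  slice_min(Distance, n = 1, with_ties = FALSE) %>%
  ungroup()

print(res)
```

Calling `ungroup()` at the end is optional; it just returns a plain tibble so later operations are not applied per ID by accident.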

A similar option using data.table would be

library(data.table)
setDT(dat)[, .SD[which.min(Distance)], by = ID]
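
As for the ddply attempt in the question: it groups by "Seg", and every Seg value is unique, so each row forms its own group and all six rows come back. A minimal sketch of a plyr version that groups by ID instead, plus a base R equivalent, might look like the following. The names `res_plyr`, `sorted` and `res_base` are illustrative, and the data is rebuilt here because `setDT()` converts `dat` to a data.table by reference.

```r
library(plyr)

# Sample data from the question, rebuilt as a plain data frame
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Seg     ID    Distance
Seg46   V21   160.37672
Seg72   V85   191.24400
Seg373  V85   167.38930
Seg159  V147   14.74852
Seg233  V171  193.01636
Seg234  V171  200.21458
")

# Group by ID (not Seg) and return the whole row with the smallest Distance
res_plyr <- ddply(dat, "ID", function(d) d[which.min(d$Distance), ])

# Base R equivalent: sort by Distance, then keep the first row seen per ID
sorted <- dat[order(dat$Distance), ]
res_base <- sorted[!duplicated(sorted$ID), ]

print(res_plyr)
print(res_base)
```

Both results contain the same four rows, though the row order differs: ddply orders by the grouping variable, while the base R version is ordered by Distance.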
