R: how to aggregate by a real-valued column with a given error tolerance


Problem description

Assuming I have a data frame:

t <- data.frame(d1=c( 694, 695, 696, 2243, 2244, 2651, 2652 ),
                d2=c(1.80950881, 1.80951007, 1.80951052, 1.46499982, 1.46500087, 1.14381419, 1.14381319 ))

    d1       d2
1  694 1.809509
2  695 1.809510
3  696 1.809511
4 2243 1.465000
5 2244 1.465001
6 2651 1.143814
7 2652 1.143813

I'd like to group the rows by column d2, treating real values that are very close but not exactly equal as belonging to the same group. In this example, after aggregation, I'd like to obtain the following data set:

    d1       d2
1  694 1.809509
2 2243 1.465000
3 2652 1.143813

taking the row with the minimum d2 value from each group.

Using the aggregate function, my first attempt:

aggregate(t, by=list(t$d2), FUN=min)
   Group.1   d1       d2
1 1.143813 2652 1.143813
2 1.143814 2651 1.143814
3 1.465000 2243 1.465000
4 1.465001 2244 1.465001
5 1.809509  694 1.809509
6 1.809510  695 1.809510
7 1.809511  696 1.809511

is far from reaching my goal.

How can I tell aggregate to group not by exact equality, but by equality with provided error tolerance?

Solution

Here is an approach with tidyverse that groups the rows by d2 rounded to one decimal place:

library(tidyverse)
t %>%
  group_by(d2_rounded = round(d2, 1)) %>% # group by d2 rounded to 1 decimal place
  filter(d2 == min(d2)) %>%               # keep the row with the minimum d2 in each group
  ungroup() %>%                           # ungroup so the grouping column can be dropped
  select(-d2_rounded)                     # drop the helper grouping column
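
Rounding is not quite the same thing as an error tolerance: round(d2, 1) puts 1.4499 and 1.4501 in different groups even though they differ by only 0.0002, and it can merge values that are almost 0.1 apart. If an explicit tolerance is needed, one possible alternative (not part of the original answer) is to sort by d2 and start a new group wherever the gap to the previous value exceeds the tolerance. The sketch below uses base R only; the names df, tol, grp and keep are illustrative, and the tolerance of 1e-4 is an assumed value chosen to fit this sample data.

```r
# Sample data from the question (named df here to avoid masking base::t).
df <- data.frame(
  d1 = c(694, 695, 696, 2243, 2244, 2651, 2652),
  d2 = c(1.80950881, 1.80951007, 1.80951052, 1.46499982,
         1.46500087, 1.14381419, 1.14381319)
)

tol <- 1e-4  # assumed tolerance; pick a value that suits your data

# Sort by d2 so that close values become neighbours.
ord <- order(df$d2)

# A new group starts wherever the gap to the previous sorted value exceeds tol.
grp <- cumsum(c(TRUE, diff(df$d2[ord]) > tol))

# In sorted order, the first row of each group holds that group's minimum d2.
keep <- ord[!duplicated(grp)]

# Return the kept rows in their original order, with fresh row numbers.
result <- df[sort(keep), ]
rownames(result) <- NULL
result
```

Note that this groups by chaining neighbours: a long run of values that are each within tol of the next ends up in one group, even if the first and last differ by more than tol. Whether that is acceptable depends on the data.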
