Return rows with max/min value of column, by group, using plyr::ddply

Problem description

I found an answer (now deleted) to this question, and I'm curious why it doesn't work.

The question is: return the row corresponding to the minimum value, by group.

For example, given the dataset:

df <- data.frame(State = c(rep('AK',4),rep('RI',4)),
                   Company = LETTERS[1:8],
                   Employees = c(82L, 104L, 37L, 24L, 19L, 118L, 88L, 42L)) 

...the correct answer is:

    State Company Employees
 1:    AK       D        24
 2:    RI       E        19

This can be obtained with, for example:

library(data.table); setDT(df)[ , .SD[which.min(Employees)], by = State]

My question is why this plyr::ddply command doesn't work:

library(plyr)
ddply(df, .(State), summarise, Employees=min(Employees), 
      Company=Company[which.min(Employees)])
# returns:
#   State Employees Company
# 1    AK        24       A
# 2    RI        19       E

In other words, why is which.min(Employees) returning 1 for each group, instead of c(4,1)? Note that outside of ddply, this works:

summarise(df, minEmp = min(Employees), whichMin = which.min(Employees))
#   minEmp whichMin
# 1     19        5

I don't use plyr much, but I'd like to know the right way to do it, if there's a reasonable one.

Recommended answer

I'm getting the correct answer with the following. Not sure about your case.

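library(plyr)
# Apply an anonymous function to each State's subset and keep the row
# whose Employees value is smallest.
ddply(df, .(State), function(x) x[which.min(x$Employees), ])
#   State Company Employees
# 1    AK       D        24
# 2    RI       E        19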
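
As for why the summarise version in the question returns index 1 for every group, a likely explanation is that plyr's summarise evaluates its arguments one after another, and later expressions can see columns created by earlier ones. Under that assumption, Employees=min(Employees) replaces the per-group Employees vector with a single value, so the following which.min(Employees) operates on a length-one vector and always returns 1. This would also explain why the call outside ddply works: there the minimum is stored under a new name (minEmp), so Employees is left intact. A minimal sketch of two workarounds, where the names res_reordered and res_renamed are only illustrative:

```r
library(plyr)

df <- data.frame(State = c(rep("AK", 4), rep("RI", 4)),
                 Company = LETTERS[1:8],
                 Employees = c(82L, 104L, 37L, 24L, 19L, 118L, 88L, 42L))

# Workaround 1: compute Company first, while Employees is still the full
# per-group vector, and only then collapse Employees to its minimum.
res_reordered <- ddply(df, .(State), summarise,
                       Company = Company[which.min(Employees)],
                       Employees = min(Employees))
res_reordered
#   State Company Employees
# 1    AK       D        24
# 2    RI       E        19

# Workaround 2: store the minimum under a new name (minEmp, an illustrative
# choice) so the original Employees column is not overwritten.
res_renamed <- ddply(df, .(State), summarise,
                     minEmp = min(Employees),
                     Company = Company[which.min(Employees)])
```

For the maximum instead of the minimum, which.max and max can be substituted for which.min and min in either form.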
