Select maximum row value by group


Question

I've been trying to do this with my data by looking at other posts, but I keep getting an error. My data frame, new, looks like this:

id  year    name    gdp
1   1980    Jamie   45
1   1981    Jamie   60
1   1982    Jamie   70
2   1990    Kate    40
2   1991    Kate    25
2   1992    Kate    67
3   1994    Joe     35
3   1995    Joe     78
3   1996    Joe     90

I want to select, for each id, the row with the highest year value. So the desired output is:

id  year    name    gdp
1   1982    Jamie   70
2   1992    Kate    67
3   1996    Joe     90

Following the post "Selecting Rows which contain daily max value in R", I tried the following, but it did not work:

ddply(new,~id,function(x){x[which.max(new$year),]})

I also tried:

tapply(new$year, new$id, max)

But this didn't give me the desired output.

Any suggestions would be helpful!

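Recommended answer

Just use split to break the data frame up by id, take the row with the largest year from each piece, and bind the pieces back together (df here stands for your data frame, which is new in the question):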

df <- do.call(rbind, lapply(split(df, df$id),
  function(subdf) subdf[which.max(subdf$year)[1], ]))
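
The important difference from the ddply attempt in the question is that which.max is applied to each piece (subdf$year) rather than to the whole data frame (new$year). The sketch below illustrates that difference on a small made-up data frame; the names toy, pieces, bad and good are hypothetical and exist only for this illustration.

```r
# Hypothetical toy data: two groups of two rows each
toy <- data.frame(id = c(1, 1, 2, 2), year = c(2000, 2001, 2010, 2011))
pieces <- split(toy, toy$id)

# Index computed on the whole data frame: position 4,
# which does not exist inside any 2-row piece, so every result is a row of NAs
whole_index <- which.max(toy$year)
bad <- lapply(pieces, function(x) x[which.max(toy$year), ])

# Index computed inside each piece: position 2 in both groups
good <- lapply(pieces, function(x) x[which.max(x$year), ])

print(bad[[1]])
print(good[[1]])
```

Since which.max already returns the position of the first maximum, the trailing [1] in the answer is a harmless safeguard rather than a requirement.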

For example, with randomly generated data:

df <- data.frame(id = rep(1:10, each = 3), year = round(runif(30,0,10)) + 1980, gdp = round(runif(30, 40, 70)))
print(head(df))
#   id year gdp
# 1  1 1990  49
# 2  1 1981  47
# 3  1 1987  69
# 4  2 1985  57
# 5  2 1989  41
# 6  2 1988  54

df <- do.call(rbind, lapply(split(df, df$id), function(subdf) subdf[which.max(subdf$year)[1], ]))
print(head(df))
#    id year gdp
# 1   1 1990  49
# 2   2 1989  41
# 3   3 1989  55
# 4   4 1988  62
# 5   5 1989  48
# 6   6 1990  41
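
As a further check, the same approach can be applied to the data from the question. This is a minimal sketch: the data frame is rebuilt by hand from the table in the question, and result is a name chosen here for illustration.

```r
# The question's data, rebuilt by hand (named `new` as in the question)
new <- data.frame(
  id   = rep(1:3, each = 3),
  year = c(1980, 1981, 1982, 1990, 1991, 1992, 1994, 1995, 1996),
  name = rep(c("Jamie", "Kate", "Joe"), each = 3),
  gdp  = c(45, 60, 70, 40, 25, 67, 35, 78, 90)
)

# Split by id, keep the row with the largest year in each piece, then stack
result <- do.call(rbind, lapply(split(new, new$id),
  function(subdf) subdf[which.max(subdf$year), ]))
rownames(result) <- NULL  # replace the group labels used as row names with 1..n
print(result)
#   id year  name gdp
# 1  1 1982 Jamie  70
# 2  2 1992  Kate  67
# 3  3 1996   Joe  90
```

This matches the desired output in the question: one row per id, each carrying the name and gdp that belong to the latest year.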
