Selecting top N rows for each group based on value in column

Views: 101 | Published: 2020/5/4 4:39:27 | Tags: r, loops, dataframe, dplyr, top-n

Question

I have the following data frame:

x<-c(3,2,1,8,7,11,10,9,7,5,4)
y<-c("a","a","a", "b","b","c","c","c","c","c","c")
z<-c(2,2,2,1,1,3,3,3,3,3,3)
df<-data.frame(x,y,z)

df
    x y z
1   3 a 2
2   2 a 2
3   1 a 2
4   8 b 1
5   7 b 1
6  11 c 3
7  10 c 3
8   9 c 3
9   7 c 3
10  5 c 3
11  4 c 3

I want to select the top n rows for each group defined by column y, where n is given in column z. For this data, that means the top 2 rows of group a, the top 1 row of group b, and the top 3 rows of group c.

Recommended answer

A base R solution:

# df is split by y; within each group the rows are sorted by x in decreasing
# order and the first z rows are kept, then everything is rbind-ed back together:
do.call(rbind, 
        lapply(split(df, df$y), 
               function(df1) df1[order(df1$x, decreasing=TRUE), ][1:unique(df1$z), ]))
#     x y z
#a.1  3 a 2
#a.2  2 a 2
#b    8 b 1
#c.6 11 c 3
#c.7 10 c 3
#c.8  9 c 3
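
One caveat with `1:unique(df1$z)`: if z is larger than the group size, the index runs past the last row and produces NA rows, and if z is 0, `1:0` counts down and still selects row 1. Below is a minimal sketch of a variant that uses `head()`, which caps at the rows actually available. The helper name `top_rows` is made up for illustration, and the sketch assumes z is constant within each group, as in the example data.

```r
# Example data from the question
x <- c(3, 2, 1, 8, 7, 11, 10, 9, 7, 5, 4)
y <- c("a", "a", "a", "b", "b", "c", "c", "c", "c", "c", "c")
z <- c(2, 2, 2, 1, 1, 3, 3, 3, 3, 3, 3)
df <- data.frame(x, y, z)

# Hypothetical helper: sort one group by x (largest first) and keep
# at most z rows; head() never returns more rows than the group has.
top_rows <- function(d) {
  head(d[order(d$x, decreasing = TRUE), ], d$z[1])
}

result <- do.call(rbind, lapply(split(df, df$y), top_rows))
print(result)
```

For the example data this selects the same six rows as the solution above.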

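A much more direct way (still in base R), provided in a comment by @mt1022. It keeps the first z rows of each group in their existing order, so it relies on df already being sorted by x in decreasing order within each group, as it is here: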

df[ave(1:nrow(df), df$y, FUN = seq_along) <= df$z, ]
#   x y z
#1  3 a 2
#2  2 a 2
#4  8 b 1
#6 11 c 3
#7 10 c 3
#8  9 c 3
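
The `ave()` one-liner numbers the rows within each group in the order they appear, so it does not sort anything itself. Below is a small sketch of how it might be applied when the rows arrive in arbitrary order: sort by y and by decreasing x first, then filter. The names `shuffled`, `sorted`, and `top` are illustrative only.

```r
# Example data from the question
x <- c(3, 2, 1, 8, 7, 11, 10, 9, 7, 5, 4)
y <- c("a", "a", "a", "b", "b", "c", "c", "c", "c", "c", "c")
z <- c(2, 2, 2, 1, 1, 3, 3, 3, 3, 3, 3)
df <- data.frame(x, y, z)

# Scramble the row order to imitate unsorted input
shuffled <- df[c(3, 1, 2, 5, 4, 11, 9, 6, 8, 10, 7), ]

# Sort by group, then by x in decreasing order within each group
sorted <- shuffled[order(shuffled$y, -shuffled$x), ]

# Within-group row counter; keep the rows whose counter is at most z
rank_in_group <- ave(seq_len(nrow(sorted)), sorted$y, FUN = seq_along)
top <- sorted[rank_in_group <= sorted$z, ]
print(top)
```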
