[Pandas] How to get the top n% of records within each group

Views: 72 | Published: 2020/5/24 2:01:33 | Tags: python, pandas

Question

This is my DataFrame:

import pandas as pd

df = pd.DataFrame([['@1','A',40],['@2','A',60],['@3','A',47],
                   ['@4','B',33],['@5','B',69],['@6','B',22],['@7','B',90],
                   ['@8','C',31],['@9','C',78],['@10','C',12],['@11','C',89],['@12','C',88],['@13','C',99]],
                  columns=['id','channel','score'])

     id channel  score
0    @1       A     40
1    @2       A     60
2    @3       A     47
3    @4       B     33
4    @5       B     69
5    @6       B     22
6    @7       B     90
7    @8       C     31
8    @9       C     78
9   @10       C     12
10  @11       C     89
11  @12       C     88
12  @13       C     99

Each channel has its own number of rows, and I set a percentage of 80%.

For each channel I want to take the int(row count * 0.8) rows with the largest score:

Channel A takes int(3*0.8) = 2
Channel B takes int(4*0.8) = 3
Channel C takes int(6*0.8) = 4

so the expected result is:

     id channel  score
1    @2       A     60
2    @3       A     47
3    @4       B     33
4    @5       B     69
6    @7       B     90
8    @9       C     78
10  @11       C     89
11  @12       C     88
12  @13       C     99

How can I do this? Thanks.

Recommended answer

Use groupby with nlargest:

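a = 0.8  # fraction of each group to keep

# For each channel, keep the int(len(group) * a) rows with the largest score
df1 = (df.groupby('channel', group_keys=False)
         .apply(lambda x: x.nlargest(int(len(x) * a), 'score')))
print(df1)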
     id channel  score
1    @2       A     60
2    @3       A     47
6    @7       B     90
4    @5       B     69
3    @4       B     33
12  @13       C     99
10  @11       C     89
11  @12       C     88
8    @9       C     78
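
As a quick sanity check (a sketch, not part of the original answer), the number of rows kept per channel can be computed on its own. The name `quota` is introduced here only for illustration. Because int() truncates, a group with a single row would keep int(1 * 0.8) = 0 rows at 80%.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame([['@1','A',40],['@2','A',60],['@3','A',47],
                   ['@4','B',33],['@5','B',69],['@6','B',22],['@7','B',90],
                   ['@8','C',31],['@9','C',78],['@10','C',12],
                   ['@11','C',89],['@12','C',88],['@13','C',99]],
                  columns=['id','channel','score'])

a = 0.8

# Rows per channel, then truncate (rows * a) toward zero, like int() does
sizes = df.groupby('channel').size()
quota = (sizes * a).astype(int)  # hypothetical helper name

print(quota)
```

If rounding up or to the nearest integer is preferred over truncation, the astype(int) step is the place to change it.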

Another solution using sort_values + groupby + head:

# Sort by score once, then take the first int(len(group) * a) rows of each channel
df1 = (df.sort_values('score', ascending=False)
         .groupby('channel', group_keys=False)
         .apply(lambda x: x.head(int(len(x) * a)))
         .reset_index(drop=True))

print(df1)
    id channel  score
0   @2       A     60
1   @3       A     47
2   @7       B     90
3   @5       B     69
4   @4       B     33
5  @13       C     99
6  @11       C     89
7  @12       C     88
8   @9       C     78
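
Both solutions above call apply with a Python lambda once per group. A possible alternative that avoids apply is sketched below. It is not from the original answer, and the names `ordered`, `rank_in_group`, `quota` and `df2` are made up for this example. It relies on GroupBy.cumcount to number the rows inside each group and on transform('size') to broadcast each group's size to its rows.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame([['@1','A',40],['@2','A',60],['@3','A',47],
                   ['@4','B',33],['@5','B',69],['@6','B',22],['@7','B',90],
                   ['@8','C',31],['@9','C',78],['@10','C',12],
                   ['@11','C',89],['@12','C',88],['@13','C',99]],
                  columns=['id','channel','score'])

a = 0.8

# Order rows by channel, with the highest score first inside each channel
ordered = df.sort_values(['channel', 'score'], ascending=[True, False])

# 0-based position of each row within its channel (0 = highest score)
rank_in_group = ordered.groupby('channel').cumcount()

# Per-row quota: group size * a, truncated toward zero like int()
quota = (ordered.groupby('channel')['score'].transform('size') * a).astype(int)

# Keep only the rows whose position falls inside the quota
df2 = ordered[rank_in_group < quota].reset_index(drop=True)
print(df2)
```

On this data the sketch selects the same rows, in the same order, as the sort_values + groupby + head solution. With tied scores, which of the tied rows is kept depends on the sort order, so the choice may differ from nlargest.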
