pandas ,按计数分组并将计数添加到原始数据框? [英] Pandas, group by count and add count to original dataframe?

查看:113
本文介绍了 pandas ,按计数分组并将计数添加到原始数据框?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当尝试对数据框中具有相似种类"的行进行计数时:

When trying to count rows with similar 'kind' in data frame:

import pandas as pd

items = [('aaa','aaa text 1'), ('aaa','aaa text 2'), ('aaa','aaa text 3'),
         ('bb', 'bb text 1'), ('bb', 'bb text 2'), ('bb', 'bb text 3'), 
         ('bb', 'bb text 4'),
         ('cccc','cccc text 1'), ('cccc','cccc text 2'),
         ('dd', 'dd text 1'),
         ('e', 'e text 1'),
         ('fff', 'fff text 1'),
        ]

df = pd.DataFrame(items, columns=['kind', 'msg'])
df

    kind    msg
0   aaa     aaa text 1
1   aaa     aaa text 2
2   aaa     aaa text 3
3   bb      bb text 1
4   bb      bb text 2
5   bb      bb text 3
6   bb      bb text 4
7   cccc    cccc text 1
8   cccc    cccc text 2
9   dd      dd text 1
10  e       e text 1
11  fff     fff text 1

此代码有效:

df = df[['kind']].groupby(['kind'])['kind'] \
                         .count() \
                         .reset_index(name='count') \
                         .sort_values(['count'], ascending=False) \
                         .head(5)

df

结果:

    kind      count
    0   aaa   1
    1   bb    1
    2   cccc  1
    3   dd    1
    4   e     1

但是,如何才能像原始一加'count'列那样获得一个包含所有列的数据框?因此,结果应按此顺序具有种类",味精",计数"列?

Yet, how can one get a data frame with all columns as in original one plus 'count' column? So the result should have columns 'kind', 'msg', 'count' in this order?

另外,如何按计数的降序对生成的数据帧进行排序?

Also, how to sort this resulting data frame in descending order of count?

推荐答案

IIUC

In [247]: df['count'] = df.groupby('kind').transform('count')

In [248]: df
Out[248]:
    kind          msg  count
0    aaa   aaa text 1      3
1    aaa   aaa text 2      3
2    aaa   aaa text 3      3
3     bb    bb text 1      4
4     bb    bb text 2      4
5     bb    bb text 3      4
6     bb    bb text 4      4
7   cccc  cccc text 1      2
8   cccc  cccc text 2      2
9     dd    dd text 1      1
10     e     e text 1      1
11   fff   fff text 1      1

排序:

In [249]: df.sort_values('count', ascending=False)
Out[249]:
    kind          msg  count
3     bb    bb text 1      4
4     bb    bb text 2      4
5     bb    bb text 3      4
6     bb    bb text 4      4
0    aaa   aaa text 1      3
1    aaa   aaa text 2      3
2    aaa   aaa text 3      3
7   cccc  cccc text 1      2
8   cccc  cccc text 2      2
9     dd    dd text 1      1
10     e     e text 1      1
11   fff   fff text 1      1

这篇关于 pandas ,按计数分组并将计数添加到原始数据框?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆