pandas ,按计数分组并将计数添加到原始数据框? [英] Pandas, group by count and add count to original dataframe?
本文介绍了 pandas ,按计数分组并将计数添加到原始数据框?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
当尝试对数据框中具有相似种类"的行进行计数时:
When trying to count rows with similar 'kind' in data frame:
import pandas as pd
items = [('aaa','aaa text 1'), ('aaa','aaa text 2'), ('aaa','aaa text 3'),
('bb', 'bb text 1'), ('bb', 'bb text 2'), ('bb', 'bb text 3'),
('bb', 'bb text 4'),
('cccc','cccc text 1'), ('cccc','cccc text 2'),
('dd', 'dd text 1'),
('e', 'e text 1'),
('fff', 'fff text 1'),
]
df = pd.DataFrame(items, columns=['kind', 'msg'])
df
kind msg
0 aaa aaa text 1
1 aaa aaa text 2
2 aaa aaa text 3
3 bb bb text 1
4 bb bb text 2
5 bb bb text 3
6 bb bb text 4
7 cccc cccc text 1
8 cccc cccc text 2
9 dd dd text 1
10 e e text 1
11 fff fff text 1
此代码有效:
df = df[['kind']].groupby(['kind'])['kind'] \
.count() \
.reset_index(name='count') \
.sort_values(['count'], ascending=False) \
.head(5)
df
结果:
kind count
0 aaa 1
1 bb 1
2 cccc 1
3 dd 1
4 e 1
但是,如何才能像原始一加'count'列那样获得一个包含所有列的数据框?因此,结果应按此顺序具有种类",味精",计数"列?
Yet, how can one get a data frame with all columns as in original one plus 'count' column? So the result should have columns 'kind', 'msg', 'count' in this order?
另外,如何按计数的降序对生成的数据帧进行排序?
Also, how to sort this resulting data frame in descending order of count?
推荐答案
IIUC
In [247]: df['count'] = df.groupby('kind').transform('count')
In [248]: df
Out[248]:
kind msg count
0 aaa aaa text 1 3
1 aaa aaa text 2 3
2 aaa aaa text 3 3
3 bb bb text 1 4
4 bb bb text 2 4
5 bb bb text 3 4
6 bb bb text 4 4
7 cccc cccc text 1 2
8 cccc cccc text 2 2
9 dd dd text 1 1
10 e e text 1 1
11 fff fff text 1 1
排序:
In [249]: df.sort_values('count', ascending=False)
Out[249]:
kind msg count
3 bb bb text 1 4
4 bb bb text 2 4
5 bb bb text 3 4
6 bb bb text 4 4
0 aaa aaa text 1 3
1 aaa aaa text 2 3
2 aaa aaa text 3 3
7 cccc cccc text 1 2
8 cccc cccc text 2 2
9 dd dd text 1 1
10 e e text 1 1
11 fff fff text 1 1
这篇关于 pandas ,按计数分组并将计数添加到原始数据框?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文