How to increment a row count in groupby in DataFrame
Question
I need to calculate the number of activity_months for each product in a pandas DataFrame. Here is my data and code so far:
from pandas import DataFrame
from datetime import datetime

data = [
    ('product_a', '08/31/2013'),
    ('product_b', '08/31/2013'),
    ('product_c', '08/31/2013'),
    ('product_a', '09/30/2013'),
    ('product_b', '09/30/2013'),
    ('product_c', '09/30/2013'),
    ('product_a', '10/31/2013'),
    ('product_b', '10/31/2013'),
    ('product_c', '10/31/2013'),
]
product_df = DataFrame(data, columns=['prod_desc', 'activity_month'])

# Convert each date string from MM/DD/YYYY to YYYY-MM-DD
for index, row in product_df.iterrows():
    parsed = datetime.strptime(row['activity_month'], '%m/%d/%Y')
    product_df.loc[index, 'activity_month'] = datetime.strftime(parsed, '%Y-%m-%d')

# DataFrame.sort no longer exists in current pandas; sort_values replaces it
product_df = product_df.sort_values(['prod_desc', 'activity_month'])

# The line in question: this fills month_num with NaN
product_df['month_num'] = product_df.groupby(['prod_desc']).size()
However, this returns NaN for month_num.
This is what I am trying to get:
prod_desc   activity_month  month_num
product_a   2013-08-31      1
product_a   2013-09-30      2
product_a   2013-10-31      3
product_b   2013-08-31      1
product_b   2013-09-30      2
product_b   2013-10-31      3
product_c   2013-08-31      1
product_c   2013-09-30      2
product_c   2013-10-31      3
Recommended answer
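The groupby is the right idea, but the right method is cumcount. size() returns one value per group, indexed by the group key, so it does not line up with the rows of the frame and the assignment yields NaN. cumcount() instead numbers the rows within each group, starting at 0, and keeps the original row index. In the session below the frame has a column named product_desc and some extra columns; with the question's frame, group on prod_desc instead: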
>>> product_df['month_num'] = product_df.groupby('product_desc').cumcount()
>>> product_df
  product_desc activity_month  prod_count    pct_ch  month_num
0    product_a     2014-01-01          53       NaN          0
3    product_a     2014-02-01          52 -0.018868          1
6    product_a     2014-03-01          50 -0.038462          2
1    product_b     2014-01-01          44       NaN          0
4    product_b     2014-02-01          43 -0.022727          1
7    product_b     2014-03-01          41 -0.046512          2
2    product_c     2014-01-01          36       NaN          0
5    product_c     2014-02-01          35 -0.027778          1
8    product_c     2014-03-01          34 -0.028571          2
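To see why the size() attempt from the question produced NaN while cumcount() works, it may help to compare the index of each result. The following is a minimal sketch, not part of the original answer; the small frame and the column names via_size and via_cumcount are made up for illustration.

```python
import pandas as pd

# Illustrative stand-in frame using the question's column names.
df = pd.DataFrame({
    "prod_desc": ["product_a", "product_b", "product_a", "product_b", "product_a"],
    "activity_month": ["2013-08-31", "2013-08-31", "2013-09-30", "2013-09-30", "2013-10-31"],
})

# size() returns one value per group, indexed by the group key.
sizes = df.groupby("prod_desc").size()
print(sizes.index.tolist())  # group labels, not row labels

# Column assignment aligns on index labels. None of the row labels (0..4)
# match the group labels, so every value in the new column is NaN.
df["via_size"] = sizes

# cumcount() returns one value per row, indexed like the original frame,
# so the assignment lines up row by row.
df["via_cumcount"] = df.groupby("prod_desc").cumcount()
print(df)
```

The via_size column comes out entirely NaN, while via_cumcount holds a zero-based counter that restarts for each product.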
If you really want it to start at 1, add 1 to the result:
>>> product_df['month_num'] = product_df.groupby('product_desc').cumcount() + 1
>>> product_df
  product_desc activity_month  prod_count    pct_ch  month_num
0    product_a     2014-01-01          53       NaN          1
3    product_a     2014-02-01          52 -0.018868          2
6    product_a     2014-03-01          50 -0.038462          3
1    product_b     2014-01-01          44       NaN          1
4    product_b     2014-02-01          43 -0.022727          2
7    product_b     2014-03-01          41 -0.046512          3
2    product_c     2014-01-01          36       NaN          1
5    product_c     2014-02-01          35 -0.027778          2
8    product_c     2014-03-01          34 -0.028571          3
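For reference, here is one way the whole thing could look when applied to the data from the question. This is a sketch rather than the answerer's code: it groups on prod_desc (the question's column name), and it swaps the row-by-row date loop for pd.to_datetime, which is my substitution and leaves the column as datetimes instead of formatted strings.

```python
import pandas as pd

data = [
    ("product_a", "08/31/2013"),
    ("product_b", "08/31/2013"),
    ("product_c", "08/31/2013"),
    ("product_a", "09/30/2013"),
    ("product_b", "09/30/2013"),
    ("product_c", "09/30/2013"),
    ("product_a", "10/31/2013"),
    ("product_b", "10/31/2013"),
    ("product_c", "10/31/2013"),
]
product_df = pd.DataFrame(data, columns=["prod_desc", "activity_month"])

# Parse the date strings in one vectorized call instead of looping over rows.
product_df["activity_month"] = pd.to_datetime(
    product_df["activity_month"], format="%m/%d/%Y"
)

# Sort first so the counter follows chronological order within each product.
product_df = product_df.sort_values(["prod_desc", "activity_month"])

# One-based running count of rows within each product.
product_df["month_num"] = product_df.groupby("prod_desc").cumcount() + 1
print(product_df)
```

Sorting before calling cumcount() matters here, because the counter follows the row order of the frame within each group, not the dates themselves.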