Pandas: Add data for missing months


Question

I have a dataframe of sales information per customer per month. It looks something like this, with multiple customers, varying month periods, and varying spend:

   customer_id month_year   sales
0           12    2012-05    2.58
1           12    2011-07   33.14
2           12    2011-11  182.06
3           12    2012-03  155.32
4           12    2012-01   71.24

As you can see, many of the months are missing for each customer. I would like to add extra rows for each customer, with sales = 0.0, for every missing month in the range of month_year.

Can anyone advise on the best way to do this?

Recommended answer

Something like this. Note that the reindex itself does not fill in customer_id for the new rows; you probably have that value available from a groupby or similar, so it is set explicitly in the session below.

You may want a reset_index at the end, if you need the months back as a regular column.
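The session that follows assumes a DataFrame named `df` already exists and that its `month_year` values are monthly periods rather than plain strings; the question does not show how the frame was built. As a rough sketch of one possible setup, using the sample rows from the question and assuming the column starts out as strings:

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "customer_id": [12, 12, 12, 12, 12],
    "month_year": ["2012-05", "2011-07", "2011-11", "2012-03", "2012-01"],
    "sales": [2.58, 33.14, 182.06, 155.32, 71.24],
})

# Assumption: month_year arrives as strings. Converting to monthly Periods
# lets a later reindex against pd.period_range match the existing labels.
df["month_year"] = pd.PeriodIndex(df["month_year"], freq="M")
```

If the labels were left as strings, reindexing against a period range would most likely find no matches and return only NaN rows.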

In [130]: df2 = df.set_index('month_year')

In [131]: df2 = df2.sort_index()

In [132]: df2
Out[132]: 
            customer_id   sales
month_year                     
2011-07              12   33.14
2011-11              12  182.06
2012-01              12   71.24
2012-03              12  155.32
2012-05              12    2.58

In [133]: df2 = df2.reindex(pd.period_range(df2.index[0], df2.index[-1], freq='M'))

In [134]: df2
Out[134]: 
         customer_id   sales
2011-07           12   33.14
2011-08          NaN     NaN
2011-09          NaN     NaN
2011-10          NaN     NaN
2011-11           12  182.06
2011-12          NaN     NaN
2012-01           12   71.24
2012-02          NaN     NaN
2012-03           12  155.32
2012-04          NaN     NaN
2012-05           12    2.58

In [135]: df2['customer_id'] = 12

In [136]: df2.fillna(0.0)
Out[136]: 
         customer_id   sales
2011-07           12   33.14
2011-08           12    0.00
2011-09           12    0.00
2011-10           12    0.00
2011-11           12  182.06
2011-12           12    0.00
2012-01           12   71.24
2012-02           12    0.00
2012-03           12  155.32
2012-04           12    0.00
2012-05           12    2.58
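The session above handles a single customer. For several customers, the same reindex can be applied per group, which also settles the customer_id question because the group key is known. The following is only a sketch: the helper name `fill_missing_months` and the sample data are hypothetical, and it assumes each customer's range runs from that customer's own first month to their own last month, that `month_year` already holds monthly periods, and that no customer has duplicate months.

```python
import pandas as pd

def fill_missing_months(df):
    """Return df with a zero-sales row for every month each customer is missing."""
    pieces = []
    for customer_id, group in df.groupby("customer_id"):
        # Index this customer's sales by month, in chronological order.
        sales = group.set_index("month_year")["sales"].sort_index()
        # Every month between this customer's first and last observed month.
        full_range = pd.period_range(sales.index[0], sales.index[-1], freq="M")
        # Missing months get sales = 0.0; existing months keep their values.
        filled = sales.reindex(full_range, fill_value=0.0).rename_axis("month_year")
        piece = filled.reset_index()
        piece.insert(0, "customer_id", customer_id)
        pieces.append(piece)
    return pd.concat(pieces, ignore_index=True)

# Hypothetical two-customer example.
df = pd.DataFrame({
    "customer_id": [12, 12, 12, 7, 7],
    "month_year": pd.PeriodIndex(
        ["2011-07", "2011-11", "2011-09", "2012-01", "2012-03"], freq="M"
    ),
    "sales": [33.14, 182.06, 10.0, 71.24, 155.32],
})
result = fill_missing_months(df)
print(result)
```

To use one shared range for all customers instead, compute `full_range` once from the minimum and maximum of the whole `month_year` column and reuse it inside the loop.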
