Calculating year over year growth by group in Pandas

Views: 139 · Published: 2018/4/17 17:53:50 · Tags: python, function, indexing, pandas

Problem description

I have the following dataframe:
    In [1]: df
    Out[1]:
    ID        Month    Transaction_Amount
    1         2013/01  10
    1         2013/02  20
    1         2013/03  10
    1         2013/04  20
    1         2013/05  10
    1         2013/06  20
    1         2013/07  10
    1         2013/08  20
    1         2013/09  10
    1         2013/10  20
    1         2013/11  10
    1         2013/12  20
    1         2014/01  15
    1         2014/02  25
    1         2014/03  15
    1         2014/04  25
    ...
    1         2014/11  15
    1         2014/12  25
    ...
    10000000  2014/11  13
    10000000  2014/12  23

What I would like to do is calculate year-over-year growth for each rolling month period. For example, I would want to find the value of (2014/01 - 2013/01) / (2014/01), which is (15 - 10) / 15 = 1/3, and save this as the first rolling period. There will be a total of 12 rolling periods for each ID. I'm thinking that the final output should look like:

    In [2]: df_new
    Out[2]:
    ID  rolling_period_1  rolling_period_2  ...  rolling_period_12
    1   .333333           .20                    .20
    2   x1                x2                     x12
    3   y1                y2                     y12
    4   z1                z2                     z12
    ...

I generated a list of tuples pairing each month with the same month one year later, [(2013/01, 2014/01), (2013/02, 2014/02) ... (2013/12, 2014/12)], and have been playing around with isin to index a subset of the original df, but I am unsure how to arrive at df_new.

EDIT

I have created a new dataframe called temp_df with the following code:

    In [4]: temp_df = df[df['Month'].isin(('2013/01', '2014/01'))]

    In [5]: temp_df
    Out[5]:
    ID  Month    Transaction_Amount
    1   2013/01  10
    1   2014/01  15
    2   2013/01  20
    2   2014/01  30
    3   2013/01  15
    3   2014/01  30
    ...

What I would like to produce is a DataFrame that looks like the following:

    In [6]: new_df
    Out[6]:
    ID  Transaction_Growth
    1   .3333  # (15-10)/15
    2   .3333  # (30-20)/30
    3   .50    # (30-15)/30
    ...
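
For the two-month case described in the edit, one possible sketch is to pivot temp_df so that each month becomes a column and then apply the question's formula column-wise. The sample rows are rebuilt by hand from the output shown above, and the variable name wide is an assumption made for illustration.

```python
import pandas as pd

# Hypothetical reconstruction of temp_df from the rows shown in the question.
temp_df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3],
    "Month": ["2013/01", "2014/01"] * 3,
    "Transaction_Amount": [10, 15, 20, 30, 15, 30],
})

# One row per ID, one column per month.
wide = temp_df.pivot(index="ID", columns="Month", values="Transaction_Amount")

# Growth relative to the later month, as defined in the question.
new_df = (
    ((wide["2014/01"] - wide["2013/01"]) / wide["2014/01"])
    .rename("Transaction_Growth")
    .reset_index()
)
print(new_df)
```

This only covers a single pair of months; repeating it for all twelve pairs by hand would be tedious, which is what the shift-based approach avoids.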

Solution

You can use shift to offset the rows in the dataframe.

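Create dummy data with a month column and a value column:

    import pandas as pd

    # 90 consecutive months; 'MS' (month start) replaces the original 'M' alias,
    # which recent pandas versions deprecate
    rng = pd.date_range('1/1/2011', periods=90, freq='MS')
    df = pd.DataFrame({'value': range(1, 91), 'date': rng})

Set the month column as the index:

    df = df.set_index('date')

Shift a copy of the dataframe by 12 periods to get the value from 12 months earlier, subtract it from the current record, and divide by the current record:

    (df - df.shift(12)) / df

(Updated: the sign of the period passed to shift has been corrected. A positive value moves the data down, so each row sees the value from 12 rows earlier.)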
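
As a quick check of what shift(12) does, and of why the parentheses around the subtraction matter, here is a minimal sketch on a toy series. The data (values 1 to 24 on a monthly index) and the names s, prev_year, growth and unparenthesized are assumptions chosen for illustration, not the asker's data.

```python
import pandas as pd

# Toy series: 24 monthly values, 1.0 .. 24.0 (assumed data).
s = pd.Series(
    range(1, 25),
    index=pd.date_range("2011-01-01", periods=24, freq="MS"),
    dtype="float64",
)

prev_year = s.shift(12)        # value from 12 rows (months) earlier
growth = (s - prev_year) / s   # subtract first, then divide

# Without parentheses, division binds tighter than subtraction,
# so this computes s - (prev_year / s), which is a different quantity.
unparenthesized = s - s.shift(12) / s

print(growth.iloc[:12].isna().all())  # the first year has no prior-year value
print(growth.iloc[12])                # (13 - 1) / 13
print(unparenthesized.iloc[12])       # 13 - 1/13
```

The first twelve rows come out as NaN because there is nothing twelve rows earlier to compare against.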

Updated to consider ID:

    import numpy as np
    import pandas as pd

    # Create range of months
    rng = pd.date_range('1/1/2011', periods=180, freq='MS')
    ID = np.array([1, 2, 3])
    # Create ID column: 60 consecutive rows for each of the three IDs
    ID = np.repeat(ID, 60)
    # Create dummy data in dataframe
    df = pd.DataFrame({'ID': ID, 'value': range(1, 181), 'date': rng})
    # Use shift on a groupby object so each ID only looks back within its own rows
    (df.value - df.groupby(['ID']).value.shift(12)) / df.value
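
To get from the grouped shift to the wide df_new layout sketched in the question, one possible route is to number the non-null growth rows within each ID and pivot on that number. The sketch below uses a small hand-made sample in the question's column layout (two IDs, 24 months each); the sample values and the helper column names growth and period are assumptions for illustration. It also assumes every ID has exactly one row per month with no gaps, since shift counts rows rather than calendar months.

```python
import pandas as pd

# Hypothetical sample in the question's layout: two IDs, 24 months each.
months = [f"{y}/{m:02d}" for y in (2013, 2014) for m in range(1, 13)]
df = pd.DataFrame({
    "ID": [1] * 24 + [2] * 24,
    "Month": months * 2,
    "Transaction_Amount": (
        [10, 20] * 6 + [15, 25] * 6      # ID 1: 2013, then 2014
        + [20, 40] * 6 + [30, 50] * 6    # ID 2: 2013, then 2014
    ),
})

# Sort so that "12 rows back" within an ID means "same month, previous year".
df = df.sort_values(["ID", "Month"]).reset_index(drop=True)

prev = df.groupby("ID")["Transaction_Amount"].shift(12)
df["growth"] = (df["Transaction_Amount"] - prev) / df["Transaction_Amount"]

# Keep rows that have a prior-year value and number them 1..12 per ID.
yoy = df.dropna(subset=["growth"]).copy()
yoy["period"] = yoy.groupby("ID").cumcount() + 1

# One row per ID, one column per rolling period.
df_new = yoy.pivot(index="ID", columns="period", values="growth")
df_new.columns = [f"rolling_period_{p}" for p in df_new.columns]
print(df_new.round(4))
```

If some IDs are missing months, the row-based shift would compare the wrong pairs, so the data would need to be reindexed to a complete month range first. If growth relative to the earlier year is wanted instead, the denominator would be prev rather than the current amount.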
