Calculating year over year growth by group in Pandas
Problem description
I have the following dataframe:
In [1]: df
Out[1]:
ID Month Transaction_Amount
1 2013/01 10
1 2013/02 20
1 2013/03 10
1 2013/04 20
1 2013/05 10
1 2013/06 20
1 2013/07 10
1 2013/08 20
1 2013/09 10
1 2013/10 20
1 2013/11 10
1 2013/12 20
1 2014/01 15
1 2014/02 25
1 2014/03 15
1 2014/04 25
...
1 2014/11 15
1 2014/12 25
...
10000000 2014/11 13
10000000 2014/12 23
What I would like to do is calculate the year-over-year growth for each month. For example, I would want to find the value of (2014/01 - 2013/01) / (2014/01), which is (15 - 10) / (15) = 1/3, and save this as the first rolling period. There will be a total of 12 rolling periods for each ID. I'm thinking that the final output should look like:
In [2]: df_new
Out[2]:
ID rolling_period_1 rolling_period_2 ... rolling_period_12
1 .333333 .20 .20
2 x1 x2 x12
3 y1 y2 y12
4 z1 z2 z12
...
I generated a list of tuples pairing each month with the same month one year later, [(2013/01, 2014/01), (2013/02, 2014/02) ... (2013/12, 2014/12)], and have been playing around with isin to index a subset of the original df, but I am unsure how to arrive at df_new.
EDIT
I have created a new dataframe called temp_df with the following code:
In [4]: temp_df = df[df['month'].isin(('2013/01','2014/01'))]
In [5]: temp_df
Out[5]:
ID Month Transaction_Amount
1 2013/01 10
1 2014/01 15
2 2013/01 20
2 2014/01 30
3 2013/01 15
3 2014/01 30
...
What I would like to produce is a DataFrame that looks like the following:
In [6]: new_df
Out[6]:
ID Transaction_Growth
1 .3333 # (15-10)/15
2 .3333 # (30-20)/30
3 .50 # (30-15)/30
...
You can use shift to offset the rows in the dataframe.
Create dummy data with a month column and a value column:
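import pandas as pd
# On pandas 2.2+, freq='M' is deprecated for date_range; use freq='ME' (month end) there
rng = pd.date_range('1/1/2011', periods=90, freq='M')
df = pd.DataFrame({'value':range(1,91),'date':rng})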
Set the month column as the index:
df = df.set_index('date')
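Shift a copy of the dataframe by 12 periods to get the value from 12 months earlier, subtract it from the current record, and divide by the current record. The parentheses are required, because division binds tighter than subtraction:
(df - df.shift(12)) / df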
(Edited to correct the sign of the period passed to shift.)
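As a quick check of the shift approach, here is a minimal sketch for a single ID. It assumes the full two years of values for ID 1 follow the pattern visible in the question's sample (10, 20 alternating in 2013 and 15, 25 alternating in 2014); the names months, amounts, previous_year and growth are made up for illustration. A positive period, shift(12), moves values 12 rows down, so each row lines up with the value from 12 months earlier.

```python
import pandas as pd

# Assumed data for ID 1, following the pattern in the question's sample:
# 2013 alternates 10, 20 and 2014 alternates 15, 25.
months = pd.period_range("2013-01", "2014-12", freq="M")
amounts = [10, 20] * 6 + [15, 25] * 6
s = pd.Series(amounts, index=months, name="Transaction_Amount")

# shift(12) moves values 12 rows down, so each row sees
# the value from the same month one year earlier.
previous_year = s.shift(12)

# Subtract first, then divide by the current value.
growth = (s - previous_year) / s

# The first 12 months have no previous year and are NaN.
print(growth.dropna())
```

With this data, 2014/01 comes out as (15 - 10) / 15 = 1/3 and 2014/02 as (25 - 20) / 25 = 0.2. Note that pct_change(12) is not a drop-in replacement here, since it divides by the earlier value rather than the current one.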
Updated to take ID into account:
import numpy as np
import pandas as pd

# Create range of months
rng = pd.date_range('1/1/2011', periods=180, freq='M')
ID = np.array([1,2,3])
# Create ID column
ID = np.repeat(ID,60)
# Create dummy data in dataframe
df = pd.DataFrame({'ID':ID,'value':range(1,181),'date':rng})
# Use shift on a group by object
(df.value - df.groupby(['ID']).value.shift(12))/ df.value
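To get from the grouped shift to the wide df_new layout described in the question, one possible sketch is below. It is not the only way to do it, and it rests on a few assumptions: the column names follow the question (ID, Month, Transaction_Amount), every ID has all 24 consecutive months with no gaps (shift counts rows, not calendar months), and the values for ID 2 beyond January are invented for illustration. The helper names prev, yoy and period are also made up.

```python
import pandas as pd

# Hypothetical sample in the question's long format.
# ID 1 follows the question's pattern; ID 2's values are assumed.
months = [p.strftime("%Y/%m")
          for p in pd.period_range("2013-01", "2014-12", freq="M")]
df = pd.DataFrame({
    "ID": [1] * 24 + [2] * 24,
    "Month": months * 2,
    "Transaction_Amount": [10, 20] * 6 + [15, 25] * 6 + [20] * 12 + [30] * 12,
})

# Sort so that, within each ID, the row 12 positions back
# is the same month one year earlier ("YYYY/MM" sorts correctly as text).
df = df.sort_values(["ID", "Month"]).reset_index(drop=True)

# Previous-year value per ID; shifting inside groupby never crosses IDs.
prev = df.groupby("ID")["Transaction_Amount"].shift(12)
df["growth"] = (df["Transaction_Amount"] - prev) / df["Transaction_Amount"]

# Keep rows that have a previous-year value and number them 1..12 per ID.
yoy = df.dropna(subset=["growth"]).copy()
yoy["period"] = yoy.groupby("ID").cumcount() + 1

# One row per ID, one column per rolling period.
df_new = yoy.pivot(index="ID", columns="period", values="growth")
df_new.columns = [f"rolling_period_{i}" for i in df_new.columns]
print(df_new.round(4))
```

If some IDs are missing months, the row-based shift would pair the wrong months, so the data would need to be reindexed to a complete monthly range per ID first.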