Pandas: Calculate CAGR with Slicing


Question

Given the following DataFrame:

import pandas as pd

df = pd.DataFrame({'A' : ['1','2','3','7'],
                   'B' : [7,6,5,4],
                   'C' : [5,6,7,1],
                   'D' : [1,9,9,8]})
df = df.set_index('A')
df
    B   C   D
A           
1   7   5   1
2   6   6   9
3   5   7   9
7   4   1   8

I am attempting to calculate the compound annual growth rate (CAGR), and I am trying to avoid using the column names. Here's what I came up with:

df['CAGR']=((df[df.columns[-1:]]/df[df.columns[:1]])**(1/len(df.columns)))-1

However, it throws this error:

ValueError: Wrong number of items passed 2, placement implies 1

I tested each part of the formula and each one returned the columns I needed, so I'm stumped.

Thanks in advance!

Recommended answer

You are slicing the DataFrame in such a way that the returned object is itself a DataFrame:

df[df.columns[-1:]]

The -1: slice makes df.columns[-1:] return a one-element Index, [column_name], rather than the label column_name itself. As a consequence, df[df.columns[-1:]] is a one-column DataFrame, not a Series. When you divide one DataFrame by another, pandas lines up the indices, columns included. Here the two frames have different column labels (D and B), so the result has two columns, both filled with NaN, and it cannot be assigned to the single column df['CAGR']. That is what the ValueError is reporting. To get around this, you could simply have done:

df[df.columns[-1]]

using -1 instead of -1:.
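To make the difference concrete, here is a minimal sketch that rebuilds the question's DataFrame and compares the two ways of selecting the first and last columns. The variable names (last_frame, ratio_series, and so on) are illustrative only and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3', '7'],
                   'B': [7, 6, 5, 4],
                   'C': [5, 6, 7, 1],
                   'D': [1, 9, 9, 8]}).set_index('A')

# Slicing the column Index keeps a one-element Index, so each
# selection is a one-column DataFrame.
last_frame = df[df.columns[-1:]]
first_frame = df[df.columns[:1]]

# DataFrame / DataFrame aligns on column labels. 'D' and 'B' do not
# match, so the result has both columns and every value is NaN.
ratio_frame = last_frame / first_frame

# Scalar positions return a single label, so each selection is a
# Series and the division aligns on the row index only.
last_series = df[df.columns[-1]]
first_series = df[df.columns[0]]
ratio_series = last_series / first_series

print(type(last_frame).__name__, type(last_series).__name__)
print(ratio_frame.shape, ratio_series.shape)
```

The two-column, all-NaN ratio_frame is the object the original formula was trying to store in a single new column.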

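However, I would do it this way, selecting the columns by position with iloc and taking the root over the number of periods (the number of columns minus one):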

df['CAGR'] = df.iloc[:, -1].div(df.iloc[:, 0]).pow(1./(len(df.columns) - 1)).sub(1)

print(df)

   B  C  D      CAGR
A                   
1  7  5  1 -0.622036
2  6  6  9  0.224745
3  5  7  9  0.341641
7  4  1  8  0.414214
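If the calculation is needed more than once, the same position-based approach can be wrapped in a small helper. This is only a sketch: the function name cagr_by_position is hypothetical, and it assumes every column is one equally spaced period, ordered from earliest to latest. Computing the period count inside the function, before the CAGR column is added, keeps the exponent from changing if the assignment is re-run.

```python
import pandas as pd

def cagr_by_position(frame: pd.DataFrame) -> pd.Series:
    """Row-wise CAGR from the first to the last column of `frame`.

    Hypothetical helper: assumes the columns are equally spaced
    periods in chronological order.
    """
    periods = frame.shape[1] - 1  # N columns span N - 1 periods
    if periods < 1:
        raise ValueError("need at least two columns to compute a growth rate")
    first = frame.iloc[:, 0]   # Series, selected by position
    last = frame.iloc[:, -1]   # Series, selected by position
    return last.div(first).pow(1.0 / periods).sub(1)

df = pd.DataFrame({'A': ['1', '2', '3', '7'],
                   'B': [7, 6, 5, 4],
                   'C': [5, 6, 7, 1],
                   'D': [1, 9, 9, 8]}).set_index('A')

df['CAGR'] = cagr_by_position(df)
print(df)
```

For the sample data this should reproduce the CAGR column shown above.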
