Calculate current values based on pct_change and previous values in Pandas

Question

For the following dataframe:

   type    price       pct      date
0     a  10918.0       NaN  2019/6/1
1     a      NaN  0.023631  2019/9/1
2     b  10379.0       NaN  2019/6/1
3     b      NaN  0.010984  2019/9/1
4     c   9466.0       NaN  2019/6/1
5     c      NaN  0.177160  2019/9/1
6     d  13637.0       NaN  2019/6/1
7     d      NaN  0.124661  2019/9/1
8     e  11774.0       NaN  2019/6/1
9     e      NaN -0.033124  2019/9/1
10    f      NaN  0.023124  2019/9/2

I first want to drop the rows whose type is not duplicated, using:

df = df[df.duplicated(subset=['type'], keep=False)]

Then I want to calculate the price on 2019/9/1 from pct and the price on 2019/6/1.

The expected result looks like this:

  type  price       pct      date
0    a  10918       NaN  2019/6/1
1    a  11176  0.023631  2019/9/1
2    b  10379       NaN  2019/6/1
3    b  10493  0.010984  2019/9/1
4    c   9466       NaN  2019/6/1
5    c  11143  0.177160  2019/9/1
6    d  13637       NaN  2019/6/1
7    d  15337  0.124661  2019/9/1
8    e  11774       NaN  2019/6/1
9    e  11384 -0.033124  2019/9/1
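
The expected prices are consistent with new price = old price * (1 + pct). As a minimal sketch (the frame below is rebuilt by hand from the sample above, and the variable names filtered, old_price and new_price are only illustrative), the filter and the arithmetic for type a can be checked like this:

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample data copied from the question
df = pd.DataFrame({
    'type':  ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd', 'e', 'e', 'f'],
    'price': [10918.0, nan, 10379.0, nan, 9466.0, nan,
              13637.0, nan, 11774.0, nan, nan],
    'pct':   [nan, 0.023631, nan, 0.010984, nan, 0.177160,
              nan, 0.124661, nan, -0.033124, 0.023124],
    'date':  ['2019/6/1', '2019/9/1'] * 5 + ['2019/9/2'],
})

# keep=False marks every row of a repeated type, so the lone 'f' row is dropped
filtered = df[df.duplicated(subset=['type'], keep=False)]
print(filtered['type'].unique())

# new price = old price * (1 + pct), checked for type 'a'
old_price = 10918.0
pct = 0.023631
new_price = old_price * (1 + pct)
print(round(new_price))  # 11176
```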

How can I do that? Thanks.

Recommended answer

If you need to guarantee that the price on 2019/9/1 is computed from pct and the price on 2019/6/1, you can work with MultiIndex columns, where each column is selected with a tuple:

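# keep only the types that appear more than once
df = df[df.duplicated(subset=['type'], keep=False)]
# one row per type; columns become (column, date) tuples
df = df.pivot_table(index='type', columns='date')
# price on 9/1 = pct on 9/1 * price on 6/1 + price on 6/1
df[('price', '2019/9/1')] = (df[('pct', '2019/9/1')]*df[('price', '2019/6/1')] +
                             df[('price', '2019/6/1')])
# back to long format
df = df.stack().reset_index()
print(df)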
  type      date       pct         price
0    a  2019/6/1       NaN  10918.000000
1    a  2019/9/1  0.023631  11176.003258
2    b  2019/6/1       NaN  10379.000000
3    b  2019/9/1  0.010984  10493.002936
4    c  2019/6/1       NaN   9466.000000
5    c  2019/9/1  0.177160  11142.996560
6    d  2019/6/1       NaN  13637.000000
7    d  2019/9/1  0.124661  15337.002057
8    e  2019/6/1       NaN  11774.000000
9    e  2019/9/1 -0.033124  11383.998024
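
To make the tuple-based column selection easier to see, here is a small variant sketch. It is not the answer's exact code: it uses pivot instead of pivot_table and stops at the wide table, and the names wide and base are only illustrative. The sample frame is rebuilt by hand from the question.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample data copied from the question
df = pd.DataFrame({
    'type':  ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd', 'e', 'e', 'f'],
    'price': [10918.0, nan, 10379.0, nan, 9466.0, nan,
              13637.0, nan, 11774.0, nan, nan],
    'pct':   [nan, 0.023631, nan, 0.010984, nan, 0.177160,
              nan, 0.124661, nan, -0.033124, 0.023124],
    'date':  ['2019/6/1', '2019/9/1'] * 5 + ['2019/9/2'],
})
df = df[df.duplicated(subset=['type'], keep=False)]

# One row per type; columns are (column, date) tuples such as ('price', '2019/6/1')
wide = df.pivot(index='type', columns='date', values=['price', 'pct'])

# Each tuple picks exactly one column, so the two dates cannot be mixed up
base = wide[('price', '2019/6/1')]
wide[('price', '2019/9/1')] = base * (1 + wide[('pct', '2019/9/1')])
print(wide['price'])
```

Because the dates are addressed explicitly, this form does not depend on the row order of the original frame.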

If there are always exactly 2 dates per group:

# remove the types that occur only once
df = df[df.duplicated(subset=['type'], keep=False)]
# sort to guarantee ordering within each type
df = df.sort_values(['type','date'])

# forward-fill the 6/1 price into the 9/1 row, then compute price * pct + price
df['price'] = df['price'].ffill().mul(df['pct']).add(df['price'].ffill(), fill_value=0)
print(df)
  type         price       pct      date
0    a  10918.000000       NaN  2019/6/1
1    a  11176.003258  0.023631  2019/9/1
2    b  10379.000000       NaN  2019/6/1
3    b  10493.002936  0.010984  2019/9/1
4    c   9466.000000       NaN  2019/6/1
5    c  11142.996560  0.177160  2019/9/1
6    d  13637.000000       NaN  2019/6/1
7    d  15337.002057  0.124661  2019/9/1
8    e  11774.000000       NaN  2019/6/1
9    e  11383.998024 -0.033124  2019/9/1
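
The forward fill above is not group-aware: if the first row of a type had no price, the previous type's price would be carried into it. A hedged alternative, not part of the original answer, is to forward-fill within each type using groupby. The name base is only illustrative, and the sample frame is rebuilt by hand from the question.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample data copied from the question
df = pd.DataFrame({
    'type':  ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd', 'e', 'e', 'f'],
    'price': [10918.0, nan, 10379.0, nan, 9466.0, nan,
              13637.0, nan, 11774.0, nan, nan],
    'pct':   [nan, 0.023631, nan, 0.010984, nan, 0.177160,
              nan, 0.124661, nan, -0.033124, 0.023124],
    'date':  ['2019/6/1', '2019/9/1'] * 5 + ['2019/9/2'],
})

df = df[df.duplicated(subset=['type'], keep=False)]
df = df.sort_values(['type', 'date'])

# Carry the earlier price forward, but only inside the same type
base = df.groupby('type')['price'].ffill()

# Rows without pct keep their price (pct treated as 0); others get base * (1 + pct)
df['price'] = base * (1 + df['pct'].fillna(0))
print(df)
```

Note that date holds strings here, so sort_values orders them lexicographically. That happens to be correct for 2019/6/1 and 2019/9/1; converting the column with pd.to_datetime first would be safer for other dates.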
