python2.7: change difference for a column value of dataframe


Problem description

I have a DataFrame (df2 in the code below) like the following (just an example); there may be 10 or more such DataFrames:

     date              a       b
  0     2010-01-01     12      15
  1     2010-01-02     13      20
  2     2010-01-03     14      23
  3     2010-01-04     15      24
  4     2010-01-05     16      25
  5     2010-01-08     17      15
  6     2010-01-09     180     160
  ................................
  1000     2013-01-05     310     320

I want to calculate the percentage change of column b. There is one exception, though: when computing the percentage change for '2010-01-09' (just an example date), the value of b on '2010-01-08' should be multiplied by 10. This applies only to that date; every other date should use the original value, without the factor of 10. Normally I calculate the percentage change with the following code:

df2['b_diff'] = df2['b'] / df2['b'].shift() - 1

But for the date '2010-01-09', I think the code should be:

df2['b_diff'] = df2['b'] / (10 * df2['b'].shift()) - 1

Could you tell me how to handle this?

Thanks!

Solution

You can use pct_change, but first divide the values of b by 10 on the rows whose date matches the condition:

dates = ['2010-01-09','2011-01-09']
# Boolean mask of the rows whose date needs the adjustment
m = df2['date'].isin(dates)
# Divide b by 10 on those rows only
df2.loc[m, 'b'] = df2['b'] / 10

df2['b_diff'] = df2['b'].pct_change()
print(df2)
        date    a     b    b_diff
0 2010-01-01   12  15.0       NaN
1 2010-01-02   13  20.0  0.333333
2 2010-01-03   14  23.0  0.150000
3 2010-01-04   15  24.0  0.043478
4 2010-01-05   16  25.0  0.041667
5 2010-01-08   17  15.0 -0.400000
6 2010-01-09  180  16.0  0.066667
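
To try this end to end, here is a minimal self-contained sketch. It assumes the date column holds plain strings and rebuilds only the seven sample rows shown in the question; the cast to float before the assignment is my own precaution (not part of the original answer) so that fractional results do not clash with an integer column.

```python
import pandas as pd

# Sample data copied from the question (first seven rows only).
df2 = pd.DataFrame({
    'date': ['2010-01-01', '2010-01-02', '2010-01-03', '2010-01-04',
             '2010-01-05', '2010-01-08', '2010-01-09'],
    'a': [12, 13, 14, 15, 16, 17, 180],
    'b': [15, 20, 23, 24, 25, 15, 160],
})

# Dates whose b value should be divided by 10 (assumed to be strings here).
dates = ['2010-01-09']
m = df2['date'].isin(dates)

# Precaution: make b a float column before writing divided values into it.
df2['b'] = df2['b'].astype(float)
df2.loc[m, 'b'] = df2.loc[m, 'b'] / 10

# Percentage change relative to the previous row.
df2['b_diff'] = df2['b'].pct_change()
print(df2)
```

Note that this overwrites b on the flagged date, so the stored value for '2010-01-09' becomes 16.0 rather than 160.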

Alternative solution:

dates = ['2010-01-09','2011-01-09']
m = df2['date'].isin(dates)

# mask() replaces b with b / 10 where m is True and keeps it elsewhere
df2['b'] = df2['b'].mask(m, df2['b'] / 10)
df2['b_diff'] = df2['b'].pct_change()
print(df2)
        date    a     b    b_diff
0 2010-01-01   12  15.0       NaN
1 2010-01-02   13  20.0  0.333333
2 2010-01-03   14  23.0  0.150000
3 2010-01-04   15  24.0  0.043478
4 2010-01-05   16  25.0  0.041667
5 2010-01-08   17  15.0 -0.400000
6 2010-01-09  180  16.0  0.066667
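
Both approaches above overwrite b on the flagged date, so any row that follows it would be compared against the scaled-down value (16 instead of 160). If that side effect is not wanted, one possible variant, sketched here under the same assumptions (string dates, the seven sample rows), is to scale the previous row's value instead and leave b untouched. The name prev_b is a helper introduced only for this illustration.

```python
import pandas as pd

# Sample data copied from the question (first seven rows only).
df2 = pd.DataFrame({
    'date': ['2010-01-01', '2010-01-02', '2010-01-03', '2010-01-04',
             '2010-01-05', '2010-01-08', '2010-01-09'],
    'a': [12, 13, 14, 15, 16, 17, 180],
    'b': [15, 20, 23, 24, 25, 15, 160],
})

dates = ['2010-01-09']
m = df2['date'].isin(dates)

# Previous row's b, multiplied by 10 only on the flagged dates.
prev_b = df2['b'].shift()
prev_b = prev_b.mask(m, prev_b * 10)

# Same formula as in the question, but with the adjusted denominator.
df2['b_diff'] = df2['b'] / prev_b - 1
print(df2)
```

This matches the formula the question proposes for the exceptional date, and column b keeps its original values.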
