python2.7: change difference for a column value of dataframe
Problem description
I have a dataframe (df) like the following (just an example); there may be 10 or more such dataframes:

            date    a    b
0     2010-01-01   12   15
1     2010-01-02   13   20
2     2010-01-03   14   23
3     2010-01-04   15   24
4     2010-01-05   16   25
5     2010-01-08   17   15
6     2010-01-09  180  160
...
1000  2013-01-05  310  320

I want to calculate the percentage change of the b column in the dataframe. There is one exception: when the date is '2010-01-09' (just an example), the value of b on '2010-01-08' should be multiplied by 10 when computing the percentage change for '2010-01-09'. This applies only to that date; every other date should use the original value, with no factor of 10. In general, I calculate the percentage change with the following code:
    df['b_diff'] = df2['b'] / df2['b'].shift() - 1

But when the date is '2010-01-09', I think the code should be:

    df['b_diff'] = df2['b'] / (10 * df2['b'].shift()) - 1

Could you tell me how to handle this case?
Thanks!
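One thing worth checking in a formula of this shape is operator precedence. In Python, `/` and `*` have the same precedence and are evaluated left to right, so `x / 10 * y` means `(x / 10) * y`, not `x / (10 * y)`. The sketch below uses the two b values from the sample table ('2010-01-08' and '2010-01-09'); the variable names are illustrative only.

```python
# b on 2010-01-08 (previous row) and on 2010-01-09 (current row),
# taken from the sample table in the question.
prev_b = 15.0
cur_b = 160.0

# Without parentheses, "/" and "*" are applied left to right:
# this is (cur_b / 10) * prev_b - 1.
without_parens = cur_b / 10 * prev_b - 1

# With the previous value multiplied by 10 inside the denominator.
with_parens = cur_b / (10 * prev_b) - 1

print(without_parens)          # 239.0
print(round(with_parens, 6))   # 0.066667
```

Only the parenthesised form gives the 0.066667 that corresponds to "the previous value counts 10 times".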
Solution

You can use pct_change, but first divide the values of b by 10 on the dates that match the condition:

    dates = ['2010-01-09','2011-01-09']
    m = df2['date'].isin(dates)
    df2.loc[m, 'b'] = df2['b'] / 10
    df2['b_diff'] = df2['b'].pct_change()
    print(df2)

             date    a     b    b_diff
    0  2010-01-01   12  15.0       NaN
    1  2010-01-02   13  20.0  0.333333
    2  2010-01-03   14  23.0  0.150000
    3  2010-01-04   15  24.0  0.043478
    4  2010-01-05   16  25.0  0.041667
    5  2010-01-08   17  15.0 -0.400000
    6  2010-01-09  180  16.0  0.066667
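As a quick sanity check that pct_change computes the same quantity as the manual formula from the question (current value divided by the previous value, minus 1), here is a minimal sketch. It uses only the seven sample b values; the names `manual`, `builtin` and `max_gap` are illustrative.

```python
import pandas as pd

# The seven sample b values from the question, as floats.
b = pd.Series([15, 20, 23, 24, 25, 15, 160], dtype="float64")

manual = b / b.shift() - 1   # current value divided by the previous one, minus 1
builtin = b.pct_change()     # pandas' built-in percentage change

# The first row is NaN in both (there is no previous value);
# max() skips NaN, so this compares the remaining rows.
max_gap = (manual - builtin).abs().max()
print(max_gap < 1e-12)
```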
Alternative solution:
    dates = ['2010-01-09','2011-01-09']
    m = df2['date'].isin(dates)
    df2['b'] = df2['b'].mask(m, df2['b'] / 10)
    df2['b_diff'] = df2['b'].pct_change()
    print(df2)

             date    a     b    b_diff
    0  2010-01-01   12  15.0       NaN
    1  2010-01-02   13  20.0  0.333333
    2  2010-01-03   14  23.0  0.150000
    3  2010-01-04   15  24.0  0.043478
    4  2010-01-05   16  25.0  0.041667
    5  2010-01-08   17  15.0 -0.400000
    6  2010-01-09  180  16.0  0.066667
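Both snippets above overwrite b on the flagged dates, so the row after a flagged date is then compared against the divided value as well. If that side effect is unwanted, one possible variant (a sketch, not part of the original answer) leaves b untouched and instead multiplies the previous value by 10 on the flagged dates only. It assumes the date column holds strings and uses only the first seven sample rows; `prev` is an illustrative name.

```python
import pandas as pd

# Sample data copied from the question (first seven rows only).
df2 = pd.DataFrame({
    "date": ["2010-01-01", "2010-01-02", "2010-01-03", "2010-01-04",
             "2010-01-05", "2010-01-08", "2010-01-09"],
    "a": [12, 13, 14, 15, 16, 17, 180],
    "b": [15, 20, 23, 24, 25, 15, 160],
})

# Dates whose previous b value should count 10 times
# (assumption: dates are stored as strings).
dates = ["2010-01-09"]
m = df2["date"].isin(dates)

# Previous row's b, multiplied by 10 only where the mask is True.
prev = df2["b"].shift()
prev = prev.mask(m, prev * 10)

# Ordinary percentage change, but with the adjusted denominator.
df2["b_diff"] = df2["b"] / prev - 1
print(df2)
```

For the sample rows this gives the same b_diff values as the output above (160 / 150 - 1 on '2010-01-09'), while column b keeps its original value of 160.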