Doing calculations in a pandas DataFrame based on the trailing row
Question
Is it possible to do calculations in a pandas DataFrame based on the trailing (previous) row of a different column? Something like this:
import pandas as pd
frame = pd.DataFrame({'a': [True, False, True, False],
                      'b': [25, 22, 55, 35]})
I would like the output to be:
A      B   C
True   25
False  22  44
True   55  55
False  35  70
Column C should be the same as column B when the trailing row in column A is False, and column B * 2 when the trailing row in column A is True.
Recommended answer
You could use the Series where method:
In [11]: frame['b'].where(frame['a'], 2 * frame['b'])
Out[11]:
0 25
1 44
2 55
3 70
Name: b, dtype: int64
In [12]: frame['c'] = frame['b'].where(frame['a'], 2 * frame['b'])
Alternatively, you could use apply (but this will usually be slower):
In [21]: frame.apply(lambda x: x['b'] if x['a'] else 2 * x['b'], axis=1)  # same result as In [11]
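A small sketch, under the assumption that the row-wise function should reproduce the where call (keep b when a is True, double it otherwise), checking that both approaches agree. The names vectorized and row_wise are illustrative only.

```python
import pandas as pd

frame = pd.DataFrame({'a': [True, False, True, False],
                      'b': [25, 22, 55, 35]})

# Vectorized: evaluated on whole columns at once.
vectorized = frame['b'].where(frame['a'], 2 * frame['b'])

# Row-wise: the lambda is called once per row (axis=1), which is
# usually slower on large frames.
row_wise = frame.apply(lambda row: row['b'] if row['a'] else 2 * row['b'], axis=1)

print(vectorized.tolist() == row_wise.tolist())  # True
```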
Since you are using the "trailing row", you are going to need to use shift:
In [31]: frame['a'].shift()
Out[31]:
0 NaN
1 True
2 False
3 True
Name: a, dtype: object
In [32]: frame['a'].shift().fillna(False) # actually this is not needed, but perhaps clearer
Out[32]:
0 False
1 True
2 False
3 True
Name: a, dtype: object
And use it the opposite way around:
In [33]: c = (2 * frame['b']).where(frame['a'].shift().fillna(False), frame['b'])
In [34]: c
Out[34]:
0 25
1 44
2 55
3 70
Name: b, dtype: int64
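With the sample data, column a alternates, so the current-row and trailing-row versions happen to give the same numbers. The sketch below uses hypothetical data (not from the question) where they differ, and assumes a pandas version in which shift accepts fill_value (0.24 or later), so the first row is filled with False directly instead of NaN.

```python
import pandas as pd

# Hypothetical, non-alternating data chosen to make the difference visible.
demo = pd.DataFrame({'a': [True, True, False, False],
                     'b': [1, 2, 3, 4]})

# Value of a from the previous row; the first row has no predecessor.
prev_a = demo['a'].shift(fill_value=False)

# Double b where the previous row's a is True, otherwise keep b.
trailing = (2 * demo['b']).where(prev_a, demo['b'])

# Same rule applied to the current row's a, for comparison.
current = demo['b'].where(demo['a'], 2 * demo['b'])

print(prev_a.tolist())    # [False, True, True, False]
print(trailing.tolist())  # [1, 4, 6, 4]
print(current.tolist())   # [1, 2, 6, 8]
```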
To change the first row (e.g. to NaN, which pandas uses for missing data):
In [35]: c = c.astype(float) # needs a float dtype to accept NaN
In [36]: c.iloc[0] = np.nan
In [37]: frame['c'] = c
In [38]: frame
Out[38]:
       a   b     c
0   True  25   NaN
1  False  22  44.0
2   True  55  55.0
3  False  35  70.0
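Putting the steps together, a self-contained sketch on the question's data might look like the following. It assumes NumPy is imported as np and that shift supports fill_value (pandas 0.24 or later); it uses the built-in float because the np.float alias has been removed from recent NumPy releases.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({'a': [True, False, True, False],
                      'b': [25, 22, 55, 35]})

# a from the previous row, with False standing in for the missing first value.
prev_a = frame['a'].shift(fill_value=False)

# Double b where the previous a is True; cast to float so NaN can be stored.
c = (2 * frame['b']).where(prev_a, frame['b']).astype(float)

# The first row has no trailing row, so mark it as missing.
c.iloc[0] = np.nan

frame['c'] = c
print(frame)
```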