Pandas: calculating the difference between the current value and the next value in a column, depending on whether a condition in a different column is met


Question

I have a DataFrame:

import pandas as pd  # DataFrame.from_items was removed in pandas 1.0; from_dict is the replacement
df = pd.DataFrame.from_dict({'A': [10, 'foo'], 'B': [440, 'foo'], 'C': [790, 'bar'], 'D': [800, 'bar'], 'E': [7000, 'foo']}, orient='index', columns=['position', 'foobar'])

It looks like this:

    position foobar
A   10       foo
B   440      foo
C   790      bar
D   800      bar
E   7000     foo

I would like to know the difference between each position and the next position that has the opposite value in the foobar column. Normally I would use the shift method to shift the position column by one row:

df[comparisonCol].shift(-1) - df[comparisonCol]

But since the foobar column decides which position is the relevant one, I am not sure how to do this.
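To see why a plain shift is not enough here, the following minimal sketch rebuilds the example frame and applies the shift directly to position. The dict-of-columns constructor is an assumed equivalent construction, not the asker's own code, and `naive` is a name introduced only for illustration. The shift compares each row with the row immediately after it, regardless of foobar.

```python
import pandas as pd

# Equivalent construction of the example frame (assumed, not the asker's code)
df = pd.DataFrame(
    {"position": [10, 440, 790, 800, 7000],
     "foobar": ["foo", "foo", "bar", "bar", "foo"]},
    index=list("ABCDE"),
)

# Plain shift: next row's position minus this row's position, ignoring foobar
naive = df["position"].shift(-1) - df["position"]
print(naive)
```

Row A comes out as 430 (440 - 10) rather than the wanted 780 (790 - 10), because row B is also foo; row C has the same problem with row D.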

The result should look like this:

    position foobar difference
A   10       foo      780
B   440      foo      350
C   790      bar      6210
D   800      bar      6200
E   7000     foo      NaN

Recommended answer

If foobar contains only two unique values, the next run of rows always holds the opposite value, so I think you can label the consecutive groups and then look one group ahead with a Series:

#identify consecutive groups
a = df['foobar'].ne(df['foobar'].shift()).cumsum()
print (a)
A    1
B    1
C    2
D    2
E    3
Name: foobar, dtype: int32
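The one-liner above packs two steps together. As a rough sketch, it can be unpacked like this; the names `changed` and `run_id` are introduced here for illustration only and do not appear in the answer.

```python
import pandas as pd

foobar = pd.Series(["foo", "foo", "bar", "bar", "foo"],
                   index=list("ABCDE"), name="foobar")

# True wherever the value differs from the previous row.
# The first row is compared against a missing value, so it counts as a change.
changed = foobar.ne(foobar.shift())

# The running total of the True flags gives every run of equal values its own id
run_id = changed.cumsum()

print(pd.DataFrame({"foobar": foobar, "changed": changed, "run_id": run_id}))
```

Because each change starts a new id, consecutive rows with the same foobar value share a label, and the label of the following run is always the current label plus one.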

#get the first position value within each group of a
b = df.groupby(a)['position'].first()
print (b)
foobar
1      10
2     790
3    7000
Name: position, dtype: int64

#add 1 to a so each row maps to the first position of the next group, then subtract the row's own position
df['difference'] = a.add(1).map(b) - df['position']
print (df)
   position foobar  difference
A        10    foo       780.0
B       440    foo       350.0
C       790    bar      6210.0
D       800    bar      6200.0
E      7000    foo         NaN
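If this is needed in more than one place, the three steps can be wrapped in a small helper. This is only a sketch: the function name `diff_to_next_run` and its parameters are hypothetical, and the frame is rebuilt with an assumed equivalent dict-of-columns constructor.

```python
import pandas as pd

def diff_to_next_run(df, value_col="position", key_col="foobar"):
    """Value of the first row in the next run of key_col, minus each row's own value."""
    # Label consecutive runs of equal key values: 1, 1, 2, 2, 3, ...
    run_id = df[key_col].ne(df[key_col].shift()).cumsum()
    # First value of value_col within each run
    first_of_run = df.groupby(run_id)[value_col].first()
    # Look up the first value of the following run; the last run has none, giving NaN
    return run_id.add(1).map(first_of_run) - df[value_col]

df = pd.DataFrame(
    {"position": [10, 440, 790, 800, 7000],
     "foobar": ["foo", "foo", "bar", "bar", "foo"]},
    index=list("ABCDE"),
)
df["difference"] = diff_to_next_run(df)
print(df)
```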

Details:

print (a.add(1).map(b))
A     790.0
B     790.0
C    7000.0
D    7000.0
E       NaN
Name: foobar, dtype: float64
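The same lookup can probably also be done without groupby and map. The sketch below is an alternative that is not part of the original answer: it keeps position only on the first row of each run, shifts that up by one row, and back-fills so every row sees the start of the next run. The names `starts` and `next_start` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"position": [10, 440, 790, 800, 7000],
     "foobar": ["foo", "foo", "bar", "bar", "foo"]},
    index=list("ABCDE"),
)

# Keep position only where a new run of foobar begins; NaN elsewhere
is_run_start = df["foobar"].ne(df["foobar"].shift())
starts = df["position"].where(is_run_start)

# Shift up one row so a run never sees its own start, then back-fill
# so each row picks up the start of the run that follows it
next_start = starts.shift(-1).bfill()

df["difference"] = next_start - df["position"]
print(df)
```

On the example data this gives the same difference column as the groupby version, with NaN for the last run because no run follows it.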
