Compute delta column with Pandas
Question
I have a dataframe like this:
Name Variable Field
A 2.3 412
A 2.9 861
A 3.5 1703
B 3.5 1731
A 4.0 2609
B 4.0 2539
A 4.6 2821
B 4.6 2779
A 5.2 3048
B 5.2 2979
A 5.8 3368
B 5.8 3216
As you can see, the "Variable" column contains duplicate values. For each such value, I would like to compute the delta (%) between A and B. The dataframe I want to generate is:
Name Variable Field Ref field (A) Delta (A - B)
A 2.3 412 412 0.0%
A 2.9 861 861 0.0%
A 3.5 1703 1703 0.0%
B 3.5 1731 1703 -1.6%
A 4.0 2609 2609 0.0%
B 4.0 2539 2609 2.8%
A 4.6 2821 2821 0.0%
B 4.6 2779 2821 1.5%
A 5.2 3048 3048 0.0%
B 5.2 2979 3048 2.3%
A 5.8 3368 3368 0.0%
B 5.8 3216 3368 4.7%
I have already tried a few things with pandas, such as:
df["Ref field (A)"] = df.apply(lambda row:df[(df["Variable"] == row["Variable"]) & (df["Name"] == "A")]["Field"][0],axis=1)
But it just doesn't work:
File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: (0, u'occurred at index 0')
Any ideas for something simple that would work? Thank you.
Recommended answer
With only one 'A' value per 'Variable' group, create a Series indexed by 'Variable' and map the values to get the reference.
s = df[df.Name.eq('A')].set_index('Variable').Field
df['RefA'] = df.Variable.map(s)
df['Delta'] = (df.RefA - df.Field)/df.Field*100
Output (a 'C' row was added to the 3.5 group, and a row was added at the end for a group that has only 'B'):
Name Variable Field RefA Delta
0 A 2.3 412 412.0 0.000000
1 A 2.9 861 861.0 0.000000
2 A 3.5 1703 1703.0 0.000000
3 B 3.5 1731 1703.0 -1.617562
4 C 3.5 1761 1703.0 -3.293583
5 A 4.0 2609 2609.0 0.000000
6 B 4.0 2539 2609.0 2.756991
7 A 4.6 2821 2821.0 0.000000
8 B 4.6 2779 2821.0 1.511335
9 A 5.2 3048 3048.0 0.000000
10 B 5.2 2979 3048.0 2.316213
11 A 5.8 3368 3368.0 0.000000
12 B 5.8 3216 3368.0 4.726368
13 B 6.5 1231 NaN NaN
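For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same map-based approach. The sample data is copied from the output above. The uniqueness check and the formatted "Delta (%)" column are additions for illustration and were not part of the original answer.

```python
import pandas as pd

# Sample data from the question, plus the two extra rows used in the answer
# (a 'C' row at Variable 3.5 and a 'B' row at 6.5 that has no matching 'A').
df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "C", "A", "B", "A", "B", "A", "B", "A", "B", "B"],
    "Variable": [2.3, 2.9, 3.5, 3.5, 3.5, 4.0, 4.0, 4.6, 4.6, 5.2, 5.2, 5.8, 5.8, 6.5],
    "Field": [412, 861, 1703, 1731, 1761, 2609, 2539, 2821, 2779, 3048, 2979, 3368, 3216, 1231],
})

# Lookup Series: Variable -> Field of the 'A' row for that Variable.
ref = df.loc[df["Name"].eq("A")].set_index("Variable")["Field"]

# The approach relies on at most one 'A' row per Variable (added check).
assert ref.index.is_unique

# Rows whose Variable has no 'A' row get NaN as their reference.
df["RefA"] = df["Variable"].map(ref)
df["Delta"] = (df["RefA"] - df["Field"]) / df["Field"] * 100

# Optional: a display column like the one in the question, e.g. "2.8%".
df["Delta (%)"] = df["Delta"].map(lambda x: "" if pd.isna(x) else f"{x:.1f}%")

print(df)
```

The delta is divided by each row's own Field value, not by the 'A' reference. That is what reproduces the -1.6% shown for the first 'B' row in the question's expected table.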