Pandas/Python复杂的条件和 [英] Pandas/Python complex, conditional sum
问题描述
df_have
CONTRACT ID AMT REL_NUM HDR_NUM
1 1 0.00 0 1
1 2 33.85 1 2
1 3 0.72 2 2
1 4 0.87 1 1
1 5 1.67 1 2
df_want
CONTRACT ID AMT REL_NUM HDR_NUM CALCULATION
1 1 0.00 0 1 (0.00+33.85+0.87+1.67)
1 2 33.85 1 2 (33.85+0.72)
1 3 0.72 2 2 (33.85+0.72)
1 4 0.87 1 1 (0.00+33.85+0.87+1.67)
1 5 1.67 1 2 (33.85+0.72)
df_getting
df_getting
CONTRACT ID AMT REL_NUM HDR_NUM CALCULATION
1 1 0.00 0 1 21.75
1 2 33.85 1 2 2.00
1 3 0.72 2 2 19.75
1 4 0.87 1 1 33.85
1 5 1.67 1 2 0.00
尝试创建新列"CALCULATION",但逻辑有些棘手.计算应为AMT字段的总和,具体取决于CONTRACT,ID,REL_NUM和HDR_NUM字段.
Trying to create a new column 'CALCULATION' but the logic is a bit tricky. The calculation should be the SUM of the AMT field dependent on the CONTRACT, ID, REL_NUM and HDR_NUM fields.
第1步-检查HDR_NUM字段并获取ID = HDR_NUM并且CONTRACT字段相同的相应AMT值
Step 1- Check the HDR_NUM field and get the corresponding AMT value where ID=HDR_NUM AND the CONTRACT field is the same
第2步-在所有AMT字段中添加相同合同的REL_NUM = HDR_NUM
Step 2- Add in all AMT fields where the REL_NUM= HDR_NUM for the same CONTRACT
对于第一行,这将对ID = 2,3的AMT字段求和,对于CONTRACT = 1的AMT字段求和. 对于第6行,这将对ID = 2,4(对于CONTRACT = 2)的AMT字段求和
For the first row, this would be summing the AMT fields for ID= 2,3 and 4 for CONTRACT =1. For row 6, this would be summing the AMT fields for ID=2,4 for CONTRACT=2
一个警告是不应进行重复计数(即对于第6行,将ID = 2,4的AMT字段总计为CONTRACT = 2的,请勿重复计算ID = 2)
One caveat is that double counting should not occur (i.e. for row 6, sum the AMT fields for ID=2,4 for CONTRACT=2, do not double count ID=2)
推荐答案
IIUC,
def F(s):
rule1 = s[['ID', 'AMT']].set_index('ID').to_dict()['AMT']
rule2 = s[['REL_NUM', 'AMT']].groupby('REL_NUM').sum().to_dict()['AMT']
s1 = s['HDR_NUM'].astype(int).map(rule1).fillna(0)
s2 = s['HDR_NUM'].astype(int).map(rule2).fillna(0)
return s1 + s2
df['CALCULATION'] = df.groupby('CONTRACT').apply(F).values.ravel()
CONTRACT ID AMT REL_NUM HDR_NUM CALCULATION
0 1 1 0.00 0 1 36.39
1 1 2 33.85 1 2 34.57
2 1 3 0.72 2 2 34.57
3 1 4 0.87 1 1 36.39
4 1 5 1.67 1 2 34.57
这篇关于Pandas/Python复杂的条件和的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!