Subtract 1 from the next cumsum if the current cumsum is at or above a particular value - pandas or numpy
Problem description
I have a data frame as shown below:
B_ID  Session  no_show  cumulative_no_show
1     s1       0.4      0.4
2     s1       0.6      1.0
3     s1       0.2      1.2
4     s1       0.1      1.3
5     s1       0.4      1.7
6     s1       0.2      1.9
7     s1       0.3      2.2
10    s2       0.3      0.3
11    s2       0.4      0.7
12    s2       0.3      1.0
13    s2       0.6      1.6
14    s2       0.2      1.8
15    s2       0.5      2.3
Here cumulative_no_show is the cumulative sum of no_show within each Session.
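For reference, the frame above can be rebuilt as follows. This is a minimal sketch that assumes cumulative_no_show was obtained with a per-Session cumsum; the question does not show how the column was actually computed.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "B_ID": [1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15],
    "Session": ["s1"] * 7 + ["s2"] * 6,
    "no_show": [0.4, 0.6, 0.2, 0.1, 0.4, 0.2, 0.3,
                0.3, 0.4, 0.3, 0.6, 0.2, 0.5],
})

# Assumption: the cumulative sum restarts for every Session.
df["cumulative_no_show"] = df.groupby("Session")["no_show"].cumsum()

# Round only for display; the stored values are ordinary floats.
print(df.round(2))
```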
From the above I would like to create a new column called u_no_show based on the condition below.
Whenever cumulative_no_show >= 0.8, subtract 1 from all the following cumulative_no_show values in that Session. The same check is then applied to the reduced values, and so on.
Expected output:
B_ID  Session  no_show  cumulative_no_show  u_no_show
1     s1       0.4      0.4                 0.4
2     s1       0.6      1.0                 1.0
3     s1       0.2      1.2                 0.2
4     s1       0.1      1.3                 0.3
5     s1       0.4      1.7                 0.7
6     s1       0.2      1.9                 0.9
7     s1       0.3      2.2                 0.2
10    s2       0.3      0.3                 0.3
11    s2       0.4      0.7                 0.7
12    s2       0.3      1.0                 1.0
13    s2       0.6      1.6                 0.6
14    s2       0.2      1.8                 0.8
15    s2       0.5      2.3                 0.3
Recommended answer
I assume you want to perform this per Session. I'm not sure there is a vectorized solution, so I would create a function that iterates over the values and does the subtraction when needed, then use groupby.apply:
import pandas as pd

def create_u_no_show(ser):
    # convert to a numpy array; copy so the original column is left untouched
    arr_ns = ser.to_numpy(copy=True)
    for i in range(len(arr_ns) - 1):
        # check whether the (already adjusted) value meets the condition
        if arr_ns[i] >= 0.8:
            # subtract 1 from all the following values if the condition is met
            arr_ns[i + 1:] -= 1
    # return a Series with the original index
    return pd.Series(arr_ns, index=ser.index)

# group_keys=False keeps the original row index on the result (matters in newer pandas versions)
df['u_no_show'] = df.groupby('Session', group_keys=False)['cumulative_no_show'].apply(create_u_no_show)
print(df)

    B_ID Session  no_show  cumulative_no_show  u_no_show
0      1      s1      0.4                 0.4        0.4
1      2      s1      0.6                 1.0        1.0
2      3      s1      0.2                 1.2        0.2
3      4      s1      0.1                 1.3        0.3
4      5      s1      0.4                 1.7        0.7
5      6      s1      0.2                 1.9        0.9
6      7      s1      0.3                 2.2        0.2
7     10      s2      0.3                 0.3        0.3
8     11      s2      0.4                 0.7        0.7
9     12      s2      0.3                 1.0        1.0
10    13      s2      0.6                 1.6        0.6
11    14      s2      0.2                 1.8        0.8
12    15      s2      0.5                 2.3        0.3
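
The loop above subtracts from the whole tail of the array each time the condition fires, which is quadratic in the worst case. One possible alternative is a single pass that keeps a running offset instead. This is only a sketch and not part of the original answer: u_no_show_single_pass is a hypothetical helper name, the sample frame is rebuilt from the question's table, and the comparison uses >= 0.8 as stated in the question.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table.
df = pd.DataFrame({
    "B_ID": [1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15],
    "Session": ["s1"] * 7 + ["s2"] * 6,
    "cumulative_no_show": [0.4, 1.0, 1.2, 1.3, 1.7, 1.9, 2.2,
                           0.3, 0.7, 1.0, 1.6, 1.8, 2.3],
})

def u_no_show_single_pass(ser, threshold=0.8):
    """Hypothetical helper: same rule, tracked with a running offset."""
    out = np.empty(len(ser), dtype=float)
    offset = 0  # how many times 1 has been subtracted so far
    for i, value in enumerate(ser.to_numpy()):
        out[i] = value - offset
        # once the adjusted value reaches the threshold,
        # every later row in the group loses one more
        if out[i] >= threshold:
            offset += 1
    return pd.Series(out, index=ser.index)

# transform applies the helper per Session and keeps the original row index
df["u_no_show"] = (
    df.groupby("Session")["cumulative_no_show"].transform(u_no_show_single_pass)
)
print(df.round(2))
```

Because the comparison is made on floats, a value that lands slightly below 0.8 through rounding error (for example, 0.1 + 0.7 evaluates to 0.7999999999999999) would not trigger the subtraction, so a small tolerance may be needed on real data.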