Look for patterns in a column of a pandas dataframe based on the value of another column

Problem description

I have the following dataframe:

In each row where key==1, I would like to search the s_w column for two occurrences of 1 before that row and two occurrences of 1 after it, then sum the values of v for those four rows and put the result in a new column X. These occurrences of 1 do not have to be consecutive; there can be gaps between the 1s in the s_w column, for example 11....11 or 101....10001. If we fail to find two 1s in the s_w column either before or after that row (where key==1), we put NaN in column X. Rows where key==0 also get NaN.

A new dataframe to test whether a solution generalizes well:

import pandas as pd

df = pd.DataFrame({
    "p":   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "l":   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "w":   [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 12],
    "s_w": [1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1],
    "key": [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1],
    "v":   [2, 2, 5, 3, 4, 5, 5, 1, 2, 3, 4, 5, 4],
})

Recommended answer

I think the only change needed to the previous answer is to add a mask with Series.where:

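# Keep only the rows where s_w == 1, grouped by p and l so shifts stay inside each group
g = df[df['s_w'].eq(1)].groupby(['p','l'])['v']
# Add the two next and two previous v values among those rows; keep the sum only where key == 1
df['c_s'] = g.shift(-1).add(g.shift(-2)).add(g.shift(2)).add(g.shift(1)).where(df['key'].eq(1))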


print (df)
    p  l   w  s_w  key  v   c_s
0   1  1   1    1    1  2   NaN
1   1  1   2    1    1  2   NaN
2   1  1   3    0    0  5   NaN
3   1  1   4    0    0  3   NaN
4   1  1   5    0    0  4   NaN
5   1  1   6    1    1  5  10.0 <- 2 + 2 + 5 + 1
6   1  1   7    1    0  5   NaN
7   1  1   8    1    1  1  19.0 <- 5 + 5 + 5 + 4
8   1  1   9    0    0  2   NaN
9   1  1  10    0    0  3   NaN
10  1  1  11    0    0  4   NaN
11  1  1  12    1    0  5   NaN
12  1  1  12    1    1  4   NaN
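
Note that the printed output above has s_w equal to 1 in rows 5 to 7, so it appears to come from the original dataframe rather than the test dataframe in the question. Because the answer filters on s_w == 1 before shifting, a row with key == 1 but s_w == 0 (rows 5 and 7 of the test dataframe) is dropped before the shift and ends up as NaN. The following is a minimal sketch of one way to handle that case, not part of the original answer. The helper name `sum_two_before_after` is hypothetical, and the sketch assumes a unique index and that "before" and "after" mean strictly before and strictly after the row, within each (p, l) group.

```python
import numpy as np
import pandas as pd


def sum_two_before_after(df, group_cols=("p", "l")):
    """Hypothetical helper: for each key == 1 row, sum v over the two nearest
    s_w == 1 rows strictly before it and the two strictly after it."""
    result = pd.Series(np.nan, index=df.index, dtype=float)
    for _, g in df.groupby(list(group_cols), sort=False):
        v = g["v"].to_numpy(dtype=float)
        # Positions (within the group) of the rows where s_w == 1
        ones = np.flatnonzero(g["s_w"].to_numpy() == 1)
        out = np.full(len(g), np.nan)
        for i in np.flatnonzero(g["key"].to_numpy() == 1):
            before = ones[ones < i][-2:]  # last two 1s strictly before row i
            after = ones[ones > i][:2]    # first two 1s strictly after row i
            # Leave NaN unless two 1s exist on both sides
            if len(before) == 2 and len(after) == 2:
                out[i] = v[before].sum() + v[after].sum()
        result.loc[g.index] = out  # assumes the index labels are unique
    return result


# Test dataframe from the question
df = pd.DataFrame({
    "p":   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "l":   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "w":   [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 12],
    "s_w": [1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1],
    "key": [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1],
    "v":   [2, 2, 5, 3, 4, 5, 5, 1, 2, 3, 4, 5, 4],
})

df["X"] = sum_two_before_after(df)
print(df)
```

Under this reading, row 5 of the test dataframe gets 2 + 2 + 5 + 5 = 14 and row 7 gets 2 + 5 + 5 + 4 = 16, while every other row stays NaN. The explicit loop over key == 1 rows trades speed for clarity; on large frames a vectorized variant would likely be preferable.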
