Pandas: delete and shift cells in a column based on multiple conditions


Question

I want to delete and shift cells in a pandas DataFrame based on some conditions. My DataFrame looks like this:

Value_1      ID_1      Value_2      ID_2         Value_3      ID_3
   A           1            D         1               G          1
   B           1            E         2               H          1
   C           1            F         2               I          3
   C           1            F         2               H          1

Now I want to apply the following condition:

ID_2 and ID_3 should always be less than or equal to ID_1. If either of them is greater than ID_1, that cell (together with its paired Value cell) should be deleted and the cell from the next column shifted into its place.

The output should look like this:

    Value_1      ID_1      Value_2      ID_2         Value_3      ID_3
       A           1            D         1               G          1
       B           1            H         1           blank        nan
       C           1       blank        nan           blank        nan
       C           1            H         1           blank        nan

Recommended answer

You can build a boolean mask from the condition, here marking values greater than ID_1, with DataFrame.gt:

cols1 = ['Value_2','Value_3']
cols2 = ['ID_2','ID_3']

m = df[cols2].gt(df['ID_1'], axis=0)
print (m)
    ID_2   ID_3
0  False  False
1   True  False
2   True   True
3   True  False
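
The snippet above assumes `df` already exists. As a minimal sketch for reproducing it, the sample frame from the question can be rebuilt by hand; building it from a dict is an assumption about how the data was loaded, not something the question states:

```python
import pandas as pd

# Sample frame from the question (the dict construction is an assumption).
df = pd.DataFrame({
    "Value_1": ["A", "B", "C", "C"],
    "ID_1": [1, 1, 1, 1],
    "Value_2": ["D", "E", "F", "F"],
    "ID_2": [1, 2, 2, 2],
    "Value_3": ["G", "H", "I", "H"],
    "ID_3": [1, 1, 3, 1],
})

cols2 = ["ID_2", "ID_3"]

# axis=0 aligns the ID_1 Series with the row index, so every row of
# ID_2/ID_3 is compared against that same row's ID_1.
m = df[cols2].gt(df["ID_1"], axis=0)
```

Without `axis=0`, the Series would be aligned against the column labels instead of the rows, which is not the comparison wanted here.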

Then replace the cells matched by the mask with missing values using DataFrame.mask:

df[cols2] = df[cols2].mask(m) 
df[cols1] = df[cols1].mask(m.to_numpy()) 
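
The `to_numpy()` call in the second line matters because the mask's columns are named ID_2 and ID_3, while the value columns are named Value_2 and Value_3; a DataFrame condition is aligned by label, whereas a plain array is applied by position. A small sketch of the positional case, using a hypothetical two-row frame rather than the full sample:

```python
import pandas as pd

# Hypothetical two-row frame, only for illustrating positional masking.
df = pd.DataFrame({
    "Value_2": ["D", "E"],
    "ID_2": [1, 2],
    "Value_3": ["G", "H"],
    "ID_3": [1, 1],
})
m = pd.DataFrame({"ID_2": [False, True], "ID_3": [False, False]})

# A boolean array carries no labels, so it is applied cell by cell:
# column 0 of m masks Value_2 and column 1 of m masks Value_3.
masked = df[["Value_2", "Value_3"]].mask(m.to_numpy())
```

Only the cell holding "E" becomes missing, since it is the only position where the array is True.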

Next, shift the columns one position to the left with DataFrame.shift and assign the new column values with Series.mask:

df1 = df[cols2].shift(-1, axis=1)
df['ID_2'] =  df['ID_2'].mask(m['ID_2'], df1['ID_2'])
df['ID_3'] =  df['ID_3'].mask(m['ID_2'])

df2 = df[cols1].shift(-1, axis=1)
df['Value_2'] =  df['Value_2'].mask(m['ID_2'], df2['Value_2'])
df['Value_3'] =  df['Value_3'].mask(m['ID_2'])

print (df)
  Value_1  ID_1 Value_2  ID_2 Value_3  ID_3
0       A     1       D   1.0       G   1.0
1       B     1       H   1.0     NaN   NaN
2       C     1     NaN   NaN     NaN   NaN
3       C     1       H   1.0     NaN   NaN

Finally, replace the missing values in the value columns with empty strings:

df[cols1] = df[cols1].fillna('')
print (df)
  Value_1  ID_1 Value_2  ID_2 Value_3  ID_3
0       A     1       D   1.0       G   1.0
1       B     1       H   1.0           NaN
2       C     1           NaN           NaN
3       C     1       H   1.0           NaN
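
The steps above are written for exactly two (Value, ID) pairs. If more pairs are involved, one possible generalization is to blank out the offending cells and then pack the surviving cells to the left with a stable sort. This is a sketch under assumptions, not part of the original answer: `drop_and_shift` is a hypothetical helper name, and it assumes the ID columns are numeric with no missing values to begin with.

```python
import numpy as np
import pandas as pd


def drop_and_shift(df, value_cols, id_cols, ref_col):
    """Blank (value, id) pairs whose id exceeds ref_col, then pack the
    remaining pairs to the left. Hypothetical helper for illustration."""
    ids = df[id_cols].to_numpy(dtype=float)
    vals = df[value_cols].to_numpy(dtype=object)
    # True where an ID is greater than the reference ID of its row.
    bad = ids > df[ref_col].to_numpy()[:, None]

    # A stable argsort on the boolean mask puts kept cells (False) before
    # dropped cells (True) while preserving their left-to-right order.
    order = np.argsort(bad, axis=1, kind="stable")
    ids = np.take_along_axis(np.where(bad, np.nan, ids), order, axis=1)
    vals = np.take_along_axis(np.where(bad, "", vals), order, axis=1)

    out = df.copy()
    out[id_cols] = ids
    out[value_cols] = vals
    return out


# Usage with the sample data from the question.
df = pd.DataFrame({
    "Value_1": ["A", "B", "C", "C"],
    "ID_1": [1, 1, 1, 1],
    "Value_2": ["D", "E", "F", "F"],
    "ID_2": [1, 2, 2, 2],
    "Value_3": ["G", "H", "I", "H"],
    "ID_3": [1, 1, 3, 1],
})
result = drop_and_shift(df, ["Value_2", "Value_3"], ["ID_2", "ID_3"], "ID_1")
```

For the two-pair sample this should give the same frame as the step-by-step version; the NumPy route simply avoids writing one pair of assignments per column.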
