How to remove part of a dataframe column value based on another column

Problem description

I have a dataframe like the one below.

I am trying to remove from the Main column the string that appears in the substring column.

Main                     substring
Sri playnig well cricket cricket
sri went out             NaN
Ram is in                NaN
Ram went to UK,US        UK,US

My expected output is:

Main                     substring
Sri playnig well         cricket
sri went out             NaN
Ram is in                NaN
Ram went to              UK,US

I tried df["Main"].str.reduce(df["substring"]), but it did not work. Please help.

Recommended answer

This is one way, using pd.DataFrame.apply. Note that np.nan == np.nan evaluates to False, so a value that is not equal to itself must be NaN. We can use this trick in our function to decide when to apply the removal logic.
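As a quick check of the NaN comparison mentioned above, the snippet below (a minimal illustration, not part of the original answer) shows that NaN is not equal to itself and that pd.isna answers the same question more explicitly. The variable names are only for illustration.

```python
import numpy as np
import pandas as pd

value = np.nan  # illustrative missing value

# NaN is not equal to itself, so this comparison is True only for NaN
is_nan_by_comparison = value != value

# pd.isna is the explicit way to ask the same question
is_nan_by_isna = pd.isna(value)

print(is_nan_by_comparison, is_nan_by_isna)
```

Either check can be used inside a row-wise function; the self-comparison is shorter, while pd.isna states the intent directly.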

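import numpy as np
import pandas as pd

df = pd.DataFrame({'Main': ['Sri playnig well cricket', 'sri went out',
                            'Ram is in', 'Ram went to UK,US'],
                   'substring': ['cricket', np.nan, np.nan, 'UK,US']})

def remover(row):
    sub = row['substring']
    # NaN is not equal to itself, so this detects a missing substring
    if sub != sub:
        return row['Main']
    # Drop every whitespace-separated token that equals the substring exactly
    lst = row['Main'].split()
    return ' '.join([i for i in lst if i != sub])

# Apply the function row by row (axis=1) and overwrite the Main column
df['Main'] = df.apply(remover, axis=1)

print(df)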

               Main substring
0  Sri playnig well   cricket
1      sri went out       NaN
2         Ram is in       NaN
3       Ram went to     UK,US
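The split-based approach above removes the substring only when it matches a whole whitespace-separated token, so a multi-word substring such as "well cricket" would be left in place. One possible variant is sketched below, assuming plain str.replace semantics are acceptable (it removes the substring wherever it occurs, even inside a longer word). The helper name remove_substring is hypothetical and does not come from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Main': ['Sri playnig well cricket', 'sri went out',
                            'Ram is in', 'Ram went to UK,US'],
                   'substring': ['cricket', np.nan, np.nan, 'UK,US']})

def remove_substring(row):
    """Hypothetical helper: drop row['substring'] from row['Main'] if present."""
    sub = row['substring']
    if pd.isna(sub):
        # Nothing to remove for rows with a missing substring
        return row['Main']
    # Remove every occurrence, then collapse any leftover whitespace
    return ' '.join(row['Main'].replace(sub, '').split())

df['Main'] = df.apply(remove_substring, axis=1)
print(df)
```

On the sample data this should give the same Main column as the answer's function. Which behaviour is preferable depends on whether partial-word matches should be removed.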
