Modification of skipping empty list and continuing with function

Problem description

Background

The following code is from "skipping empty list and continuing with function":

import pandas as pd

Names = [['Jon', 'Smith', 'jon', 'John'],
         [],
         ['Bob', 'bobby', 'Bobs'],
         [],
         []]
df = pd.DataFrame({'Text': ['Jon J Smith is Here and jon John from ',
                            'get nothing from here',
                            'I like Bob and bobby and also Bobs diner ',
                            'nothing here too',
                            'same here'],
                   'P_ID': [1, 2, 3, 4, 5],
                   'P_Name': Names})

# rearrange columns
df = df[['Text', 'P_ID', 'P_Name']]
df

                                 Text         P_ID  P_Name
0   Jon J Smith is Here and jon John from       1   [Jon, Smith, jon, John]
1   get nothing from here                       2   []
2   I like Bob and bobby and also Bobs diner    3   [Bob, bobby, Bobs]
3   nothing here too                            4   []
4   same here                                   5   []

Working code

The following working code is taken from "skipping empty list and continuing with function":

m = df['P_Name'].str.len().ne(0)
df.loc[m, 'New'] = df.loc[m, 'Text'].replace(df.loc[m].P_Name,'**BLOCK**',regex=True) 

and gives the following df:

            Text   P_ID  P_Name   New
0                                 **BLOCK** J **BLOCK** is Here and **BLOCK** **BLOCK** ...
1                                 NaN
2                                 I like **BLOCK** and **BLOCK** and also **BLOCK** d..
3                                 NaN 
4                                 NaN
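
The NaN rows arise because only the rows selected by the mask m receive a value; the other rows of the new column are left unfilled. A minimal sketch of what the mask looks like, using a hypothetical stand-alone Series named names in place of df['P_Name']:

```python
import pandas as pd

# Hypothetical stand-in for df['P_Name']: each element is a list of names.
names = pd.Series([['Jon', 'Smith', 'jon', 'John'], [], ['Bob', 'bobby', 'Bobs'], [], []])

# .str.len() returns the length of each list; .ne(0) flags the non-empty ones.
lengths = names.str.len()
m = lengths.ne(0)

print(lengths.tolist())
print(m.tolist())
```

Rows where m is False are never assigned, which is why they show up as NaN in New.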

Desired output

However, instead of NaN in rows 1, 3, and 4, I would like to keep the original text, e.g. get nothing from here, as seen below:

            Text   P_ID  P_Name   New
0                                 **BLOCK** J **BLOCK** is Here and **BLOCK** **BLOCK** ...
1                                 get nothing from here
2                                 I like **BLOCK** and **BLOCK** and also **BLOCK** d..
3                                 nothing here too 
4                                 same here

Question

How do I tweak the code below to achieve my desired output?

m = df['P_Name'].str.len().ne(0)
df.loc[m, 'New'] = df.loc[m, 'Text'].replace(df.loc[m].P_Name,'**BLOCK**',regex=True)  

Recommended answer

@tawab_shakeel is close. Just add:

# assign the result back instead of using inplace=True on a column selection
df['New'] = df['New'].fillna(df['Text'])

fillna fills the missing values in New with the corresponding values from df['Text'].
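
As a rough end-to-end illustration of the fillna step, the sketch below builds New only for the masked rows and then fills the gaps from Text. It is not the asker's exact code: the masked replacement is done here with a list comprehension over re.sub (with the names escaped via re.escape), and the helper Series blocked is a name introduced only for this example.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Text': ['Jon J Smith is Here and jon John from ',
             'get nothing from here',
             'I like Bob and bobby and also Bobs diner '],
    'P_Name': [['Jon', 'Smith', 'jon', 'John'], [], ['Bob', 'bobby', 'Bobs']],
})

# Mask of rows whose name list is non-empty.
m = df['P_Name'].str.len().ne(0)

# Replace names only in the masked rows ("blocked" is a hypothetical helper).
blocked = pd.Series(
    [re.sub('|'.join(map(re.escape, names)), '**BLOCK**', text)
     for names, text in zip(df.loc[m, 'P_Name'], df.loc[m, 'Text'])],
    index=df.index[m],
)

# Index alignment leaves NaN in the rows that were skipped.
df['New'] = blocked

# Fill those NaN rows with the original text.
df['New'] = df['New'].fillna(df['Text'])
print(df['New'].tolist())
```

Because fillna aligns on the index, each skipped row picks up the text from its own row.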

I can also propose an alternative solution using the re module for regex.

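import re

def replacing(x):
    # rows with at least one name: replace every name with the placeholder
    if len(x['P_Name']) > 0:
        return re.sub('|'.join(x['P_Name']), '**BLOCK**', x['Text'])
    # rows with an empty list: keep the original text
    else:
        return x['Text']

df['New'] = df.apply(replacing, axis=1)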

The apply method applies the replacing function to each row, and substitution is done by the re.sub function.
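
One possible refinement of the row-wise approach, offered as a sketch rather than as part of the original answer: escape each name with re.escape so it is matched literally, and try longer names first so that a name such as Bobs is not partially matched by Bob. The function name block_names and its placeholder parameter are hypothetical.

```python
import re
import pandas as pd

def block_names(row, placeholder='**BLOCK**'):
    """Replace every name listed in row['P_Name'] inside row['Text']."""
    names = row['P_Name']
    if not names:
        # Empty list: nothing to replace, keep the original text.
        return row['Text']
    # Longest names first, so 'Bobs' is tried before 'Bob'.
    ordered = sorted(names, key=len, reverse=True)
    # Escape the names so regex metacharacters are treated literally.
    pattern = '|'.join(re.escape(name) for name in ordered)
    return re.sub(pattern, placeholder, row['Text'])

df = pd.DataFrame({
    'Text': ['I like Bob and bobby and also Bobs diner ', 'same here'],
    'P_Name': [['Bob', 'bobby', 'Bobs'], []],
})
df['New'] = df.apply(block_names, axis=1)
print(df['New'].tolist())
```

Whether the longest-first ordering is wanted depends on the data; without it, the alternation matches Bob inside Bobs and leaves the trailing s in place.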
