Remove substring from column based on another column


Question

I am trying to use the values (as strings) from one column to determine what gets removed from another column. The rest of that column's content must stay unchanged.

Sample data:

import pandas as pd

dfTest = pd.DataFrame({
    'date': ['190225', '190225', '190226'],
    'foo': ['190225-file1_190225', '190225-file2_190225', '190226-file3_190226']
})

dfTest

Resulting DataFrame:

   |    date   |          foo
------------------------------------
0  |   190225  | 190225-file1_190225
1  |   190225  | 190225-file2_190225
2  |   190226  | 190226-file3_190226

I need to create a 'bar' column that contains 'foo' with every occurrence of that row's 'date' value removed.

This is what I am looking for:

   |    date   |         foo          |   bar
-----------------------------------------------
0  |   190225  | 190225-file1_190225  | -file1_
1  |   190225  | 190225-file2_190225  | -file2_
2  |   190226  | 190226-file3_190226  | -file3_

The contents of the 'date' column need to be removed from 'foo' in each row, whether they appear at the beginning, in the middle, or at the end.

I have tried a few things like the code below, but it doesn't work: it just replicates the original column without replacing anything. Note that setting regex=False does not change the result.

dfTest['bar'] = dfTest['foo'].str.replace(str(dfTest['date']), '')

#or (removing .str, gives same result):

#dfTest['bar'] = dfTest['foo'].replace(str(dfTest['date']), '')

Both produce the table below ('bar' is identical to 'foo'):

   |    date   |         foo          |         bar
-----------------------------------------------------------
0  |   190225  | 190225-file1_190225  | 190225-file1_190225  
1  |   190225  | 190225-file2_190225  | 190225-file2_190225  
2  |   190226  | 190226-file3_190226  | 190226-file3_190226  

How can I remove the contents of the 'date' column while otherwise preserving the original data?

Recommended answer

I tried this and it worked well. With axis=1, apply passes one row at a time to the lambda, so each 'foo' value is cleaned using the 'date' from its own row:

dfTest['bar'] = dfTest.apply(lambda row: row['foo'].replace(str(row['date']), ''), axis=1)
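
As a side note on why the attempt in the question changes nothing: str(dfTest['date']) converts the whole Series into one multi-line string (its printed form), and that string never occurs inside any 'foo' value. The sketch below reuses the sample data from the question to show this, then builds 'bar' with a list comprehension over zip. The comprehension is an alternative written for illustration, not part of the original answer; it should give the same result as the apply call above.

```python
import pandas as pd

dfTest = pd.DataFrame({
    'date': ['190225', '190225', '190226'],
    'foo': ['190225-file1_190225', '190225-file2_190225', '190226-file3_190226']
})

# str() on a whole Series yields its printed representation (index labels,
# values, name and dtype in one multi-line string), not a per-row value.
pattern = str(dfTest['date'])
print(repr(pattern))

# That long string is not contained in any 'foo' value, so the replace
# call in the question finds nothing to remove.
assert not any(pattern in value for value in dfTest['foo'])

# Row-wise alternative (illustrative): pair each 'foo' with its own 'date'
# and use Python's built-in str.replace, which removes every occurrence
# of the substring.
dfTest['bar'] = [
    foo.replace(str(date), '')
    for foo, date in zip(dfTest['foo'], dfTest['date'])
]
print(dfTest)
```

Both versions do a plain substring replacement, so characters in 'date' that are special in regular expressions would be treated literally.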
