pandas search for substring over multiple columns
Question
I have a df like this:
c_name f_name
0 abc abc12
1 xyz abc1
2 mnq mnq2
The goal is to find a substring across the two columns and know which column it belongs to. c_name takes precedence: if the substring appears in both columns, the match is attributed to c_name. For example, if I search for abc in the above dataframe, I should somehow get row 0 (abc) for c_name and row 1 (abc1) for f_name.
To solve this I started with
df[df['c_name'].str.contains('abc', case=False)]
which gives me the results for c_name. The question now is how to run the same operation on f_name while excluding the rows that already matched on c_name. Any help is greatly appreciated!
Recommended answer
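import pandas as pd

# Sample data: row 0 matches in both columns, row 1 only in f_name, row 2 in neither
rows = [['abcx', 'abcy'],
        ['efg', 'abcz'],
        ['higj', 'UK']]
df = pd.DataFrame(rows, columns=['c_name', 'f_name'])

# Rows where c_name contains the substring (c_name has priority)
in_c_name = df['c_name'].str.contains('abc', case=False)
print(df[in_c_name])

# From the remaining rows, keep those where f_name contains the substring
delta_df = df[~in_c_name]
print(delta_df[delta_df['f_name'].str.contains('abc', case=False)])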
Output
c_name f_name
0 abcx abcy
c_name f_name
1 efg abcz
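The two filtered frames above can also be combined into a single result that records which column produced the match. The following is only a sketch under some assumptions: the helper name find_substring and the extra column matched_in are made up for illustration, the search is treated as a literal substring (regex=False), and missing values are treated as non-matches (na=False). It uses the dataframe from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "c_name": ["abc", "xyz", "mnq"],
    "f_name": ["abc12", "abc1", "mnq2"],
})

def find_substring(frame, needle, columns=("c_name", "f_name")):
    """Hypothetical helper: return the matching rows, labelled with the
    first column (in priority order) that contains `needle`."""
    # One boolean mask per column, case-insensitive literal match.
    masks = [
        frame[col]
        .str.contains(needle, case=False, regex=False, na=False)
        .to_numpy(dtype=bool)
        for col in columns
    ]
    # np.select takes the first condition that is True for each row,
    # so earlier columns in `columns` win when several match.
    matched_in = np.select(masks, list(columns), default="")
    result = frame.assign(matched_in=matched_in)
    # Drop rows where no column matched.
    return result[result["matched_in"] != ""]

hits = find_substring(df, "abc")
print(hits)
```

With the question's data, row 0 is labelled c_name (even though f_name also contains abc) and row 1 is labelled f_name; row 2 is dropped. Changing the order of columns changes which column takes precedence.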