Python - Searching a string within a dataframe from a list

Problem description

I have the following list:

search_list = ['STEEL','IRON','GOLD','SILVER']

which I need to search within a dataframe (df):

      a    b             
0    123   'Blah Blah Steel'
1    456   'Blah Blah Blah'
2    789   'Blah Blah Gold'

and insert the matching rows into a new dataframe (newdf), adding a new column c that holds the matching word from the list:

      a    b                   c
0    123   'Blah Blah Steel'   'STEEL'
1    789   'Blah Blah Gold'    'GOLD'

I can use the following code to extract the matching rows:

newdf=df[df['b'].str.upper().str.contains('|'.join(search_list),na=False)]

but I can't figure out how to add the matching word from the list into column c.

I'm thinking that the match somehow needs to capture the index of the matching word in the list and then pull the value using that index, but I can't figure out how to do this.

Any help or pointers would be greatly appreciated.

Thanks

Recommended answer

You could use str.extract and filter out the rows that are NaN (i.e. no match):

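import re
import pandas as pd

search_list = ['STEEL','IRON','GOLD','SILVER']

# Capture the first search word found in column b, ignoring case; no match gives NaN
df['c'] = df.b.str.extract('({0})'.format('|'.join(search_list)), flags=re.IGNORECASE)
# Keep only the rows where a word was found
result = df[~pd.isna(df.c)]

print(result)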

Output

              a       b      c
123 'Blah  Blah  Steel'  Steel
789 'Blah  Blah   Gold'   Gold
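
The extracted text keeps the case it has in column b ('Steel', 'Gold'), while the question asked for the spelling used in the list ('STEEL', 'GOLD'). A minimal, self-contained sketch of one way to get that is shown below. It rebuilds the sample data from the question, and the expand=False argument, the str.upper() step, and the reset_index call are additions that are not part of the original answer.

```python
import re

import pandas as pd

search_list = ['STEEL', 'IRON', 'GOLD', 'SILVER']

# Sample data from the question, without the literal quote characters.
df = pd.DataFrame({
    'a': [123, 456, 789],
    'b': ['Blah Blah Steel', 'Blah Blah Blah', 'Blah Blah Gold'],
})

# One capture group holding all search words as alternatives.
pattern = '({0})'.format('|'.join(search_list))

# expand=False returns a Series; rows without a match get NaN.
matches = df['b'].str.extract(pattern, flags=re.IGNORECASE, expand=False)

# Upper-case the captured text so it matches the spelling in search_list,
# then drop the rows that had no match.
newdf = (
    df.assign(c=matches.str.upper())
      .dropna(subset=['c'])
      .reset_index(drop=True)
)

print(newdf)
```

Upper-casing works here because every entry in search_list is already upper case. If the list used mixed case, a lookup from the upper-cased match back to the list entry would be needed instead.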

Note that you have to import the re module in order to use the re.IGNORECASE flag. Alternatively, you could pass 2 directly, which is the integer value of re.IGNORECASE.
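
As a quick check of that equivalence, the sketch below compares the flag with the integer 2 and runs the same extraction both ways. The two-row Series and the shortened pattern are made up for illustration.

```python
import re

import pandas as pd

# re.IGNORECASE is an integer flag; its value is 2.
flag_value = int(re.IGNORECASE)

s = pd.Series(['Blah Blah Steel', 'Blah Blah Blah'])  # illustrative data

# Same extraction, once with the named flag and once with the bare integer.
with_name = s.str.extract('(STEEL|GOLD)', flags=re.IGNORECASE, expand=False)
with_int = s.str.extract('(STEEL|GOLD)', flags=2, expand=False)

# Series.equals treats NaN values in the same position as equal.
same = with_name.equals(with_int)
print(flag_value, same)
```

The named constant is usually easier to read than the bare number, so the integer form is mostly useful when importing re is not an option.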

Update

As mentioned by @user3483203, you can avoid the import by putting the inline flag (?i) at the start of the pattern:

df['c'] = df.b.str.extract('(?i)({0})'.format('|'.join(search_list)))
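
To see the inline-flag variant end to end, here is a small self-contained sketch that uses the sample data from the question. The expand=False argument and the notna() filter are choices made for this example and are not taken from the original answer.

```python
import pandas as pd

search_list = ['STEEL', 'IRON', 'GOLD', 'SILVER']

# Sample data from the question, without the literal quote characters.
df = pd.DataFrame({
    'a': [123, 456, 789],
    'b': ['Blah Blah Steel', 'Blah Blah Blah', 'Blah Blah Gold'],
})

# (?i) at the start of the pattern makes the whole match case-insensitive,
# so no flags argument (and no import of re) is needed.
pattern = '(?i)({0})'.format('|'.join(search_list))
df['c'] = df['b'].str.extract(pattern, expand=False)

# Keep only the rows where one of the words was found.
result = df[df['c'].notna()]
print(result)
```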
