Search a list of strings in a pandas dataframe and add each search string to a new column


Problem description

I have a dataframe with a text column 'Description', and I have a list of search strings:

search = ['FR-001', 'FR-002', 'FR-003', 'FR-004']

I want to search the dataframe for the strings in the search list. I used:

df.loc[df['Description'].str.contains('|'.join(search), na=False)]
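
For readers who want to run the filter above, here is a minimal sketch. The sample rows are hypothetical (two are borrowed from the output shown later in the answer, the other two are made up to show a non-match and a missing value), and the use of re.escape is an added precaution, not part of the original question: it only matters if a search string contains regex metacharacters.

```python
import re
import pandas as pd

search = ['FR-001', 'FR-002', 'FR-003', 'FR-004']

# Hypothetical sample rows; the None row shows what na=False does.
df = pd.DataFrame({'Description': [
    'AasfasfFR-001,asfasdfafsagsdg FR-002',
    'no code in this row',
    None,
    'AasfasfFR-004, FR-002',
]})

# Join the search strings into one alternation pattern.
# re.escape is an assumption added here to guard against regex metacharacters.
pattern = '|'.join(re.escape(s) for s in search)

# Boolean mask: True where any search string occurs; missing values become False.
mask = df['Description'].str.contains(pattern, na=False)
matches = df.loc[mask]
print(matches)
```

The mask only says whether a row matched, not which strings matched, which is why the filter alone cannot fill the 'FR' column.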

This gives the desired result: all matching rows are returned correctly.

How can I add each successfully matched search string to its matching row in a new dataframe column 'FR'?

Edit

5 rows of the Description column with the expected result column FR:

(Image: sample dataframe)

Recommended answer

I think you need str.findall, which returns every match in each row as a list:

With the sample data from @AndreyF:

search = ['FR-001', 'FR-002', 'FR-003', 'FR-004']
# Collect every search string found in each row as a list.
df['FR'] = df['Description'].str.findall('(' + '|'.join(search) + ')')
print(df)

                            Description                FR
0  AasfasfFR-001,asfasdfafsagsdg FR-002  [FR-001, FR-002]
1                 AasfasfFR-004, FR-002  [FR-004, FR-002]
2         AasfasfFR-02,asfasdfafsagsdg                 []
3  AasfasfFR-001,asfasdfafsagsdg FR-003  [FR-001, FR-003]
4  AasfasfFR-004,asfasdfafsagsdg FR-002  [FR-004, FR-002]
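
The snippet above relies on @AndreyF's sample data, which is not reproduced here. The following self-contained sketch rebuilds a dataframe from the Description values in the printed output (an assumption: the original data may differ in whitespace) and applies the same findall call. The re.escape step is an addition for safety and does not change the result for these search strings.

```python
import re
import pandas as pd

# Description values reconstructed from the printed output above (assumed).
df = pd.DataFrame({'Description': [
    'AasfasfFR-001,asfasdfafsagsdg FR-002',
    'AasfasfFR-004, FR-002',
    'AasfasfFR-02,asfasdfafsagsdg',
    'AasfasfFR-001,asfasdfafsagsdg FR-003',
    'AasfasfFR-004,asfasdfafsagsdg FR-002',
]})

search = ['FR-001', 'FR-002', 'FR-003', 'FR-004']

# One capture group containing all alternatives; findall returns the
# captured text of every match, in order of appearance, as a list per row.
pattern = '(' + '|'.join(re.escape(s) for s in search) + ')'
df['FR'] = df['Description'].str.findall(pattern)
print(df)
```

Row 2 contains 'FR-02', which is not in the search list, so it gets an empty list.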

And if you need to filter out the rows with empty lists:

# Keep only rows whose FR list is non-empty (an empty list is falsy).
df = df[df['FR'].astype(bool)]
print(df)

                            Description                FR
0  AasfasfFR-001,asfasdfafsagsdg FR-002  [FR-001, FR-002]
1                 AasfasfFR-004, FR-002  [FR-004, FR-002]
3  AasfasfFR-001,asfasdfafsagsdg FR-003  [FR-001, FR-003]
4  AasfasfFR-004,asfasdfafsagsdg FR-002  [FR-004, FR-002]
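
If plain text is preferred over lists in the 'FR' column, one possible follow-up (not part of the original answer) is to join each list into a comma-separated string after filtering. The two-row dataframe below is a hypothetical stand-in for the result of the findall step.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe produced by the findall step.
df = pd.DataFrame({
    'Description': [
        'AasfasfFR-001,asfasdfafsagsdg FR-002',
        'AasfasfFR-02,asfasdfafsagsdg',
    ],
    'FR': [['FR-001', 'FR-002'], []],
})

# An empty list is falsy, so astype(bool) is False for rows without matches.
has_match = df['FR'].astype(bool)
filtered = df[has_match].copy()

# Series.str.join concatenates the elements of each list with the separator.
filtered['FR'] = filtered['FR'].str.join(', ')
print(filtered)
```

Keeping the lists is usually more convenient for further processing; the joined string is mainly useful for display or export.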
