Search a list of strings in pandas dataframe and add each search string to a new column
Question
I have a dataframe with a text column 'Description', and I have a list of search strings:
search = ['FR-001', 'FR-002', 'FR-003', 'FR-004']
I want to search the dataframe using the strings in the search list. I used:
df.loc[df['Description'].str.contains('|'.join(search), na=False)]
This gives the desired result: all the matching rows are returned correctly.
How can I add each successfully matched search string to its matching row in a new dataframe column 'FR'?
Edit
5 rows of the Description column with the expected result column FR
Recommended answer
I think you need findall:
With the sample data from @AndreyF:
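search = ['FR-001', 'FR-002', 'FR-003', 'FR-004']
# Join the search strings into one alternation pattern; findall returns
# every match found in each row as a list
df['FR'] = df['Description'].str.findall('(' + '|'.join(search) + ')')
print(df)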
                            Description                FR
0  AasfasfFR-001,asfasdfafsagsdg FR-002  [FR-001, FR-002]
1                 AasfasfFR-004, FR-002  [FR-004, FR-002]
2          AasfasfFR-02,asfasdfafsagsdg                []
3  AasfasfFR-001,asfasdfafsagsdg FR-003  [FR-001, FR-003]
4  AasfasfFR-004,asfasdfafsagsdg FR-002  [FR-004, FR-002]
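For anyone who wants to run this end to end, here is a minimal self-contained sketch. The dataframe is rebuilt from the Description values printed above. The use of re.escape is my own addition and not part of the answer: it is a precaution for search strings that might contain regex metacharacters such as '.' or '+'.

```python
import re

import pandas as pd

# Sample rows copied from the Description column printed above
df = pd.DataFrame({
    'Description': [
        'AasfasfFR-001,asfasdfafsagsdg FR-002',
        'AasfasfFR-004, FR-002',
        'AasfasfFR-02,asfasdfafsagsdg',
        'AasfasfFR-001,asfasdfafsagsdg FR-003',
        'AasfasfFR-004,asfasdfafsagsdg FR-002',
    ]
})

search = ['FR-001', 'FR-002', 'FR-003', 'FR-004']

# Assumption (not in the original answer): escape each term so that any
# regex metacharacters in a search string are matched literally.
pattern = '|'.join(re.escape(s) for s in search)

# Each row gets a list of every search string found in its Description
df['FR'] = df['Description'].str.findall(pattern)
print(df)
```

Row 2 contains 'FR-02', which is not in the search list, so it ends up with an empty list.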
And if you need to filter out the rows with empty lists:
# An empty list is falsy, so astype(bool) keeps only rows with at least one match
df = df[df['FR'].astype(bool)]
print(df)
                            Description                FR
0  AasfasfFR-001,asfasdfafsagsdg FR-002  [FR-001, FR-002]
1                 AasfasfFR-004, FR-002  [FR-004, FR-002]
3  AasfasfFR-001,asfasdfafsagsdg FR-003  [FR-001, FR-003]
4  AasfasfFR-004,asfasdfafsagsdg FR-002  [FR-004, FR-002]
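If a column of lists is awkward to work with downstream, two possible follow-ups are sketched below. Neither is part of the answer above. The small dataframe is a hypothetical stand-in for the result of the findall step, and DataFrame.explode assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical stand-in for the frame produced by the findall step
df = pd.DataFrame({
    'Description': [
        'AasfasfFR-001,asfasdfafsagsdg FR-002',
        'AasfasfFR-02,asfasdfafsagsdg',
    ],
    'FR': [['FR-001', 'FR-002'], []],
})

# Keep only rows whose list is non-empty (an empty list is falsy)
matched = df[df['FR'].astype(bool)]

# Option 1: collapse each list into one comma-separated string per row
as_text = matched['FR'].str.join(', ')

# Option 2: produce one row per matched search string
exploded = matched.explode('FR')

print(as_text)
print(exploded)
```

Option 1 keeps one row per Description, while option 2 repeats the Description for every match, which may be easier to group or count.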