Pandas: find super string in one Series from another Series
Question
This does not necessarily need to be done in pandas, but it would be nice if it could be.
Say I have a list or Series of strings:
['XXY8779','0060-19','McChicken','456728']
I also have another list or Series that contains substrings of the original strings, like so:
['60-19','Chicken','8779','1124231','92871','johnson']
This would return something like:
[True, True, True, False]
I'm looking for a match that is something like:
^[a-zA-Z0-9.,$;]+ < matching string in other list >
In other words, something that starts with one or more of any character, with the rest matching exactly one of the strings in my other list.
Does anyone have any ideas on the best way to accomplish this?
Thanks!
Recommended answer
Use str.contains. '|'.join(s2) produces a single string in which the substrings are separated by |. Because str.contains treats its pattern as a regular expression by default, the | acts as OR logic, so an element matches if it contains any one of the substrings.
import pandas as pd
s1 = pd.Series(['XXY8779', '0060-19', 'McChicken', '456728'])
s2 = ['60-19', 'Chicken', '8779', '1124231', '92871', 'johnson']
# Join the substrings with '|' so the regex matches any one of them
s1.str.contains('|'.join(s2))
0 True
1 True
2 True
3 False
dtype: bool
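The pattern above matches a substring anywhere in the string and treats any regex metacharacters in the substrings as special. The question asks for something slightly stricter: at least one leading character, followed by exactly one of the substrings. A minimal sketch of that variant with the standard re module follows. The names strings, substrings and pattern are illustrative, and the leading '.+' is an assumption standing in for the character class given in the question.

```python
import re

strings = ['XXY8779', '0060-19', 'McChicken', '456728']
substrings = ['60-19', 'Chicken', '8779', '1124231', '92871', 'johnson']

# Escape each substring so characters such as '.' or '$' are matched literally.
alternation = '|'.join(re.escape(s) for s in substrings)

# Require at least one leading character, then one of the substrings at the very end.
# '.+' is an assumption; swap in a narrower character class if needed.
pattern = re.compile(r'^.+(?:' + alternation + r')$')

matches = [bool(pattern.search(s)) for s in strings]
print(matches)  # [True, True, True, False]
```

The same pattern string (pattern.pattern) should also be usable with the pandas approach, for example s1.str.contains(pattern.pattern), since str.contains interprets its argument as a regex by default. The group is non-capturing to avoid the warning pandas emits for patterns that contain capture groups.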