Pandas - check if a string column contains a pair of strings
Question
Let's say I have a DataFrame like this:
import pandas as pd

df = pd.DataFrame({'consumption':['squirrel eats apple', 'monkey eats apple',
                                  'monkey eats banana', 'badger eats banana'],
                   'food':['apple', 'apple', 'banana', 'banana'],
                   'creature':['squirrel', 'badger', 'monkey', 'elephant']})
           consumption  creature    food
0  squirrel eats apple  squirrel   apple
1    monkey eats apple    badger   apple
2   monkey eats banana    monkey  banana
3   badger eats banana  elephant  banana
I want to find rows where the 'creature' and 'food' values both occur in the 'consumption' column of the same row. For example, if 'apple' and 'squirrel' occur together, the result is True, but if 'apple' occurs with 'elephant', it is False. Similarly, 'monkey' with 'banana' is True, but 'monkey' with 'apple' would be False.
The approach I tried is:
import numpy as np

creature_list = list(df['creature'])
creature_list = '|'.join(map(str, creature_list))
food_list = list(df['food'])
food_list = '|'.join(map(str, food_list))
np.where((df['consumption'].str.contains('('+creature_list+')', case = False))
         & (df['consumption'].str.contains('('+food_list+')', case = False)), 1, 0)
But this doesn't work, since I get True for every row.
How can I check for pairs of strings?
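For reference, a small sketch of why the attempt above matches every row: joining the whole column with '|' builds a single pattern containing every creature (and every food), so each row is tested against all values rather than only its own pair. The names creature_pattern, food_pattern, any_creature, any_food and combined are illustrative and not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({
    'consumption': ['squirrel eats apple', 'monkey eats apple',
                    'monkey eats banana', 'badger eats banana'],
    'food': ['apple', 'apple', 'banana', 'banana'],
    'creature': ['squirrel', 'badger', 'monkey', 'elephant'],
})

# One pattern for the whole column, not one per row
creature_pattern = '|'.join(df['creature'])  # 'squirrel|badger|monkey|elephant'
food_pattern = '|'.join(df['food'])          # 'apple|apple|banana|banana'

# Each row only needs to contain *any* creature and *any* food to match
any_creature = df['consumption'].str.contains(creature_pattern, case=False)
any_food = df['consumption'].str.contains(food_pattern, case=False)
combined = (any_creature & any_food).tolist()
print(combined)
```

Every 'consumption' string mentions some creature and some food from the lists, so the combined mask is True in all four rows regardless of what that row's own 'creature' and 'food' columns say.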
Recommended answer
Here's one possible way:
def match_consumption(r):
    if (r['creature'] in r['consumption']) and (r['food'] in r['consumption']):
        return True
    else:
        return False

df['match'] = df.apply(match_consumption, axis=1)
df
           consumption  creature    food  match
0  squirrel eats apple  squirrel   apple   True
1    monkey eats apple    badger   apple  False
2   monkey eats banana    monkey  banana   True
3   badger eats banana  elephant  banana  False
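As a possible alternative to the row-wise apply, the same per-row check can be sketched with zip and a list comprehension, which avoids building a Series object for each row. The helper name pair_in_text is hypothetical, and the lowercasing is an assumption added here to mirror the case = False in the question; the answer above is case-sensitive.

```python
import pandas as pd

df = pd.DataFrame({
    'consumption': ['squirrel eats apple', 'monkey eats apple',
                    'monkey eats banana', 'badger eats banana'],
    'food': ['apple', 'apple', 'banana', 'banana'],
    'creature': ['squirrel', 'badger', 'monkey', 'elephant'],
})

def pair_in_text(text, creature, food):
    """Return True if both creature and food occur in text (hypothetical helper)."""
    text = text.lower()  # assumption: case-insensitive matching is wanted
    return creature.lower() in text and food.lower() in text

# Compare each row's own creature/food pair against that row's consumption string
df['match'] = [
    pair_in_text(text, creature, food)
    for text, creature, food in zip(df['consumption'], df['creature'], df['food'])
]
print(df['match'].tolist())
```

Both this sketch and the apply version use plain substring tests, so 'apple' would also match inside 'pineapple'; if whole-word matching matters, a word-boundary regular expression built per row would be needed instead.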