How to match words in a DataFrame column against a list of values, ignoring case, in pandas


Problem description

I have a DataFrame, df:

Name
Ram is one of the key ram
Kumar is playing cricket
Ravi is playing and ravi is a good player

and a list:

my_list=["Ram","ravi"]

The DataFrame I want, desired_df, is:

Name                                        Match    Count 
Ram is one of the key ram                   Ram      1
Kumar is playing cricket                 
Ravi is playing and ravi is a good player   ravi     1   

I tried:

extracted = df['Name'].str.findall('(' + '|'.join(my_list) + ')',
                                   flags=re.IGNORECASE).apply(set)

but I get:

Match
Ram,ram
Ravi,ravi

Because the set treats "Ram" and "ram" as different strings, both spellings are kept, and I cannot get the desired output. Please help.

Recommended answer

Is this what you are looking for?

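# Lowercase both the search terms and the column so that matching ignores case
new_l = [i.lower() for i in my_list]
extracted = df['Name'].str.lower().str.findall('(' + '|'.join(new_l) + ')').apply(set)

# Join the unique matches (sorted, so the order is stable) and count them
df['Match'] = extracted.apply(lambda s: ','.join(sorted(s)))
df['count'] = extracted.apply(len)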


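Output (note that the third row shown here differs from the question's data: it also contains "Ram", so it has two matches):

                                            Name     Match  count
0                      Ram is one of the key ram       ram      1
1                       Kumar is playing cricket                0
2  Ravi Ram is playing and ravi is a good player  ram,ravi      2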
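
The approach above reports matches in lowercase ("ram"), whereas the desired output keeps the spelling from my_list ("Ram", "ravi"). One possible variation, offered only as a sketch, matches case-insensitively with re.IGNORECASE and then reports each term as it is written in my_list. The helper name unique_matches is hypothetical, and the word boundaries (\b) and re.escape are additions that the original answer does not use. The sketch assumes the Name column has no missing values.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "Name": [
        "Ram is one of the key ram",
        "Kumar is playing cricket",
        "Ravi is playing and ravi is a good player",
    ]
})
my_list = ["Ram", "ravi"]

# \b keeps "ram" from matching inside a longer word such as "frame";
# re.escape guards against regex metacharacters in the terms.
pattern = r"\b(?:" + "|".join(re.escape(term) for term in my_list) + r")\b"

def unique_matches(text):
    """Hypothetical helper: my_list spellings found in text, in my_list order."""
    # Lowercase every hit so "Ram" and "ram" collapse into one entry.
    found = {m.lower() for m in re.findall(pattern, text, flags=re.IGNORECASE)}
    return [term for term in my_list if term.lower() in found]

matches = df["Name"].apply(unique_matches)
df["Match"] = matches.apply(",".join)   # empty string when nothing matched
df["Count"] = matches.apply(len)        # number of distinct terms matched
print(df)
```

With the question's data this gives Match values of "Ram", "" and "ravi". Rows without a match get a Count of 0 rather than the blank shown in desired_df.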
