Check if a string in a Pandas DataFrame column is in a list of strings


Problem description

If I have a frame like this:

import pandas as pd

frame = pd.DataFrame({'a' : ['the cat is blue', 'the sky is green', 'the dog is black']})

and I want to check whether each row contains any of a few words, I can do this:

frame['b'] = frame.a.str.contains("dog") | frame.a.str.contains("cat") | frame.a.str.contains("fish")

frame['b'] outputs:

True
False
True

If I decide to put the words in a list:

mylist = ['dog', 'cat', 'fish']

how would I check whether each row contains any of the words in the list?

Recommended answer

frame = pd.DataFrame({'a' : ['the cat is blue', 'the sky is green', 'the dog is black']})

frame
                  a
0   the cat is blue
1  the sky is green
2  the dog is black

The str.contains method accepts a regular expression pattern, so the words can be joined with '|' (regex alternation) into a single pattern:

mylist = ['dog', 'cat', 'fish']
pattern = '|'.join(mylist)

pattern
'dog|cat|fish'

frame.a.str.contains(pattern)
0     True
1    False
2     True
Name: a, dtype: bool
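
One caveat with joining the list directly is that every entry is interpreted as a regular expression, so a word containing characters such as '.' or '+' would not be matched literally, and 'cat' also matches inside longer words. The following is a minimal sketch of one way to guard against both, assuming the list holds plain words; the names whole_word and sample are hypothetical and only used for illustration.

```python
import re
import pandas as pd

frame = pd.DataFrame({'a': ['the cat is blue', 'the sky is green', 'the dog is black']})
mylist = ['dog', 'cat', 'fish']

# Escape each word so regex metacharacters (., +, ?, ...) are matched literally.
pattern = '|'.join(re.escape(word) for word in mylist)
mask = frame['a'].str.contains(pattern)
print(mask.tolist())  # [True, False, True]

# Hypothetical stricter variant: \b anchors restrict matches to whole words.
# The group is non-capturing, so pandas does not warn about match groups.
whole_word = r'\b(?:' + pattern + r')\b'
sample = pd.Series(['concatenate the files', 'the cat is blue'])
print(sample.str.contains(pattern).tolist())     # [True, True]
print(sample.str.contains(whole_word).tolist())  # [False, True]
```

Whether substring or whole-word matching is wanted depends on the data; the plain joined pattern is the simpler choice when the words cannot appear inside other words.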

Because regex patterns are supported, you can also embed flags, such as (?i) for case-insensitive matching:

frame = pd.DataFrame({'a' : ['Cat Mr. Nibbles is blue', 'the sky is green', 'the dog is black']})

frame
                         a
0  Cat Mr. Nibbles is blue
1         the sky is green
2         the dog is black

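# A global flag such as (?i) belongs once, at the very start of the pattern.
# Repeating it before each word is deprecated and raises an error on Python 3.11+.
pattern = '(?i)' + '|'.join(mylist)

pattern
'(?i)dog|cat|fish'

frame.a.str.contains(pattern)
0     True  # Because of the (?i) flag, 'Cat' is also matched to 'cat'
1    False
2     True
Name: a, dtype: bool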
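
As an alternative to embedding a flag in the pattern, str.contains also takes case and flags parameters. The sketch below shows both on the same data; the variable names by_case and by_flags are illustrative only.

```python
import re
import pandas as pd

frame = pd.DataFrame({'a': ['Cat Mr. Nibbles is blue', 'the sky is green', 'the dog is black']})
mylist = ['dog', 'cat', 'fish']
pattern = '|'.join(mylist)

# case=False makes the match case-insensitive without touching the pattern.
by_case = frame['a'].str.contains(pattern, case=False)

# flags is passed through to the re module, so re.IGNORECASE has the same effect.
by_flags = frame['a'].str.contains(pattern, flags=re.IGNORECASE)

print(by_case.tolist())   # [True, False, True]
print(by_flags.tolist())  # [True, False, True]
```

Keeping the flag out of the pattern string leaves the pattern reusable for case-sensitive matching elsewhere.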
