Pandas DataFrame: select rows where a list column contains any of a list of strings

Question

I've got a pandas DataFrame that looks like this:

  molecule            species
0        a              [dog]
1        b       [horse, pig]
2        c         [cat, dog]
3        d  [cat, horse, pig]
4        e     [chicken, pig]

I'd like to extract a DataFrame containing only those rows whose species list contains any of selection = ['cat', 'dog']. So the result should look like this:

  molecule            species
0        a              [dog]
1        c         [cat, dog]
2        d  [cat, horse, pig]

For testing:

import pandas as pd
selection = ['cat', 'dog']
df = pd.DataFrame({'molecule': ['a', 'b', 'c', 'd', 'e'], 'species': [['dog'], ['horse', 'pig'], ['cat', 'dog'], ['cat', 'horse', 'pig'], ['chicken', 'pig']]})


Recommended answer

If I understand correctly, you can rebuild the list column as its own DataFrame and then use isin with any, which should be faster than apply:

df[pd.DataFrame(df.species.tolist()).isin(selection).any(axis=1)]
Out[64]: 
  molecule            species
0        a              [dog]
2        c         [cat, dog]
3        d  [cat, horse, pig]
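
For comparison, here is a minimal, self-contained sketch of the isin/any approach next to two alternatives that were not part of the answer: a set-based apply and Series.explode (available in pandas 0.25 and later). The variable names (expanded, mask_isin, mask_apply, mask_explode, wanted) are illustrative only. Passing index=df.index when rebuilding the frame is an added precaution so the mask still aligns if df does not have a default 0..n-1 index. Which variant is fastest probably depends on the data, so it is worth timing them on your own DataFrame.

```python
import pandas as pd

selection = ['cat', 'dog']
df = pd.DataFrame({
    'molecule': ['a', 'b', 'c', 'd', 'e'],
    'species': [['dog'], ['horse', 'pig'], ['cat', 'dog'],
                ['cat', 'horse', 'pig'], ['chicken', 'pig']],
})

# Approach from the answer: expand the lists into a rectangular frame
# (shorter lists are padded with missing values), then test membership.
# index=df.index is an added assumption-proofing step, not in the answer.
expanded = pd.DataFrame(df['species'].tolist(), index=df.index)
mask_isin = expanded.isin(selection).any(axis=1)

# Alternative 1: check each list against a set with apply.
wanted = set(selection)
mask_apply = df['species'].apply(lambda items: not wanted.isdisjoint(items))

# Alternative 2: one row per list element, then collapse back per original row.
mask_explode = df['species'].explode().isin(selection).groupby(level=0).any()

result = df[mask_isin]

# The question's expected output uses a fresh 0..n-1 index.
result_reset = result.reset_index(drop=True)
print(result_reset)
```

All three masks select the same rows here. The filtered frame keeps the original row labels (0, 2, 3), and reset_index(drop=True) renumbers them to match the output shown in the question.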
