Select rows containing certain values from a pandas DataFrame


Problem description

I have a pandas dataframe whose entries are all strings:

   A       B       C
1  apple   banana  pear
2  pear    pear    apple
3  banana  pear    pear
4  apple   apple   pear

etc. I want to select all the rows that contain a certain string, say, 'banana'. I don't know which column it will appear in each time. Of course, I can write a for loop and iterate over all rows. But is there an easier or faster way to do this?

Recommended answer

With NumPy, this can be vectorized to search for as many strings as you wish, keeping the rows that contain all of them, like so -

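import numpy as np

def select_rows(df, search_strings):
    # IDs holds, for each cell, the index of its value in the sorted uniques
    unq, IDs = np.unique(df, return_inverse=True)
    # Assumes every search string occurs somewhere in df
    unqIDs = np.searchsorted(unq, search_strings)
    return df[((IDs.reshape(df.shape) == unqIDs[:, None, None]).any(-1)).all(0)]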
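
To make the broadcasting easier to follow, here is a step-by-step sketch of the same idea with each intermediate array given a name. The frame is rebuilt from the question's data, and the variable names (`ids`, `codes`, `hits`, `mask`) are illustrative only, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (0-based index, as in the sample run).
df = pd.DataFrame({
    "A": ["apple", "pear", "banana", "apple"],
    "B": ["banana", "pear", "pear", "apple"],
    "C": ["pear", "apple", "pear", "pear"],
})
search_strings = ["apple", "banana"]

# Sorted unique cell values, plus the index of each cell's value in that array.
unq, ids = np.unique(df.to_numpy().astype(str), return_inverse=True)
ids = ids.reshape(df.shape)  # one integer code per cell, shape (n_rows, n_cols)

# Position of each search string in the sorted unique array.
codes = np.searchsorted(unq, search_strings)

# Broadcast to shape (n_strings, n_rows, n_cols): True where a cell equals a string.
hits = ids == codes[:, None, None]

# any(-1): the string occurs somewhere in the row.
# all(0): that holds for every search string.
mask = hits.any(-1).all(0)
result = df[mask]
print(result)
```

The comparison runs on small integer codes rather than on the strings themselves, which is where the speed-up over a Python-level loop is expected to come from.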

Sample run -

In [393]: df
Out[393]: 
        A       B      C
0   apple  banana   pear
1    pear    pear  apple
2  banana    pear   pear
3   apple   apple   pear

In [394]: select_rows(df,['apple','banana'])
Out[394]: 
       A       B     C
0  apple  banana  pear

In [395]: select_rows(df,['apple','pear'])
Out[395]: 
       A       B      C
0  apple  banana   pear
1   pear    pear  apple
3  apple   apple   pear

In [396]: select_rows(df,['apple','banana','pear'])
Out[396]: 
       A       B     C
0  apple  banana  pear
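
One caveat worth keeping in mind: `np.searchsorted` returns an insertion position even for a string that does not occur in the frame, so the NumPy version can end up matching a different value in that case. As a possible alternative (not the answer's method), the sketch below uses only pandas comparisons. The helper name `select_rows_pandas` and its `require_all` flag are hypothetical, and it assumes `search_strings` is non-empty.

```python
import pandas as pd

def select_rows_pandas(df, search_strings, require_all=True):
    """Hypothetical helper: rows containing all (or any) of search_strings."""
    # One boolean Series per string: True where some column equals that string.
    per_string = [df.eq(s).any(axis=1) for s in search_strings]
    combined = pd.concat(per_string, axis=1)
    # Combine across strings: every string present, or at least one present.
    mask = combined.all(axis=1) if require_all else combined.any(axis=1)
    return df[mask]

# Sample frame rebuilt from the question (0-based index, as in the sample run).
df = pd.DataFrame({
    "A": ["apple", "pear", "banana", "apple"],
    "B": ["banana", "pear", "pear", "apple"],
    "C": ["pear", "apple", "pear", "pear"],
})

print(select_rows_pandas(df, ["apple", "banana"]))
print(select_rows_pandas(df, ["banana"]))
print(select_rows_pandas(df, ["avocado"]))  # string absent from the frame
```

For the single-string case in the question, `df[df.eq('banana').any(axis=1)]` is the same idea written inline.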
