Filter out rows based on list of strings in Pandas

Problem description

I have a large time series data frame (called df), and the first 5 records look like this:

df

                          stn  years_of_data  total_minutes  avg_daily  TOA_daily  K_daily
date
1900-01-14  AlberniElementary              4           5745     34.100    114.600    0.298
1900-01-14     AlberniWeather              6           7129     29.500    114.600    0.257
1900-01-14            Arbutus              8          11174     30.500    114.600    0.266
1900-01-14          Arrowview              7          10080     27.600    114.600    0.241
1900-01-14            Bayside              7           9745     33.800    114.600    0.295

Goal:

I am trying to remove rows where any of the strings in a list are present in the 'stn' column. In other words, I want to filter this dataset so that it excludes rows containing any of the strings in the following list.

Attempt:

remove_list = ['Arbutus','Bayside']

cleaned = df[df['stn'].str.contains('remove_list')]

Returns:

Out[78]:

stn years_of_data   total_minutes   avg_daily   TOA_daily   K_daily
date    

Nothing!

I have tried a few combinations of quotes, brackets, and even a lambda function, but I am fairly new, so I am probably not using the syntax properly.

Recommended answer

Use isin:

cleaned = df[~df['stn'].isin(remove_list)]

In [7]:

remove_list = ['Arbutus','Bayside']
df[~df['stn'].isin(remove_list)]
Out[7]:
                          stn  years_of_data  total_minutes  avg_daily  
date                                                                     
1900-01-14  AlberniElementary              4           5745       34.1   
1900-01-14     AlberniWeather              6           7129       29.5   
1900-01-14          Arrowview              7          10080       27.6   

            TOA_daily  K_daily  
date                            
1900-01-14      114.6    0.298  
1900-01-14      114.6    0.257  
1900-01-14      114.6    0.241  
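
For reference, here is a minimal self-contained sketch of why the original attempt came back empty, and how isin compares with a substring-based filter. The sample frame is rebuilt by hand from the five rows shown in the question. The variable names literal_match, cleaned_exact, cleaned_substring, and pattern are illustrative only and do not appear in the original post. The substring variant is an assumption about what "containing" might mean if station names only partially match the list entries; the answer above uses exact matching.

```python
import re

import pandas as pd

# Sample data rebuilt from the five rows shown in the question.
df = pd.DataFrame(
    {
        "stn": ["AlberniElementary", "AlberniWeather", "Arbutus", "Arrowview", "Bayside"],
        "years_of_data": [4, 6, 8, 7, 7],
        "total_minutes": [5745, 7129, 11174, 10080, 9745],
        "avg_daily": [34.1, 29.5, 30.5, 27.6, 33.8],
        "TOA_daily": [114.6] * 5,
        "K_daily": [0.298, 0.257, 0.266, 0.241, 0.295],
    },
    index=pd.DatetimeIndex(["1900-01-14"] * 5, name="date"),
)

remove_list = ["Arbutus", "Bayside"]

# The original attempt passed the quoted text 'remove_list', so pandas searched
# for that literal string. No station name contains it, so no rows were kept.
literal_match = df[df["stn"].str.contains("remove_list")]

# Exact match: drop rows whose 'stn' value equals any entry in remove_list.
cleaned_exact = df[~df["stn"].isin(remove_list)]

# Substring match (assumed variant): join the escaped entries into one regex
# alternation and drop rows whose 'stn' value contains any of them.
pattern = "|".join(re.escape(s) for s in remove_list)
cleaned_substring = df[~df["stn"].str.contains(pattern, na=False)]

print(cleaned_exact)
```

On this sample both filters keep the same three rows. They would differ only if a station name contained a list entry as part of a longer string.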
