Pandas filter function returned a Series, but expected a scalar bool


Question

I am attempting to use filter on a pandas DataFrame to filter out all rows that match a duplicate value (I need to remove ALL the rows when there are duplicates, not just the first or last).

This is what I have that works in the editor:

df = df.groupby("student_id").filter(lambda x: x.count() == 1)

But when I run my script with this code in it, I get the error:

TypeError: filter function returned a Series, but expected a scalar bool

I am creating the DataFrame by concatenating two other frames immediately before trying to apply the filter.

Recommended answer

It should be:

In [32]: grouped = df.groupby("student_id")

In [33]: grouped.filter(lambda x: x["student_id"].count()==1)
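To make the difference concrete, here is a minimal sketch under stated assumptions: the sample frame and its `score` column are hypothetical, since the question does not show its data. On a frame with more than one column, `x.count()` returns one count per column (a Series), while `x["student_id"].count()` returns a single number, which is what `filter` needs. The last two lines show alternatives that should give the same rows here; they are not from the original answer.

```python
import pandas as pd

# Hypothetical sample data; the original question does not show its frame.
df = pd.DataFrame({
    "student_id": [1, 2, 2, 3, 4, 4],
    "score": [90, 85, 70, 60, 75, 80],
})

# Look at what the lambda receives for one group (student_id == 1).
first_group = df[df["student_id"] == 1]
per_column = first_group.count()                # Series: one count per column
per_group = first_group["student_id"].count()   # a single number

# The original lambda yields a boolean Series per group, so filter rejects it.
try:
    df.groupby("student_id").filter(lambda x: x.count() == 1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The answer's fix: count a single column so the lambda returns a scalar bool.
unique_only = df.groupby("student_id").filter(
    lambda x: x["student_id"].count() == 1
)

# Alternatives (not from the original answer): group size, or dropping
# every row whose student_id occurs more than once.
by_len = df.groupby("student_id").filter(lambda x: len(x) == 1)
no_dupes = df.drop_duplicates("student_id", keep=False)

print(error_message)
print(unique_only)
```

Note that `count()` counts non-null values while `len(x)` counts rows, so the two can differ when the counted column contains missing values.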

Update:

I'm not sure about the issue you mentioned regarding the interactive console. Technically, in this particular case the console (such as IPython) should behave the same as any other environment (the plain Python interpreter, or one embedded in an IDE). There might be other situations, such as intricate import behavior, in which different environments behave differently.

An intuitive way to understand pandas groupby is to treat the object returned by DataFrame.groupby() as a list of DataFrames. So when you use filter to apply the lambda function to x, x is actually one of those DataFrames:

In[25]: df = pd.DataFrame(data,columns=year)

In[26]: df

Out[26]: 
   2013  2014
0     0     1
1     2     3
2     4     5
3     6     7
4     0     1
5     2     3
6     4     5
7     6     7

In[27]: grouped = df.groupby(2013)

In[28]: grouped.count()

Out[28]: 
      2014
2013      
0        2
2        2
4        2
6        2

In this example, the first DataFrame in the grouped object would be:

In[33]: df1 = df.loc[[0,4]]

In[34]: df1

Out[34]: 
   2013  2014
0     0     1
4     0     1
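The "list of DataFrames" picture can be checked directly by iterating over the grouped object. The sketch below rebuilds a frame with the same values as the example above; the `rows` list is an assumption, because the answer does not show how `data` and `year` were defined.

```python
import pandas as pd

# Assumed reconstruction of the example frame (same values as shown above).
rows = [[0, 1], [2, 3], [4, 5], [6, 7]] * 2
df = pd.DataFrame(rows, columns=[2013, 2014])

grouped = df.groupby(2013)

# Iterating a groupby yields (key, sub-DataFrame) pairs; each sub-DataFrame
# is what the lambda passed to filter receives as x.
pieces = {key: sub for key, sub in grouped}
for key, sub in pieces.items():
    print("group key:", key)
    print(sub)

# A single group can also be fetched by its key.
first = grouped.get_group(0)
```

Here `first` holds the rows with index 0 and 4, matching `df1` above.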
