Pandas boolean comparison on dataframe


Problem description

I get an error when I test a single element of a dataframe in an if statement, but I don't understand why.

I have a dataframe df with time series data for a number of customers, with some null values in it:

df.head()
                    8143511  8145987  8145997  8146001  8146235  8147611  \
2012-07-01 00:00:00      NaN      NaN      NaN      NaN      NaN      NaN   
2012-07-01 00:30:00    0.089      NaN    0.281    0.126    0.190    0.500   
2012-07-01 01:00:00    0.090      NaN    0.323    0.141    0.135    0.453   
2012-07-01 01:30:00    0.061      NaN    0.278    0.097    0.093    0.424   
2012-07-01 02:00:00    0.052      NaN    0.278    0.158    0.170    0.462  

In my script, the line if pd.isnull(df[[customer_ID]].loc[ts]): raises an error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

However, if I put a breakpoint on that line and, once the script stops there, type this into the console:

pd.isnull(df[[customer_ID]].loc[ts])

The output is:

8143511    True
Name: 2012-07-01 00:00:00, dtype: bool

If I let the script continue from that point, the error is raised immediately.

If the boolean expression can be evaluated and has the value True, why does it raise an error in the if statement? This makes no sense to me.

Recommended answer

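The extra pair of [] in df[[customer_ID]] selects a one-column DataFrame rather than a Series, so .loc[ts] returns a one-element Series, which I mistook for a single value. pd.isnull then returns a boolean Series as well. The console can display that Series, but an if statement has to reduce it to a single True or False, and pandas raises the ValueError instead of guessing. The simplest solution is to remove the extra []: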

if pd.isnull(df[customer_ID].loc[ts]):
    pass
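
The difference between the two forms can be reproduced with a small sketch. The two-row frame below is a hypothetical stand-in for the question's data (only the column labels and timestamps are borrowed from the df.head() output above), and df.at is shown only as one possible alternative for scalar lookups, not as part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's dataframe (values assumed for illustration).
index = pd.to_datetime(["2012-07-01 00:00:00", "2012-07-01 00:30:00"])
df = pd.DataFrame({8143511: [np.nan, 0.089], 8145997: [np.nan, 0.281]}, index=index)

customer_ID = 8143511
ts = index[0]

# Double brackets select a one-column DataFrame, so .loc[ts] is a
# one-element Series and pd.isnull returns a boolean Series.
as_series = pd.isnull(df[[customer_ID]].loc[ts])

# Single brackets select a Series, so .loc[ts] is a scalar and
# pd.isnull returns a single boolean.
as_scalar = pd.isnull(df[customer_ID].loc[ts])

# Using the Series in an if statement asks pandas for its truth value,
# which raises ValueError even when the Series has only one element.
try:
    if as_series:
        pass
    raised = False
except ValueError:
    raised = True

# Another way to get a scalar: label-based lookup of one cell with .at.
via_at = pd.isnull(df.at[ts, customer_ID])

print(type(as_series).__name__, raised, bool(as_scalar), bool(via_at))
```

If the one-element Series has to be kept for some reason, calling .item() on it (one of the options named in the error message) also yields a plain boolean.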
