ValueError: Must pass DataFrame with boolean values only

Problem description

Question

In this datafile, the United States is broken up into four regions using the "REGION" column.

Create a query that finds the counties that belong to regions 1 or 2, whose name starts with 'Washington', and whose POPESTIMATE2015 was greater than their POPESTIMATE2014.

This function should return a 5x2 DataFrame with the columns = ['STNAME', 'CTYNAME'] and the same index ID as census_df (sorted ascending by index).

CODE

def answer_eight():
    counties=census_df[census_df['SUMLEV']==50]
    regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]
    washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]
    grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]
    return grew[grew['STNAME'],grew['COUNTY']]

outcome = answer_eight()
assert outcome.shape == (5,2)
assert list (outcome.columns)== ['STNAME','CTYNAME']
print(tabulate(outcome, headers=["index"]+list(outcome.columns),tablefmt="orgtbl"))

Error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-77-546e58ae1c85> in <module>()
      6     return grew[grew['STNAME'],grew['COUNTY']]
      7 
----> 8 outcome = answer_eight()
      9 assert outcome.shape == (5,2)
     10 assert list (outcome.columns)== ['STNAME','CTYNAME']

<ipython-input-77-546e58ae1c85> in answer_eight()
      1 def answer_eight():
      2     counties=census_df[census_df['SUMLEV']==50]
----> 3     regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]
      4     washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]
      5     grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]

/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in __getitem__(self, key)
   1991             return self._getitem_array(key)
   1992         elif isinstance(key, DataFrame):
-> 1993             return self._getitem_frame(key)
   1994         elif is_mi_columns:
   1995             return self._getitem_multilevel(key)

/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in _getitem_frame(self, key)
   2066     def _getitem_frame(self, key):
   2067         if key.values.size and not com.is_bool_dtype(key.values):
-> 2068             raise ValueError('Must pass DataFrame with boolean values only')
   2069         return self.where(key)
   2070 

ValueError: Must pass DataFrame with boolean values only

I am clueless. Where am I going wrong?

Thanks

Recommended answer

You are indexing the DataFrame with another DataFrame instead of a boolean mask, and that is what raises the error. Comparing a column with a scalar, as in counties['REGION']==1, yields a boolean Series, and that Series is the mask. counties[counties['REGION']==1] has already applied the mask: it returns the filtered rows, a DataFrame with a different shape and non-boolean values. Pass the condition itself to the indexer once, rather than wrapping it in the DataFrame again.

def answer_eight():
    counties=census_df[census_df['SUMLEV']==50]
    # wrong: each operand of | is an already-filtered DataFrame, not a boolean mask
    regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]
    # same mistake: the column is wrapped in regions[...] before the condition is applied
    washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]
    # and again here, on both sides of the comparison
    grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]
    return grew[grew['STNAME'],grew['COUNTY']]
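
To make the distinction concrete, here is a minimal sketch on a toy frame. The data is made up; only the column names REGION and CTYNAME come from the question. It shows that a comparison yields a boolean Series, while indexing with that Series yields a filtered DataFrame that can no longer serve as a mask.

```python
import pandas as pd

# Toy data; the values are invented for illustration.
df = pd.DataFrame({
    "REGION": [1, 2, 3],
    "CTYNAME": ["Washington County", "Adams County", "Washington County"],
})

mask = df["REGION"] == 1       # boolean Series, one True/False per row
subset = df[mask]              # DataFrame holding only the matching rows

print(mask.dtype)              # bool
print(type(subset).__name__)   # DataFrame
print(subset.dtypes.tolist())  # the column dtypes are not boolean

# Combine conditions on the Series masks, never on the filtered frames.
either = (df["REGION"] == 1) | (df["REGION"] == 2)
print(df[either].index.tolist())  # [0, 1]
```

Because subset holds the original column values rather than True/False, using it as an indexer is what pandas rejects.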

What you want:

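def answer_eight():
    # keep county-level rows only (SUMLEV == 50)
    counties = census_df[census_df['SUMLEV'] == 50]
    # each comparison yields a boolean Series; combine them with |
    regions = counties[(counties['REGION'] == 1) | (counties['REGION'] == 2)]
    # the county name is in CTYNAME, the column the assertion expects
    washingtons = regions[regions['CTYNAME'].str.startswith("Washington")]
    # column name as given in the question: POPESTIMATE2014
    grew = washingtons[washingtons['POPESTIMATE2015'] > washingtons['POPESTIMATE2014']]
    # a list of column names selects those columns
    return grew[['STNAME', 'CTYNAME']]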
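
As an alternative sketch, the four conditions can be combined into a single boolean mask and applied once with .loc. The frame below is a hypothetical stand-in for census_df with invented rows, and washington_growth is an illustrative name, not part of the assignment.

```python
import pandas as pd

# Hypothetical stand-in for census_df; all rows are invented.
census_df = pd.DataFrame({
    "SUMLEV": [40, 50, 50, 50, 50],
    "REGION": [1, 1, 2, 3, 2],
    "STNAME": ["Iowa", "Iowa", "Ohio", "Texas", "Ohio"],
    "CTYNAME": ["Iowa", "Washington County", "Washington County",
                "Washington County", "Adams County"],
    "POPESTIMATE2014": [100, 10, 20, 30, 40],
    "POPESTIMATE2015": [101, 11, 19, 31, 41],
})

def washington_growth(df):
    """Filter rows with one combined boolean mask, then pick two columns."""
    mask = (
        (df["SUMLEV"] == 50)                               # county-level rows
        & df["REGION"].isin([1, 2])                        # region 1 or 2
        & df["CTYNAME"].str.startswith("Washington")       # name prefix
        & (df["POPESTIMATE2015"] > df["POPESTIMATE2014"])  # population grew
    )
    return df.loc[mask, ["STNAME", "CTYNAME"]]

result = washington_growth(census_df)
print(result)
```

Every term in the mask is a boolean Series aligned on the original index, so the result keeps the index labels of census_df in their original order.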
