Filter out "empty array" values in Pandas DataFrame

Question

Suppose I have a DataFrame, d, which has a column containing Python arrays (lists) as its values.

>>> import pandas as pd
>>> d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a','b'])
>>> print(d)

     a      b
0  foo  [bar]
1  biz     []

Now, I want to filter out the rows that have empty arrays.

I have tried various versions, but no luck so far:

Trying to check it as a 'truthy' value:

>>> d[d['b']]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2682, in __getitem__
    return self._getitem_array(key)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2726, in _getitem_array
    indexer = self.loc._convert_to_indexer(key, axis=1)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/indexing.py", line 1314, in _convert_to_indexer
    indexer = check = labels.get_indexer(objarr)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 3259, in get_indexer
    indexer = self._engine.get_indexer(target._ndarray_values)
  File "pandas/_libs/index.pyx", line 301, in pandas._libs.index.IndexEngine.get_indexer
  File "pandas/_libs/hashtable_class_helper.pxi", line 1544, in pandas._libs.hashtable.PyObjectHashTable.lookup
TypeError: unhashable type: 'list'

Trying an explicit length check. It seems len() is being applied to the Series as a whole, not to each value in it:

>>> d[ len(d['b']) > 0 ]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2695, in _getitem_column
    return self._get_item_cache(key)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/generic.py", line 2489, in _get_item_cache
    values = self._data.get(item)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/internals.py", line 4115, in get
    loc = self.items.get_loc(item)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 3080, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: True
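
As a minimal sketch of why that attempt ends in KeyError: True (assuming the same frame d as above; the variable names are only illustrative): len() returns the number of rows in the Series, so the comparison collapses to a single Python bool before pandas sees it, and that bool is then looked up as a column label.

```python
import pandas as pd

d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a', 'b'])

# len() on a Series counts its rows; it does not look inside each value
series_length = len(d['b'])
# so the "mask" is just one bool, which d[...] treats as a column label
key = series_length > 0

# An element-wise operation is needed to get one length per row
element_lengths = d['b'].map(len)
print(series_length, key, element_lengths.tolist())
```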

Comparing to an empty array directly, just as we might compare to an empty string (which, by the way, does work if the column holds strings rather than arrays):

>>> d[ d['b'] == [] ]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1283, in wrapper
    res = na_op(values, other)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1143, in na_op
    result = _comp_method_OBJECT_ARRAY(op, x, y)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1120, in _comp_method_OBJECT_ARRAY
    result = libops.vec_compare(x, y, op)
  File "pandas/_libs/ops.pyx", line 128, in pandas._libs.ops.vec_compare
ValueError: Arrays were different lengths: 2 vs 0

Recommended answer

Use the string accessor, .str, to check the length of each list in the pandas Series:

>>> d[d['b'].str.len() > 0]

Output:

     a      b
0  foo  [bar]
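
For reference, here is a self-contained sketch of the same filter, next to two alternatives that are not part of the answer above: Series.map(len) and Series.astype(bool). It assumes every value in column b is a list; missing values such as NaN would need separate handling. The variable names are only illustrative.

```python
import pandas as pd

d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a', 'b'])

# The answer's approach: .str.len() gives the length of each list
kept_str = d[d['b'].str.len() > 0]

# Alternative (assumption, not from the answer): apply len() to each element
kept_map = d[d['b'].map(len) > 0]

# Alternative (assumption, not from the answer): an empty list is falsy,
# so casting the object column to bool yields a usable mask
kept_bool = d[d['b'].astype(bool)]

print(kept_str)
```

All three build a boolean mask with one entry per row, which is what d[...] needs in order to select rows rather than look up column labels.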
