Safe label-based selection in DataFrame


Question

How can I safely select rows in pandas by a list of labels?
I want to get an error when the list contains any non-existing label.

The loc method doesn't raise a KeyError if at least one of the requested labels is in the index, which is not strict enough for my purposes.

For example:

import numpy as np
import pandas as pd

df = pd.DataFrame(index=list('abcde'), data={'A': np.arange(5) + 10})

df
    A
a  10
b  11
c  12
d  13
e  14

# here I would like to get an Error as 'xx' and 'yy' are not in the index
df.loc[['b', 'xx', 'yy']] 

       A
b   11.0
xx   NaN
yy   NaN

Does pandas provide a method that would raise a KeyError instead of returning a bunch of NaNs for non-existing labels?

Recommended answer

It's a bit of a hack, but one can do it like this:

def my_loc(df, idx):
    # Fail unless every requested label is in the index (assumes no duplicate labels)
    assert len(df.index[df.index.isin(idx)]) == len(idx), 'KeyError:the labels [{}] are not in the [index]'.format(idx)
    return df.loc[idx]

In [243]: my_loc(df, ['b', 'xx', 'yy'])
...
skipped
...
AssertionError: KeyError:the labels [['b', 'xx', 'yy']] are not in the [index]

In [245]: my_loc(df, ['a','c','e'])
Out[245]:
    A
a  10
c  12
e  14
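
The assert-based helper above raises AssertionError rather than KeyError, is skipped entirely when Python runs with -O, and its length comparison can misfire if the label list or the index contains duplicates. A minimal sketch of a variant that raises a real KeyError and reports only the missing labels might look like the following. The name safe_loc is hypothetical and not part of pandas. As far as I know, pandas 1.0 and later already raise a KeyError from loc when a list contains missing labels, so a wrapper like this mainly matters on older versions.

```python
import numpy as np
import pandas as pd


def safe_loc(df, labels):
    """Select rows by label, raising KeyError if any label is missing.

    safe_loc is a hypothetical helper name, not a pandas API.
    """
    labels = list(labels)
    # Collect the requested labels that are absent from the index,
    # preserving the order in which they were requested.
    missing = [label for label in labels if label not in df.index]
    if missing:
        raise KeyError(f"labels {missing} are not in the index")
    return df.loc[labels]


df = pd.DataFrame(index=list('abcde'), data={'A': np.arange(5) + 10})

# All labels exist, so this behaves like df.loc[['a', 'c', 'e']].
print(safe_loc(df, ['a', 'c', 'e']))

# 'xx' and 'yy' are missing, so a KeyError is raised.
try:
    safe_loc(df, ['b', 'xx', 'yy'])
except KeyError as exc:
    print('KeyError:', exc)
```

Checking membership label by label keeps the error message focused on the labels that are actually missing, and it does not depend on the lengths of the two lists matching.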
