Finding non-numeric rows in dataframe in pandas?

Question

I have a large dataframe in pandas that apart from the column used as index is supposed to have only numeric values:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 'bad', 5],
                   'b': [0.1, 0.2, 0.3, 0.4, 0.5],
                   'item': ['a', 'b', 'c', 'd', 'e']})
df = df.set_index('item')

How can I find the row of the dataframe df that has a non-numeric value in it?

In this example it's the fourth row in the dataframe, which has the string 'bad' in the a column. How can this row be found programmatically?

Recommended answer

You could use np.isreal to check the type of each element (applymap applies a function to each element in the DataFrame; from pandas 2.1 onward it is deprecated in favor of DataFrame.map):

In [11]: df.applymap(np.isreal)
Out[11]:
          a     b
item
a      True  True
b      True  True
c      True  True
d     False  True
e      True  True

If every value in a row is True, then all values in that row are numeric:

In [12]: df.applymap(np.isreal).all(1)
Out[12]:
item
a        True
b        True
c        True
d       False
e        True
dtype: bool

So, to get the sub-DataFrame of rogues (note: the negation, ~, of the above selects the rows that have at least one rogue non-numeric value):

In [13]: df[~df.applymap(np.isreal).all(1)]
Out[13]:
        a    b
item
d     bad  0.4
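
The steps above can be put together as a standalone script. This is only a sketch: the names elementwise, is_numeric, all_numeric and rogue_rows are illustrative and not from the original answer, and the hasattr check is an assumption about how to stay compatible with both older pandas (applymap) and pandas 2.1+ (DataFrame.map).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 'bad', 5],
                   'b': [0.1, 0.2, 0.3, 0.4, 0.5],
                   'item': ['a', 'b', 'c', 'd', 'e']}).set_index('item')

# Assumption: prefer DataFrame.map where it exists (pandas >= 2.1),
# otherwise fall back to the older applymap.
elementwise = df.map if hasattr(df, 'map') else df.applymap

is_numeric = elementwise(np.isreal)     # boolean DataFrame, one flag per cell
all_numeric = is_numeric.all(axis=1)    # True where every cell in the row is numeric
rogue_rows = df[~all_numeric]           # rows with at least one non-numeric cell

print(rogue_rows)
```

Keeping the intermediate masks in variables makes it easy to inspect which column caused a row to be flagged.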

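You could also find the location of the first offender with idxmin (on older pandas versions np.argmin returned the index label, but on current versions it returns the integer position):

In [14]: df.applymap(np.isreal).all(1).idxmin()
Out[14]: 'd'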
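
One caveat with looking up the minimum of a boolean Series is that it still returns a label when every row is numeric (all values are True, so the first one is reported). A minimal sketch of a guarded lookup follows; first_offender is a hypothetical helper, not part of pandas, and the row flags are hard-coded here to match the output of the all(1) step.

```python
import pandas as pd

# Row-level flags, hard-coded to mirror the all(1) result shown earlier.
all_numeric = pd.Series([True, True, True, False, True],
                        index=pd.Index(list('abcde'), name='item'))

def first_offender(flags):
    """Return the index label of the first False entry, or None if there is none."""
    if flags.all():
        return None
    # False compares lower than True, so idxmin lands on the first False.
    return flags.idxmin()

print(first_offender(all_numeric))
```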

As @CTZhu points out, it may be slightly faster to check whether it's an instance of either int or float (there is some additional overhead with np.isreal):

df.applymap(lambda x: isinstance(x, (int, float)))
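
Two details are worth knowing about the isinstance check: bool is a subclass of int in Python, so True and False count as numeric, and NumPy integer scalars such as np.int64 are not instances of the built-in int. The sketch below is one possible variant, assuming you want to treat bools as non-numeric and accept NumPy scalars; is_number is a hypothetical helper name, and numbers.Real is swapped in for the (int, float) tuple.

```python
import numbers

import numpy as np
import pandas as pd

def is_number(x):
    """True for real-valued scalars (Python or NumPy), False for bool and everything else."""
    return isinstance(x, numbers.Real) and not isinstance(x, bool)

df = pd.DataFrame({'a': [1, 2, 3, 'bad', 5],
                   'b': [0.1, 0.2, 0.3, 0.4, 0.5]},
                  index=pd.Index(list('abcde'), name='item'))

# Assumption: use DataFrame.map on pandas >= 2.1, applymap on older versions.
elementwise = df.map if hasattr(df, 'map') else df.applymap
mask = elementwise(is_number)

print(df[~mask.all(axis=1)])
```

Note that NaN is a float, so missing values pass this check as numeric.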
