Find empty or NaN entry in Pandas Dataframe


Question

I am trying to search through a Pandas DataFrame to find where it has a missing entry or a NaN entry.

Here is the dataframe I am working with:

cl_id       a           c         d         e        A1              A2             A3
    0       1   -0.419279  0.843832 -0.530827    text76        1.537177      -0.271042
    1       2    0.581566  2.257544  0.440485    dafN_6        0.144228       2.362259
    2       3   -1.259333  1.074986  1.834653    system                       1.100353
    3       4   -1.279785  0.272977  0.197011     Fifty       -0.031721       1.434273
    4       5    0.578348  0.595515  0.553483   channel        0.640708       0.649132
    5       6   -1.549588 -0.198588  0.373476     audio       -0.508501               
    6       7    0.172863  1.874987  1.405923    Twenty             NaN            NaN
    7       8   -0.149630 -0.502117  0.315323  file_max             NaN            NaN

NOTE: The blank entries are empty strings. This is because there was no alphanumeric content at those positions in the file that the dataframe came from.

Given this dataframe, how can I get a list of the indexes where a NaN or blank entry occurs?

Recommended answer

np.where(pd.isnull(df)) returns the row and column indices where the value is NaN:

In [152]: import numpy as np
In [153]: import pandas as pd
In [154]: np.where(pd.isnull(df))
Out[154]: (array([2, 5, 6, 6, 7, 7]), array([7, 7, 6, 7, 6, 7]))

In [155]: df.iloc[2,7]
Out[155]: nan

In [160]: [df.iloc[i,j] for i,j in zip(*np.where(pd.isnull(df)))]
Out[160]: [nan, nan, nan, nan, nan, nan]
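The positional indices that np.where returns can be mapped back to index and column labels, which is often easier to read than bare integers. The following is a minimal self-contained sketch; the small frame and its values are made up for illustration and are not the dataframe from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame; the values are invented for this example.
df = pd.DataFrame({
    "cl_id": [1, 2, 3],
    "A1": ["text76", "system", "audio"],
    "A2": [1.5, np.nan, -0.5],
    "A3": [-0.27, 1.1, np.nan],
})

# Positional row and column indices of every NaN cell (row-major order).
rows, cols = np.where(pd.isnull(df).to_numpy())

# Translate the positions into (index label, column label) pairs.
missing = [(df.index[r], df.columns[c]) for r, c in zip(rows, cols)]
print(missing)
```

Here the NaN cells sit in row 1 of column A2 and row 2 of column A3, so the list holds those two pairs.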

Values that are empty strings can be found with applymap:

In [182]: np.where(df.applymap(lambda x: x == ''))
Out[182]: (array([5]), array([7]))

Note that applymap calls a Python function once for each cell of the DataFrame. That can be slow for a large DataFrame, so it is better, if possible, to arrange for all the blank cells to contain NaN instead, so that pd.isnull can be used.
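One possible way to avoid the per-cell function call is sketched below, assuming the blanks are exactly the empty string ''. It first locates the blanks with a vectorized comparison, then replaces them with NaN so that a single isnull check covers both cases. The frame and its values are hypothetical. As far as I know, applymap is also deprecated in pandas 2.1 and later in favor of DataFrame.map, which is another reason to prefer a vectorized approach.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one empty string and one NaN (made-up values).
df = pd.DataFrame({
    "A1": ["system", "", "Twenty"],
    "A2": [1.100353, -0.508501, np.nan],
})

# Vectorized comparison: no Python function is called per cell.
blank_rows, blank_cols = np.where((df == "").to_numpy())

# Replace blanks with NaN so isnull catches blanks and NaN together.
cleaned = df.replace("", np.nan)
rows, cols = np.where(cleaned.isnull().to_numpy())
locations = list(zip(rows.tolist(), cols.tolist()))
print(locations)
```

If the data comes from a file, it may be simpler to have the reader produce NaN for blank fields in the first place, so that no replacement step is needed afterwards.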
