Removing rows with NaN in MultiIndex with duplicates

Views: 232 | Posted: 2020/5/13 18:34:58 | Tags: python pandas dataframe nan multi-index


Question

Updated with a DataFrame that reproduces my exact problem.

I have an issue where NaN values appearing in my index lead to non-unique rows (since NaN != NaN). I need to drop every row where NaN occurs in the index. My previous question used an example DataFrame with a single NaN row, but the original solution did not resolve my issue because it did not handle this requirement, which I had not stated clearly:

(Note that the actual data has thousands of such rows, including duplicates; because NaN != NaN, duplicates of this kind are permitted in an index.)

(Taken from my original post)

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame([[1,1,"a"],[1,2,"b"],[1,3,"c"],[1,np.nan,"x"],[1,np.nan,"x"],[1,np.nan,"x"],[2,1,"d"],[2,2,"e"],[np.nan,1,"x"],[np.nan,2,"x"],[np.nan,1,"x"]], columns=["a","b","c"]).set_index(["a","b"])
>>> df
         c
a   b
1.0 1.0  a
    2.0  b
    3.0  c
    NaN  x
    NaN  x
    NaN  x
2.0 1.0  d
    2.0  e
NaN 1.0  x
    2.0  x
    1.0  x

Note the duplicate rows: (1.0, NaN) and (NaN, 1.0).
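As a quick check, and assuming the frame is built as above with a and b set as the index, the repeated index entries can be confirmed with MultiIndex.is_unique and MultiIndex.duplicated. This is only a sketch for inspection and does not remove anything; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as in the question, with "a" and "b" moved into a MultiIndex.
df = pd.DataFrame(
    [[1, 1, "a"], [1, 2, "b"], [1, 3, "c"],
     [1, np.nan, "x"], [1, np.nan, "x"], [1, np.nan, "x"],
     [2, 1, "d"], [2, 2, "e"],
     [np.nan, 1, "x"], [np.nan, 2, "x"], [np.nan, 1, "x"]],
    columns=["a", "b", "c"],
).set_index(["a", "b"])

# False here means at least one index entry occurs more than once.
index_is_unique = df.index.is_unique

# keep=False flags every member of a repeated group, not just the later copies.
repeated = df.index.duplicated(keep=False)
n_repeated = int(repeated.sum())

print(index_is_unique, n_repeated)
```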

I've tried something simple like:

>>> df = df[pd.notnull(df.index)]

But this fails because notnull is not implemented for MultiIndex.

One of the early answers also suggested:

>>> df = df.reindex(df.index.dropna())

However, this failed with the error:

Exception: cannot handle a non-unique multi-index!

Desired output:

>>> df
         c
a   b
1.0 1.0  a
    2.0  b
    3.0  c
2.0 1.0  d
    2.0  e

(All rows with NaN in the index are dropped, which eliminates the non-unique rows.)

Recommended answer

Option 1
reset_index, dropna, and set_index once more.

c = df.index.names
df = df.reset_index().dropna().set_index(c)
df

         c
a   b
1.0 1.0  a
    2.0  b
    3.0  c
2.0 1.0  d
    2.0  e
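One caveat with Option 1: dropna() with no arguments also drops rows that have NaN in a data column, not only in the index. A minimal sketch that restricts the check to the index levels through the subset parameter of DataFrame.dropna follows. The row with NaN in column c is made up for illustration, and the sketch assumes every index level has a name.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row (1, 2) has NaN in the data column "c",
# and should survive because its index is complete.
df = pd.DataFrame(
    [[1, 1, "a"], [1, 2, np.nan], [1, np.nan, "x"], [1, np.nan, "x"],
     [2, 1, "d"], [np.nan, 1, "x"], [np.nan, 1, "x"]],
    columns=["a", "b", "c"],
).set_index(["a", "b"])

levels = list(df.index.names)

# Only the former index columns are inspected for NaN.
cleaned = df.reset_index().dropna(subset=levels).set_index(levels)

print(cleaned)
```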

If your MultiIndex is unique, you can use...
Option 2
df.index.dropna and df.reindex

df = df.reindex(df.index.dropna())
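When the index is not unique, as in the question, Option 2 raises the error shown earlier. A possible alternative that avoids the round trip through reset_index is a boolean mask built from the index itself. The sketch below uses MultiIndex.to_frame and isna; it is not part of the original answer, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as in the question, with "a" and "b" moved into a MultiIndex.
df = pd.DataFrame(
    [[1, 1, "a"], [1, 2, "b"], [1, 3, "c"],
     [1, np.nan, "x"], [1, np.nan, "x"], [1, np.nan, "x"],
     [2, 1, "d"], [2, 2, "e"],
     [np.nan, 1, "x"], [np.nan, 2, "x"], [np.nan, 1, "x"]],
    columns=["a", "b", "c"],
).set_index(["a", "b"])

# One boolean per row: True if any index level is NaN in that row.
# index=False gives the helper frame a plain positional index.
has_nan = df.index.to_frame(index=False).isna().any(axis=1).to_numpy()

# Boolean-array selection is positional, so duplicate index entries are fine.
cleaned = df[~has_nan]

print(cleaned)
```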
