Removing NaN from a pandas DataFrame and reshaping the DataFrame


Problem description

I have a pandas DataFrame df that looks like this:

    0    1    2   3    4     5    6
0   3    74                 
1   4    2                  
2            -9             
3                 -1   2    -16   -21
4             1             
5             28                

I want to remove all the NaN values from the above and realign the data in each row to get the following:

    0   1    2   3
0   3   74      
1   4   2       
2   -9          
3   -1  2   -16  -21
4   1           
5   28  

Basically, I am trying to left-align all the data in each row after removing the NaN values. I am not sure how to proceed.
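To make the question reproducible, the frame above could be rebuilt roughly as follows. This is only a sketch: it assumes the blank cells are NaN (not empty strings) and that the columns carry the default integer labels 0 to 6.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Sample data copied from the table in the question; blanks are assumed to be NaN
df = pd.DataFrame([
    [3,   74,  nan, nan, nan, nan, nan],
    [4,   2,   nan, nan, nan, nan, nan],
    [nan, nan, -9,  nan, nan, nan, nan],
    [nan, nan, nan, -1,  2,   -16, -21],
    [nan, nan, 1,   nan, nan, nan, nan],
    [nan, nan, 28,  nan, nan, nan, nan],
])
print(df)
```

Because NaN is a float, every column ends up with a float dtype, which is why values such as 3 are later printed as 3.0.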

Recommended answer

First shift all non-missing values to the left with justify, then use DataFrame.dropna to remove the columns that contain only NaNs:

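import numpy as np
import pandas as pd

# justify (defined below) moves the non-NaN values in each row to the left
arr = justify(df.to_numpy(), invalid_val=np.nan)
df = pd.DataFrame(arr).dropna(axis=1, how='all')  # drop all-NaN columns
print(df)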
      0     1     2     3
0   3.0  74.0   NaN   NaN
1   4.0   2.0   NaN   NaN
2  -9.0   NaN   NaN   NaN
3  -1.0   2.0 -16.0 -21.0
4   1.0   NaN   NaN   NaN
5  28.0   NaN   NaN   NaN


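# https://stackoverflow.com/a/44559180/2901002
def justify(a, invalid_val=0, axis=1, side='left'):
    """
    Justifies a 2D array

    Parameters
    ----------
    a : ndarray
        Input array to be justified
    invalid_val : scalar
        Value treated as missing; pass np.nan for float arrays containing NaN
    axis : int
        Axis along which justification is to be made
    side : str
        Direction of justification. It could be 'left', 'right', 'up', 'down'
        It should be 'left' or 'right' for axis=1 and 'up' or 'down' for axis=0.

    """

    # Boolean mask of the valid entries; NaN never compares equal to itself,
    # so it needs np.isnan (this also covers NaNs other than the np.nan object)
    if isinstance(invalid_val, float) and np.isnan(invalid_val):
        mask = ~np.isnan(a)
    else:
        mask = a != invalid_val
    # Sorting booleans puts False before True, gathering the True flags at the end
    justified_mask = np.sort(mask, axis=axis)
    if side in ('up', 'left'):
        # Flip so the True flags sit at the start instead
        justified_mask = np.flip(justified_mask, axis=axis)
    out = np.full(a.shape, invalid_val)
    if axis == 1:
        out[justified_mask] = a[mask]
    else:
        out.T[justified_mask.T] = a.T[mask.T]
    return out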
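
To see why the mask sorting in justify left-aligns the values, the steps can be walked through by hand on a small array. The sketch below is only an illustration: the two-row array a is made up for the example, and the variable names simply mirror those in the function above.

```python
import numpy as np

# Small made-up input: two rows with NaN in front of the real values
a = np.array([[np.nan, np.nan, -9.0, np.nan],
              [np.nan, -1.0, 2.0, -16.0]])

# True where a value is present
mask = ~np.isnan(a)

# Sorting booleans places False before True, so the True flags move to the right
sorted_mask = np.sort(mask, axis=1)

# Flipping each row moves the True flags to the left
justified_mask = np.flip(sorted_mask, axis=1)

# Start from an all-NaN array and fill the left-hand slots.
# Boolean indexing visits elements in row-major order, and each row has the
# same number of True flags in both masks, so values stay in their own row
# and keep their original order.
out = np.full(a.shape, np.nan)
out[justified_mask] = a[mask]
print(out)
```

The first row ends up with -9 in column 0 and the second row with -1, 2, -16 in columns 0 to 2; the remaining cells stay NaN. Columns that are NaN in every row are what DataFrame.dropna(axis=1, how='all') removes afterwards.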
