Getting the last non-NA value across rows in a pandas DataFrame

Views: 49 | Published: 2020/5/13 2:34:05 | Tags: python, pandas, multidimensional-array, dataframe, na


Question

I have a dataframe of shape (40, 500). Each row holds numerical values up to some column k, which varies from row to row, and every entry after that is NaN.

I am trying to get the value in the last non-NaN column of each row. Is there a way to do this without looping over all the rows of the dataframe?

Example dataframe:

2016-06-02 7.080 7.079 7.079 7.079 7.079 7.079   nan   nan   nan
2016-06-08 7.053 7.053 7.053 7.053 7.053 7.054   nan   nan   nan  
2016-06-09 7.061 7.061 7.060 7.060 7.060 7.060   nan   nan   nan   
2016-06-14   nan   nan   nan   nan   nan   nan   nan   nan   nan  
2016-06-15 7.066 7.066 7.066 7.066   nan   nan   nan   nan   nan  
2016-06-16 7.067 7.067 7.067 7.067 7.067 7.067 7.068 7.068   nan  
2016-06-21 7.053 7.053 7.052   nan   nan   nan   nan   nan   nan  
2016-06-22 7.049 7.049   nan   nan   nan   nan   nan   nan   nan  
2016-06-28 7.058 7.058 7.059 7.059 7.059 7.059 7.059 7.059 7.059

Desired output:

2016-06-02 7.079 
2016-06-08 7.054
2016-06-09 7.060
2016-06-14   nan 
2016-06-15 7.066
2016-06-16 7.068 
2016-06-21 7.052 
2016-06-22 7.049
2016-06-28 7.059

Recommended answer

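Use last_valid_index inside a custom function. For a row in which every value is NaN, last_valid_index() returns None, and indexing the row with None raises a KeyError, so that case has to be handled explicitly: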

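import numpy as np

def f(x):
    # Label of the last non-NaN entry in the row, or None if the row is all NaN
    idx = x.last_valid_index()
    if idx is None:
        return np.nan
    return x.loc[idx]

# axis=1 passes each row to f as a Series
df['status'] = df.apply(f, axis=1)
print(df)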
                1      2      3      4      5      6      7      8      9  \
0                                                                           
2016-06-02  7.080  7.079  7.079  7.079  7.079  7.079    NaN    NaN    NaN   
2016-06-08  7.053  7.053  7.053  7.053  7.053  7.054    NaN    NaN    NaN   
2016-06-09  7.061  7.061  7.060  7.060  7.060  7.060    NaN    NaN    NaN   
2016-06-14    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN   
2016-06-15  7.066  7.066  7.066  7.066    NaN    NaN    NaN    NaN    NaN   
2016-06-16  7.067  7.067  7.067  7.067  7.067  7.067  7.068  7.068    NaN   
2016-06-21  7.053  7.053  7.052    NaN    NaN    NaN    NaN    NaN    NaN   
2016-06-22  7.049  7.049    NaN    NaN    NaN    NaN    NaN    NaN    NaN   
2016-06-28  7.058  7.058  7.059  7.059  7.059  7.059  7.059  7.059  7.059   

            status  
0                   
2016-06-02   7.079  
2016-06-08   7.054  
2016-06-09   7.060  
2016-06-14     NaN  
2016-06-15   7.066  
2016-06-16   7.068  
2016-06-21   7.052  
2016-06-22   7.049  
2016-06-28   7.059
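
To see why the None check is needed, here is a minimal sketch of how last_valid_index behaves on a single row. The two Series below are made-up stand-ins for rows of the dataframe (the names partial and empty are only illustrative), not the question's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: one partially filled, one entirely NaN
partial = pd.Series([7.066, 7.066, np.nan], index=[1, 2, 3])
empty = pd.Series([np.nan, np.nan, np.nan], index=[1, 2, 3])

# last_valid_index returns the label (not the position) of the last non-NaN entry
last_label = partial.last_valid_index()
print(last_label, partial.loc[last_label])

# With no valid entry at all, it returns None instead of a label
print(empty.last_valid_index())
```

Because the all-NaN case yields None rather than a usable label, the custom function has to return NaN itself for such rows.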

Alternative solution: forward-fill along each row with ffill(axis=1), then select the last column with iloc:

df['status'] = df.ffill(axis=1).iloc[:, -1]
print (df)
            status  
0                   
2016-06-02   7.079  
2016-06-08   7.054  
2016-06-09   7.060  
2016-06-14     NaN  
2016-06-15   7.066  
2016-06-16   7.068  
2016-06-21   7.052  
2016-06-22   7.049  
2016-06-28   7.059
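
As a rough check that the two approaches agree, the sketch below rebuilds three rows of the example, shortened to four columns as an assumption for brevity, and compares the results. The helper name last_valid and the variables via_apply and via_ffill are illustrative only.

```python
import numpy as np
import pandas as pd

# Three rows modelled on the example: partially filled, all NaN, fully filled
df = pd.DataFrame(
    [
        [7.053, 7.053, 7.052, np.nan],
        [np.nan, np.nan, np.nan, np.nan],
        [7.058, 7.058, 7.059, 7.059],
    ],
    index=["2016-06-21", "2016-06-14", "2016-06-28"],
    columns=[1, 2, 3, 4],
)

def last_valid(row):
    # Return NaN for an all-NaN row, otherwise the last non-NaN value
    idx = row.last_valid_index()
    return np.nan if idx is None else row.loc[idx]

# Approach 1: row-wise apply with the custom function
via_apply = df.apply(last_valid, axis=1)

# Approach 2: forward-fill across columns, then take the final column
via_ffill = df.ffill(axis=1).iloc[:, -1]

print(via_ffill)

# Compare the two results, treating NaN in the same position as equal
print(np.allclose(via_apply.to_numpy(dtype=float),
                  via_ffill.to_numpy(dtype=float),
                  equal_nan=True))
```

The forward-fill version avoids calling a Python function once per row, which is likely to matter more as the number of rows grows. An all-NaN row simply stays NaN, so no special case is needed.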
