Getting the last non-NaN value across rows in a pandas DataFrame
Problem description
I have a dataframe of shape (40, 500). Each row has numerical values up to some column number k, which varies from row to row, and every entry after that is NaN.
I am trying to get the value in the last non-NaN column of each row. Is there a way to do this without looping through all the rows of the dataframe?
Sample dataframe:
2016-06-02 7.080 7.079 7.079 7.079 7.079 7.079 nan nan nan
2016-06-08 7.053 7.053 7.053 7.053 7.053 7.054 nan nan nan
2016-06-09 7.061 7.061 7.060 7.060 7.060 7.060 nan nan nan
2016-06-14 nan nan nan nan nan nan nan nan nan
2016-06-15 7.066 7.066 7.066 7.066 nan nan nan nan nan
2016-06-16 7.067 7.067 7.067 7.067 7.067 7.067 7.068 7.068 nan
2016-06-21 7.053 7.053 7.052 nan nan nan nan nan nan
2016-06-22 7.049 7.049 nan nan nan nan nan nan nan
2016-06-28 7.058 7.058 7.059 7.059 7.059 7.059 7.059 7.059 7.059
Required output:
2016-06-02 7.079
2016-06-08 7.054
2016-06-09 7.060
2016-06-14 nan
2016-06-15 7.066
2016-06-16 7.068
2016-06-21 7.052
2016-06-22 7.049
2016-06-28 7.059
Recommended answer
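You need last_valid_index with a custom function, because for a row whose values are all NaN it returns None, and indexing the row with None raises a KeyError: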
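import numpy as np

def f(x):
    # last_valid_index() returns None when every value in the row is NaN
    idx = x.last_valid_index()
    if idx is None:
        return np.nan
    return x.loc[idx]

# axis=1 passes each row to f as a Series
df['status'] = df.apply(f, axis=1)
print(df)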
1 2 3 4 5 6 7 8 9 \
0
2016-06-02 7.080 7.079 7.079 7.079 7.079 7.079 NaN NaN NaN
2016-06-08 7.053 7.053 7.053 7.053 7.053 7.054 NaN NaN NaN
2016-06-09 7.061 7.061 7.060 7.060 7.060 7.060 NaN NaN NaN
2016-06-14 NaN NaN NaN NaN NaN NaN NaN NaN NaN
2016-06-15 7.066 7.066 7.066 7.066 NaN NaN NaN NaN NaN
2016-06-16 7.067 7.067 7.067 7.067 7.067 7.067 7.068 7.068 NaN
2016-06-21 7.053 7.053 7.052 NaN NaN NaN NaN NaN NaN
2016-06-22 7.049 7.049 NaN NaN NaN NaN NaN NaN NaN
2016-06-28 7.058 7.058 7.059 7.059 7.059 7.059 7.059 7.059 7.059
status
0
2016-06-02 7.079
2016-06-08 7.054
2016-06-09 7.060
2016-06-14 NaN
2016-06-15 7.066
2016-06-16 7.068
2016-06-21 7.052
2016-06-22 7.049
2016-06-28 7.059
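For readers who want to try this without the original data, here is a minimal self-contained sketch of the same approach. The three-row frame and the helper name `last_valid` are made up for illustration; only `last_valid_index` and `apply` come from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame that mirrors the layout in the question:
# values first, then NaN up to the end of the row.
df = pd.DataFrame(
    [[7.080, 7.079, np.nan],
     [np.nan, np.nan, np.nan],
     [7.049, np.nan, np.nan]],
    index=["2016-06-02", "2016-06-14", "2016-06-22"],
    columns=[1, 2, 3],
)

def last_valid(row):
    # Label of the last non-NaN entry, or None if the row is all NaN.
    idx = row.last_valid_index()
    return np.nan if idx is None else row.loc[idx]

# Apply the helper to each row (axis=1).
status = df.apply(last_valid, axis=1)
print(status)
```

The all-NaN row stays NaN in the result instead of raising an error, which is the reason for the explicit None check.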
Alternative solution: forward-fill along each row with ffill, then select the last column with iloc:
df['status'] = df.ffill(axis=1).iloc[:, -1]
print (df)
status
0
2016-06-02 7.079
2016-06-08 7.054
2016-06-09 7.060
2016-06-14 NaN
2016-06-15 7.066
2016-06-16 7.068
2016-06-21 7.052
2016-06-22 7.049
2016-06-28 7.059
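The forward-fill approach can be checked on a small frame as well. The sketch below uses invented data and variable names; it assumes every column is numeric, as in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: trailing NaN in the first row, one all-NaN row,
# and one row with no NaN at all.
df = pd.DataFrame(
    [[7.053, 7.053, 7.052, np.nan],
     [np.nan, np.nan, np.nan, np.nan],
     [7.058, 7.058, 7.059, 7.059]],
    index=["2016-06-21", "2016-06-14", "2016-06-28"],
    columns=[1, 2, 3, 4],
)

# Propagate the last seen value to the right within each row,
# then read the final column.
filled = df.ffill(axis=1)
last_values = filled.iloc[:, -1]
print(last_values)
```

Because ffill works on the whole frame at once, this version does not call a Python function for each row the way apply does, which may matter for larger frames. A row that is entirely NaN has nothing to propagate, so its result stays NaN.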