Sort rows of a dataframe in descending order of NaN counts

Question

I'm trying to sort the following Pandas DataFrame:

         RHS  age  height  shoe_size  weight
0     weight  NaN     0.0        0.0     1.0
1  shoe_size  NaN     0.0        1.0     NaN
2  shoe_size  3.0     0.0        0.0     NaN
3     weight  3.0     0.0        0.0     1.0
4        age  3.0     0.0        0.0     1.0

in such a way that the rows with more NaN values are positioned first. More precisely, in the above df, the row with index 1 (2 NaNs) should come before the row with index 0 (1 NaN).

This is what I'm currently doing:

df.sort_values(by=['age', 'height', 'shoe_size', 'weight'], na_position="first")

Recommended answer

Use df.sort_values together with loc-based access.

# Count NaNs per row, sort the counts in descending order, then reorder the rows by label.
df = df.loc[df.isnull().sum(axis=1).sort_values(ascending=False).index]
print(df)

         RHS  age  height  shoe_size  weight
1  shoe_size  NaN     0.0        1.0     NaN
2  shoe_size  3.0     0.0        0.0     NaN
0     weight  NaN     0.0        0.0     1.0
4        age  3.0     0.0        0.0     1.0
3     weight  3.0     0.0        0.0     1.0

df.isnull().sum(1) counts the NaNs in each row, and the rows are then selected in the order of the sorted counts. Rows with equal counts are tied, so their relative order may depend on the sort algorithm and the pandas version.
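The steps above can be sketched end to end as follows. This is a minimal illustration rather than the answer's exact code: the DataFrame is rebuilt by hand from the table in the question, the names nan_counts, order and sorted_df are made up for this example, and kind="mergesort" is an added assumption to request a stable sort.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame shown in the question.
df = pd.DataFrame({
    "RHS": ["weight", "shoe_size", "shoe_size", "weight", "age"],
    "age": [np.nan, np.nan, 3.0, 3.0, 3.0],
    "height": [0.0, 0.0, 0.0, 0.0, 0.0],
    "shoe_size": [0.0, 1.0, 0.0, 0.0, 0.0],
    "weight": [1.0, np.nan, np.nan, 1.0, 1.0],
})

# Step 1: number of NaNs in each row (axis=1 sums across columns).
nan_counts = df.isnull().sum(axis=1)
print(nan_counts.tolist())  # [1, 2, 1, 0, 0]

# Step 2: index labels ordered from most NaNs to fewest.
# kind="mergesort" is an assumption added here, not part of the original answer.
order = nan_counts.sort_values(ascending=False, kind="mergesort").index

# Step 3: reorder the rows by those labels.
sorted_df = df.loc[order]
print(sorted_df)
```

Row 1 has the most NaNs and comes first; how the tied rows are ordered after it is left unstated here because it can vary between pandas versions.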

@ayhan offered a nice little improvement to the solution above, involving pd.Series.argsort:

df = df.iloc[df.isnull().sum(axis=1).mul(-1).argsort()]
print(df)

         RHS  age  height  shoe_size  weight
1  shoe_size  NaN     0.0        1.0     NaN
0     weight  NaN     0.0        0.0     1.0
2  shoe_size  3.0     0.0        0.0     NaN
3     weight  3.0     0.0        0.0     1.0
4        age  3.0     0.0        0.0     1.0
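Negating the counts turns an ascending argsort into a descending ordering, and because argsort yields positions rather than labels, iloc is the matching accessor. The sketch below illustrates this on a small made-up frame with string labels. Note that it swaps in numpy.argsort on the raw values (with kind="stable") instead of pd.Series.argsort; the frame and the names neg_counts, positions and result are hypothetical.

```python
import numpy as np
import pandas as pd

# A small hypothetical frame with non-integer labels.
df = pd.DataFrame(
    {
        "age": [np.nan, np.nan, 3.0],
        "height": [0.0, 0.0, 0.0],
        "weight": [1.0, np.nan, np.nan],
    },
    index=["a", "b", "c"],
)

# Negated NaN counts per row: more NaNs means a smaller (more negative) value.
neg_counts = -df.isnull().sum(axis=1).to_numpy()

# Positions that sort the negated counts ascending; "stable" keeps ties in
# their original order.
positions = np.argsort(neg_counts, kind="stable")

# argsort returns positions, so select with iloc rather than loc.
result = df.iloc[positions]
print(result.index.tolist())  # ['b', 'a', 'c']
```

Because the selection is positional, this form does not depend on what the index labels are.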
