Sort rows of a dataframe in descending order of NaN counts
Problem description
I'm trying to sort the following Pandas DataFrame:
         RHS  age  height  shoe_size  weight
0     weight  NaN     0.0        0.0     1.0
1  shoe_size  NaN     0.0        1.0     NaN
2  shoe_size  3.0     0.0        0.0     NaN
3     weight  3.0     0.0        0.0     1.0
4        age  3.0     0.0        0.0     1.0
in such a way that the rows with more NaN values are positioned first. More precisely, in the above df, the row with index 1 (2 NaNs) should come before the row with index 0 (1 NaN).
What I am doing right now is:
df.sort_values(by=['age', 'height', 'shoe_size', 'weight'], na_position="first")
Recommended answer
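Use pd.Series.sort_values on the per-row NaN counts, then select the rows with loc. The sorted index holds labels, so loc is the matching accessor; iloc would only give the same result while the index is the default 0..n-1 range.
df = df.loc[df.isnull().sum(axis=1).sort_values(ascending=False).index]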
print(df)
         RHS  age  height  shoe_size  weight
1  shoe_size  NaN     0.0        1.0     NaN
2  shoe_size  3.0     0.0        0.0     NaN
0     weight  NaN     0.0        0.0     1.0
4        age  3.0     0.0        0.0     1.0
3     weight  3.0     0.0        0.0     1.0
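df.isnull().sum(1) counts the NaNs in each row, and the rows are then accessed in the order of this sorted count. Rows with the same count may appear in any relative order, because the default sorting algorithm is not guaranteed to be stable.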
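To make the steps above easier to follow, here is a minimal self-contained sketch that rebuilds the example frame and separates the counting step from the selection step. The variable names (nan_counts, order, sorted_df) are introduced here for illustration only, and the frame is reconstructed by hand from the table in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame from the question (values copied by hand).
df = pd.DataFrame({
    "RHS": ["weight", "shoe_size", "shoe_size", "weight", "age"],
    "age": [np.nan, np.nan, 3.0, 3.0, 3.0],
    "height": [0.0, 0.0, 0.0, 0.0, 0.0],
    "shoe_size": [0.0, 1.0, 0.0, 0.0, 0.0],
    "weight": [1.0, np.nan, np.nan, 1.0, 1.0],
})

# Step 1: count the NaNs in each row (axis=1 sums across columns).
nan_counts = df.isnull().sum(axis=1)

# Step 2: sort the counts in descending order and keep the index labels.
order = nan_counts.sort_values(ascending=False).index

# Step 3: select the rows by label in that order.
sorted_df = df.loc[order]

print(nan_counts.tolist())
print(sorted_df)
```

The row with two NaNs (index 1) comes first; the relative order of rows that share a count is not fixed by this sketch.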
@ayhan offered a nice little improvement to the solution above, involving pd.Series.argsort:
df = df.iloc[df.isnull().sum(axis=1).mul(-1).argsort()]
print(df)
         RHS  age  height  shoe_size  weight
1  shoe_size  NaN     0.0        1.0     NaN
0     weight  NaN     0.0        0.0     1.0
2  shoe_size  3.0     0.0        0.0     NaN
3     weight  3.0     0.0        0.0     1.0
4        age  3.0     0.0        0.0     1.0
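Because argsort returns integer positions rather than labels, the argsort variant should also work when the index is not the default range. The sketch below illustrates this under two assumptions that are not in the original answer: the frame is given a hypothetical string index ("a" to "e"), and kind="mergesort" is passed so that rows with equal NaN counts keep their original relative order.

```python
import numpy as np
import pandas as pd

# Same data as in the question, but with a hypothetical string index
# to show that the selection is position-based.
df = pd.DataFrame(
    {
        "RHS": ["weight", "shoe_size", "shoe_size", "weight", "age"],
        "age": [np.nan, np.nan, 3.0, 3.0, 3.0],
        "height": [0.0, 0.0, 0.0, 0.0, 0.0],
        "shoe_size": [0.0, 1.0, 0.0, 0.0, 0.0],
        "weight": [1.0, np.nan, np.nan, 1.0, 1.0],
    },
    index=list("abcde"),
)

# Negate the counts so that an ascending argsort yields descending counts.
neg_counts = df.isnull().sum(axis=1).mul(-1)

# mergesort is a stable algorithm, so ties keep their original order.
positions = neg_counts.argsort(kind="mergesort").to_numpy()

# iloc takes integer positions, which is exactly what argsort produced.
sorted_df = df.iloc[positions]

print(positions.tolist())         # [1, 0, 2, 3, 4]
print(sorted_df.index.tolist())   # ['b', 'a', 'c', 'd', 'e']
```

Negating the counts is only one way to get descending order; reversing an ascending argsort would also work, but it would reverse the order of ties as well.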