Replace Nulls in DataFrame with Max in Row
Question
Is there a way (more efficient than using a for loop) to replace all the nulls in a pandas DataFrame with the max value of their respective rows?
Recommended answer
I guess this is what you are looking for:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 0], 'b': [3, 0, 10], 'c': [0, 5, 34]})
   a   b   c
0  1   3   0
1  2   0   5
2  0  10  34
You can use apply to iterate over all rows and replace 0 with the row's maximum using the replace function, which gives the expected output:
df.apply(lambda row: row.replace(0, max(row)), axis=1)
    a   b   c
0   1   3   3
1   2   5   5
2  34  10  34
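A vectorized variant of the same idea, avoiding the per-row lambda, could look like the sketch below. It is not part of the original answer: it assumes that DataFrame.mask with axis=0 aligns the Series of row maxima on the row index, and the names df_zero, row_max and filled are made up for illustration.

```python
import pandas as pd

# Same example data as above, under a separate (hypothetical) name.
df_zero = pd.DataFrame({'a': [1, 2, 0], 'b': [3, 0, 10], 'c': [0, 5, 34]})

# Maximum of each row, indexed like the rows of df_zero.
row_max = df_zero.max(axis=1)

# Wherever a cell equals 0, take the value from row_max instead.
# axis=0 tells pandas to align row_max along the row index.
filled = df_zero.mask(df_zero.eq(0), row_max, axis=0)
print(filled)
```

The condition df_zero.eq(0) can be swapped for df_zero.isna() when the placeholder is NaN rather than 0.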
If you want to replace NaN, which seemed to be your actual goal according to your comment, you can use
df = pd.DataFrame({'a': [1, 2, np.nan], 'b': [3, np.nan, 10], 'c':[np.nan, 5, 34]})
     a     b     c
0  1.0   3.0   NaN
1  2.0   NaN   5.0
2  NaN  10.0  34.0
df.T.fillna(df.max(axis=1)).T
which yields
      a     b     c
0   1.0   3.0   3.0
1   2.0   5.0   5.0
2  34.0  10.0  34.0
which might be more efficient (I have not done the timings) than
df.apply(lambda row: row.fillna(row.max()), axis=1)
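Since the answer leaves the timing unmeasured, one rough way to compare the two approaches is sketched below. The data set, its size, and the helper names via_transpose and via_apply are assumptions made for illustration; actual results will depend on the shape of the frame and the pandas version, so no numbers are claimed here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 500 rows, 5 float columns, with NaN scattered
# over every column except the first, so each row has at least one value.
rng = np.random.default_rng(0)
values = rng.random((500, 5))
sub = values[:, 1:]                      # view on columns 1..4
sub[rng.random(sub.shape) < 0.2] = np.nan
big = pd.DataFrame(values, columns=list("abcde"))


def via_transpose(frame):
    # Transpose so rows become columns, fill per column, transpose back.
    return frame.T.fillna(frame.max(axis=1)).T


def via_apply(frame):
    # Row-wise apply; Series.max skips NaN by default.
    return frame.apply(lambda row: row.fillna(row.max()), axis=1)


t_transpose = timeit.timeit(lambda: via_transpose(big), number=5)
t_apply = timeit.timeit(lambda: via_apply(big), number=5)
print(f"transpose + fillna: {t_transpose:.4f} s")
print(f"row-wise apply:     {t_apply:.4f} s")
```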
Note that
df.apply(lambda row: row.fillna(max(row)), axis=1)
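does not work in every case, as explained here. Python's built-in max does not skip NaN, so its result depends on where the NaN sits in the row, whereas row.max() ignores NaN by default.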
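The difference between the built-in max and Series.max can be seen on single rows. The following is a small illustration rather than part of the original answer; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Row with NaN in the first position, like the last row of the example frame.
nan_first = pd.Series([np.nan, 10.0, 34.0], index=['a', 'b', 'c'])
# Row with NaN in the last position, like the first row of the example frame.
nan_last = pd.Series([1.0, 3.0, np.nan], index=['a', 'b', 'c'])

# Built-in max keeps its first candidate unless a later item compares greater.
# Every comparison against NaN is False, so a leading NaN is never replaced.
builtin_nan_first = max(nan_first)   # NaN
builtin_nan_last = max(nan_last)     # 3.0

# Series.max skips NaN by default (skipna=True).
pandas_nan_first = nan_first.max()   # 34.0

print(builtin_nan_first, builtin_nan_last, pandas_nan_first)
```

With a NaN in the first position, max(row) itself returns NaN, so fillna has nothing useful to fill with and the row keeps its missing value.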