How to set values in a pandas DataFrame based on NaN values of another column?
Problem description
I have a DataFrame named df with original shape (4361, 15). Some of the values in the agefm column are NaN. Take a look:
> df[df.agefm.isnull() == True].agefm.shape
(2282,)
Then I create a new column and set all its values to 0:
df['nevermarr'] = 0
Now I would like to set nevermarr to 1 in every row where agefm is NaN:
df[df.agefm.isnull() == True].nevermarr = 1
Nothing changed:
> df['nevermarr'].sum()
0
What am I doing wrong?
Recommended answer
The best approach is to use numpy.where:
df['nevermarr'] = np.where(df.agefm.isnull(), 1, 0)
print (df)
   agefm  nevermarr
0    NaN          1
1    5.0          0
2    6.0          0
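As a runnable version of the numpy.where approach, here is a minimal sketch. The three-row frame is hypothetical sample data standing in for the real (4361, 15) frame; only the column names agefm and nevermarr come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real frame has 4361 rows.
df = pd.DataFrame({'agefm': [np.nan, 5, 6]})

# np.where(condition, value_if_true, value_if_false) builds the whole
# column in one step, so no prior "df['nevermarr'] = 0" is needed.
df['nevermarr'] = np.where(df['agefm'].isnull(), 1, 0)

print(df['nevermarr'].tolist())
```

Because the whole column is rebuilt, any values that nevermarr already held are overwritten.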
Or use loc; the == True comparison can be omitted:
df.loc[df.agefm.isnull(), 'nevermarr'] = 1
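The likely reason the attempt in the question had no effect is that df[mask] returns a new object, so the assignment lands on a temporary copy that is thrown away. The sketch below contrasts the two forms on hypothetical sample data; the variable names chained_sum and loc_sum are introduced here for illustration only.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real frame.
df = pd.DataFrame({'agefm': [np.nan, 5, 6]})
df['nevermarr'] = 0

# Chained form from the question: boolean indexing yields a copy,
# so the assignment changes that copy and df stays untouched.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # pandas may warn about this pattern
    df[df['agefm'].isnull()].nevermarr = 1
chained_sum = int(df['nevermarr'].sum())

# Single .loc call: row selection and column assignment both act on df.
df.loc[df['agefm'].isnull(), 'nevermarr'] = 1
loc_sum = int(df['nevermarr'].sum())

print(chained_sum, loc_sum)
```

Selecting rows and the target column in one .loc call avoids the intermediate copy entirely.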
Or mask:
df['nevermarr'] = df.nevermarr.mask(df.agefm.isnull(), 1)
print (df)
   agefm  nevermarr
0    NaN          1
1    5.0          2
2    6.0          3
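The mask output above keeps 2 and 3 because Series.mask only replaces values where the condition is True. A small sketch of that difference from numpy.where, assuming a sample frame whose nevermarr column already holds 7, 2, 3 (the names masked and rebuilt are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with pre-existing values in nevermarr.
df = pd.DataFrame({'agefm': [np.nan, 5, 6],
                   'nevermarr': [7, 2, 3]})

# Series.mask replaces values where the condition is True
# and keeps the existing value everywhere else.
masked = df['nevermarr'].mask(df['agefm'].isnull(), 1)

# np.where with a constant fallback overwrites every row instead.
rebuilt = np.where(df['agefm'].isnull(), 1, 0)

print(masked.tolist())
print(rebuilt.tolist())
```

If the column starts as all zeros, as in the question, both forms should give the same result.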
Sample:
import pandas as pd
import numpy as np
df = pd.DataFrame({'agefm': [np.nan, 5, 6],
                   'nevermarr': [7, 2, 3]})
print (df)
   agefm  nevermarr
0    NaN          7
1    5.0          2
2    6.0          3
df.loc[df.agefm.isnull(), 'nevermarr'] = 1
print (df)
   agefm  nevermarr
0    NaN          1
1    5.0          2
2    6.0          3