如何在 pandas 数据框中将单元格设置为NaN [英] How to set a cell to NaN in a pandas dataframe
本文介绍了如何在 pandas 数据框中将单元格设置为NaN的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我想用NaN替换数据框中某列中的错误值.
I'd like to replace bad values in a column of a dataframe by NaN's.
mydata = {'x' : [10, 50, 18, 32, 47, 20], 'y' : ['12', '11', 'N/A', '13', '15', 'N/A']}
df = pd.DataFrame(mydata)
df[df.y == 'N/A']['y'] = np.nan
但是,由于最后一行在df副本上起作用,因此最后一行失败并引发警告.那么,处理此问题的正确方法是什么?我已经见过许多使用iloc或ix的解决方案,但是在这里,我需要使用布尔条件.
Though, the last line fails and throws a warning because it's working on a copy of df. So, what's the correct way to handle this? I've seen many solutions with iloc or ix but here, I need to use a boolean condition.
推荐答案
只需使用replace
:
In [106]:
df.replace('N/A',np.NaN)
Out[106]:
x y
0 10 12
1 50 11
2 18 NaN
3 32 13
4 47 15
5 20 NaN
What you're trying is called chain indexing: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
您可以使用loc
来确保对原始dF进行操作:
You can use loc
to ensure you operate on the original dF:
In [108]:
df.loc[df['y'] == 'N/A','y'] = np.nan
df
Out[108]:
x y
0 10 12
1 50 11
2 18 NaN
3 32 13
4 47 15
5 20 NaN
这篇关于如何在 pandas 数据框中将单元格设置为NaN的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文