pandas replace NaN to None exhibits counterintuitive behaviour
Question
Given a series:
import numpy as np
import pandas as pd
s = pd.Series([1.1, 1.2, np.nan])
s
0    1.1
1    1.2
2    NaN
dtype: float64
If the need arises to convert the NaNs to None (for example, to work with Parquet files), then I would like to have:
0 1.1
1 1.2
2 None
dtype: object
I would assume Series.replace would be the obvious way of doing this, but here's what the function returns:
s.replace(np.nan, None)
0 1.1
1 1.2
2 1.2
dtype: float64
The NaN was forward filled instead of being replaced. Going through the docs, I see that if the second argument is None, then the first argument should be a dictionary. Based on this, I would expect replace to either replace as intended or throw an exception.
I believe the workaround here is:
pd.Series([x if pd.notna(x) else None for x in s], dtype=object)
0 1.1
1 1.2
2 None
dtype: object
Which is fine. But I would like to understand why this behaviour occurs, whether it is documented, or if it is just a bug and I have to dust off my git profile and log one on the issue tracker... any ideas?
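One caveat with the workaround above: building a new Series from a plain list discards the original index and name. A minimal sketch of a helper that keeps them could look like the following. The name nan_to_none and the sample index labels are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd


def nan_to_none(s: pd.Series) -> pd.Series:
    """Return an object-dtype copy of `s` with missing values replaced by None.

    `nan_to_none` is a hypothetical helper name, not part of pandas.
    """
    # Same list comprehension as the workaround: keep valid values, map NaN to None.
    values = [x if pd.notna(x) else None for x in s]
    # dtype=object stops pandas from coercing None back to NaN;
    # index and name are carried over from the input.
    return pd.Series(values, index=s.index, name=s.name, dtype=object)


s = pd.Series([1.1, 1.2, np.nan], index=["a", "b", "c"], name="x")
out = nan_to_none(s)
print(out)
```

The explicit dtype=object matters here: without it, pandas would infer float64 from the list and turn the None back into NaN.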
Recommended answer
This behaviour is described in the documentation of the method parameter:
method : {‘pad’, ‘ffill’, ‘bfill’, None}
The method to use when for replacement, when to_replace is a scalar, list or tuple and value is None.
So in your example to_replace is a scalar, and value is None. The method by default is pad, from the documentation of fillna:
pad / ffill: propagate last valid observation forward to next valid
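Since the question notes that the docs expect a dictionary as the first argument when no value is given, one way to sidestep the method-based fill is to pass the replacement as a mapping. The sketch below assumes a reasonably recent pandas release; the exact behaviour of replace with None has changed between versions, so it is worth checking the result on your own installation.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.1, 1.2, np.nan])

# Dictionary form: the replacement value travels inside the mapping,
# so the `value` argument is not passed as None and no pad/ffill is applied.
# Assumption: behaviour as observed on recent pandas releases.
result = s.replace({np.nan: None})

print(result)
print(result.dtype)
```

If the result still contains NaN on your version, the list-comprehension workaround from the question remains a version-independent fallback.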