Pandas/Numpy NaN None comparison
In Python Pandas and Numpy, why is the comparison result different?
from pandas import Series
from numpy import NaN
NaN is not equal to NaN:
>>> NaN == NaN
False
but NaN inside a list or tuple is:
>>> [NaN] == [NaN], (NaN,) == (NaN,)
(True, True)
while Series containing NaN are again not equal:
>>> Series([NaN]) == Series([NaN])
0    False
dtype: bool
And with None:
>>> None == None, [None] == [None]
(True, True)
while:
>>> Series([None]) == Series([None])
0    False
dtype: bool
This answer explains why NaN == NaN is False in general, but does not explain its behaviour in Python/pandas collections.
As explained here, and here, and in the Python docs, when checking sequence equality
element identity is compared first, and element comparison is performed only for distinct elements.
Because np.nan and np.NaN refer to the same object, i.e. (np.nan is np.nan is np.NaN) == True, the equality [np.nan] == [np.nan] holds. On the other hand, float('nan') creates a new object on every call, so [float('nan')] == [float('nan')] is False.
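The identity-first rule can be checked with plain Python, without NumPy. The snippet below is a minimal sketch; `items_equal` is a hypothetical helper that only approximates the per-element check built-in containers perform, not the interpreter's actual implementation.

```python
# One NaN object reused in both lists: the identity check (x is y)
# succeeds, so == is never consulted for that element.
nan = float("nan")
same_object = [nan] == [nan]

# Two separate NaN objects: identity fails, so == is used, and NaN != NaN.
different_objects = [float("nan")] == [float("nan")]


def items_equal(x, y):
    """Hypothetical helper: roughly how a container compares one pair of items."""
    return x is y or x == y


print(same_object, different_objects)                    # True False
print(items_equal(nan, nan))                             # True
print(items_equal(float("nan"), float("nan")))           # False
print(nan in [nan], float("nan") in [nan])               # True False
```

The same shortcut applies to membership tests, which is why `nan in [nan]` is True while a freshly created NaN is not found in the list.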
Pandas/Numpy do not take this identity shortcut, so NaN compares as unequal elementwise:
>>> pd.Series([np.NaN]).eq(pd.Series([np.NaN]))[0], (pd.Series([np.NaN]) == pd.Series([np.NaN]))[0]
(False, False)
However, the special equals method treats NaNs in the same location as equal:
>>> pd.Series([np.NaN]).equals(pd.Series([np.NaN]))
True
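Since equals returns a single bool for the whole Series, an elementwise result that still treats matching NaNs as equal has to be built by hand. The following is one possible sketch; `nan_aware_eq` is a hypothetical helper name, not a pandas function, and it assumes both Series share the same index.

```python
import numpy as np
import pandas as pd


def nan_aware_eq(left, right):
    """Hypothetical helper: elementwise ==, except that values missing
    on both sides at the same position count as equal."""
    return (left == right) | (left.isna() & right.isna())


a = pd.Series([1.0, np.nan, 3.0])
b = pd.Series([1.0, np.nan, 4.0])

elementwise = (a == b).tolist()          # NaN position compares as False
nan_aware = nan_aware_eq(a, b).tolist()  # NaN position counts as a match
whole = a.equals(b)                      # one bool for the entire Series

print(elementwise)  # [True, False, False]
print(nan_aware)    # [True, True, False]
print(whole)        # False, because 3.0 != 4.0
```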
None is treated differently. numpy considers them equal:
>>> pd.Series([None, None]).values == (pd.Series([None, None])).values
array([ True,  True])
while pandas does not:
>>> pd.Series([None, None]) == (pd.Series([None, None]))
0    False
1    False
dtype: bool
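Because the result of comparing None values depends on which layer performs the comparison, asking pandas directly whether an entry is missing avoids the ambiguity. This is a small sketch using isna, under the assumption that the goal is to detect positions where both Series are missing; the variable names are illustrative only.

```python
import pandas as pd

left = pd.Series([None, None])
right = pd.Series([None, None])

# NumPy compares the underlying Python objects, and None == None is True.
numpy_result = (left.values == right.values).tolist()

# isna() reports missing entries explicitly instead of relying on ==.
both_missing = (left.isna() & right.isna()).tolist()

print(numpy_result)  # [True, True]
print(both_missing)  # [True, True]
```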
There is also an inconsistency between the == operator and the eq method, which is discussed here:
>>> pd.Series([None, None]).eq(pd.Series([None, None]))
0    True
1    True
dtype: bool
Tested on pandas: 0.23.4 numpy: 1.15.0