Cannot convert NaN to int (but there are no NaNs)
Problem description
I have a dataframe with a column of floats that I want to convert to int:
> df['VEHICLE_ID'].head()
0 8659366.0
1 8659368.0
2 8652175.0
3 8652174.0
4 8651488.0
In theory I should just be able to use:
> df['VEHICLE_ID'] = df['VEHICLE_ID'].astype(int)
But I get:
Output: ValueError: Cannot convert NA to integer
But I am pretty sure that there are no NaNs in this series:
> df['VEHICLE_ID'].fillna(999,inplace=True)
> df[df['VEHICLE_ID'] == 999]
> Output: Empty DataFrame
Columns: [VEHICLE_ID]
Index: []
What is going on?
Recommended answer
Basically, the error is telling you that you have NaN values, and I will show why your attempts didn't reveal this:
In [7]:
# set up some data
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [1.0, np.nan, 3.0, 4.0]})
df
Out[7]:
a
0 1.0
1 NaN
2 3.0
3 4.0
Now try to cast:
df['a'].astype(int)
This raises:
ValueError: Cannot convert NA to integer
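To reproduce the failure in a script rather than a notebook, the cast can be wrapped in a try/except. This is a minimal sketch, assuming the same example frame as above; the exact exception class and message wording vary between pandas versions, but catching ValueError should cover them. The `raised` flag is a name introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Same example data as in the answer
df = pd.DataFrame({'a': [1.0, np.nan, 3.0, 4.0]})

raised = False  # illustrative flag, not part of the original answer
try:
    df['a'].astype(int)
except ValueError as exc:
    # Newer pandas raises a ValueError subclass with a different message
    raised = True
    print(type(exc).__name__, exc)
```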
But then you tried something like this:
In [5]:
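# Series.iteritems() was removed in pandas 2.0; items() is the replacement
for index, row in df['a'].items():
    if row == np.nan:  # never True: NaN does not compare equal to anything
        print('index:', index, 'isnull')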
This printed nothing, because NaN cannot be tested with equality like this. In fact, it has the special property that it returns False when compared against itself:
In [6]:
for index, row in df['a'].items():
    if row != row:  # True only for NaN
        print('index:', index, 'isnull')
index: 1 isnull
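The self-inequality is a property of IEEE 754 floating point values in general, not something specific to pandas. A quick check in plain Python, with variable names chosen only for illustration:

```python
import math

nan = float('nan')

equal_to_itself = (nan == nan)      # equality against itself fails
differs_from_itself = (nan != nan)  # inequality against itself succeeds
detected = math.isnan(nan)          # the explicit test is the reliable one

print(equal_to_itself, differs_from_itself, detected)
```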
Now it prints the row. You should use isnull for readability:
In [9]:
for index, row in df['a'].items():
    if pd.isnull(row):
        print('index:', index, 'isnull')
index: 1 isnull
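The same check can be done without a Python loop. A minimal sketch, assuming the example frame from above; `Series.isna()` returns a boolean mask, and the variable names here are illustrative:

```python
import numpy as np
import pandas as pd

# Same example data as in the answer
df = pd.DataFrame({'a': [1.0, np.nan, 3.0, 4.0]})

# Boolean mask that is True where the value is missing
mask = df['a'].isna()

nan_count = int(mask.sum())          # how many values are missing
nan_index = df.index[mask].tolist()  # index labels of the missing values

print('NaN count:', nan_count)
print('NaN index labels:', nan_index)
```

Running this kind of count on the real column before calling astype(int) would show directly whether any NaNs are present.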
So what to do? We can drop the rows with df.dropna(subset=['a']), or we can replace the values using fillna:
In [8]:
df['a'].fillna(0).astype(int)
Out[8]:
0 1
1 0
2 3
3 4
Name: a, dtype: int32
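For comparison, here is a sketch of the drop-rows route mentioned above, plus one alternative that is not part of the original answer: pandas' nullable integer dtype 'Int64' (capital I), which keeps the missing entry instead of replacing it. This assumes a pandas version recent enough to support nullable integers.

```python
import numpy as np
import pandas as pd

# Same example data as in the answer
df = pd.DataFrame({'a': [1.0, np.nan, 3.0, 4.0]})

# Option 1: drop the rows with a missing value, then cast to int
dropped = df.dropna(subset=['a'])['a'].astype(int)

# Option 2 (an addition, not from the original answer): nullable integers
# keep the missing entry as <NA> rather than filling it with a placeholder
nullable = df['a'].astype('Int64')

print(dropped.tolist())
print(nullable)
```

Which option fits depends on whether a row with a missing ID is still meaningful: dropping loses the row, filling invents a value, and the nullable dtype keeps the gap explicit.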