Error using astype when NaN exists in a DataFrame
Question
df
A B
0 a=10 b=20.10
1 a=20 NaN
2 NaN b=30.10
3 a=40 b=40.10
I tried:
df['A'] = df['A'].str.extract('(\d+)').astype(int)
df['B'] = df['B'].str.extract('(\d+)').astype(float)
But I get the following error:
ValueError: cannot convert float NaN to integer
and:
AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas
How can I fix this?
Recommended answer
If some values in a column are missing (NaN) and the column is converted to numeric, the resulting dtype is always float. You cannot convert such values to int, only to float, because the type of NaN is float.
print (type(np.nan))
<class 'float'>
See the docs for how values are converted when at least one NaN is present:
integer > cast to float64
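The upcast described above can be reproduced with a small sketch. The Series names and values here are made up for illustration and are not from the original question:

```python
import numpy as np
import pandas as pd

# A Series holding only integers gets an integer dtype.
s_int = pd.Series([10, 20, 40])

# A single NaN upcasts the whole Series to float64.
s_nan = pd.Series([10, 20, np.nan, 40])

# Casting back to int fails while the NaN is still present.
try:
    s_nan.astype(int)
    cast_error = None
except ValueError as exc:
    cast_error = exc

print(s_int.dtype.kind, s_nan.dtype, type(cast_error).__name__)
```

The exact exception class and message may differ between pandas versions, but it should be a ValueError or a subclass of it.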
If you need int values, replace NaN with some int, e.g. 0, using fillna; the conversion then works:
df['A'] = df['A'].str.extract(r'(\d+)', expand=False)
df['B'] = df['B'].str.extract(r'(\d+)', expand=False)
print (df)
A B
0 10 20
1 20 NaN
2 NaN 30
3 40 40
df1 = df.fillna(0).astype(int)
print (df1)
A B
0 10 20
1 20 0
2 0 30
3 40 40
print (df1.dtypes)
A int32
B int32
dtype: object
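Note that the pattern (\d+) keeps only the digits before the decimal point, which is why column B shows 20, 30 and 40 above. Below is a self-contained sketch of one possible variant that keeps the decimals in B and fills only A with 0. The pattern (\d+\.\d+) and the intermediate cast to float are assumptions made for this illustration and are not part of the original answer:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "A": ["a=10", "a=20", np.nan, "a=40"],
    "B": ["b=20.10", np.nan, "b=30.10", "b=40.10"],
})

# expand=False returns a Series instead of a one-column DataFrame.
a = df["A"].str.extract(r"(\d+)", expand=False)
# Assumed pattern: also capture the fractional part for column B.
b = df["B"].str.extract(r"(\d+\.\d+)", expand=False)

out = pd.DataFrame({
    # Cast to float first (NaN is allowed there), fill the gaps, then cast to int.
    "A": a.astype(float).fillna(0).astype(int),
    # Float columns can keep NaN, so no fill value is needed.
    "B": b.astype(float),
})

print(out)
print(out.dtypes)
```

Whether 0 is an acceptable stand-in for a missing value depends on the data; if 0 can also be a real value, leaving the column as float may be the safer choice.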