Adding row to pandas DataFrame changes dtype
Problem description
The problem I have is that adding a row to a DataFrame changes the dtype of its columns:
>>> from pandas import DataFrame
>>> df = DataFrame({'a' : range(10)}, dtype='i4')
>>> df
a
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
[10 rows x 1 columns]
I explicitly specified the dtype to be int32 (i.e., 'i4'), as can be seen:
>>> df.dtypes
a int32
dtype: object
However, adding a row changes the dtype to float64:
>>> df.loc[10] = 99
>>> df
a
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 99
[11 rows x 1 columns]
>>> df.dtypes
a float64
dtype: object
I've tried specifying the dtype of the value that I add:
>>> import numpy as np
>>> df = DataFrame({'a' : np.arange(10, dtype=np.int32)})
>>> df.dtypes
a int32
dtype: object
>>> df.loc[10] = np.int32(0)
>>> df.dtypes
a float64
dtype: object
But that does not work either. Is there any solution that does not rely on functions that return new objects?
Recommended answer
Enlargement is done in two stages: a NaN is placed in that column first, and then the value is assigned, which is why the column is coerced to float64. I'll put it on the bug/enhancement list. It's a bit non-trivial.
Here's a workaround, using append.
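The two-stage coercion described above can be imitated by hand. This is only a sketch of the observable effect, assuming `reindex` is a fair stand-in for the first stage; it is not the actual pandas internals. The names `enlarged` and `restored` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(10, dtype=np.int32)})

# Stage 1 (imitated): extend the index. The new row has no value yet,
# so it holds NaN, and an int32 column cannot represent NaN.
enlarged = df.reindex(list(df.index) + [10])
stage1_dtype = enlarged["a"].dtype  # float64

# Stage 2: assign the real value. The column is already float64 by now.
enlarged.loc[10, "a"] = 99
stage2_dtype = enlarged["a"].dtype  # still float64

# Casting back restores int32, but note that astype returns a new object.
restored = enlarged.astype({"a": "int32"})
```

Once the NaN has been introduced, the integer dtype is lost, so the dtype of the assigned value (for example `np.int32(0)`) no longer matters.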
In [14]: df.append(Series(99,[10],dtype='i4').to_frame('a'))
Out[14]:
a
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 99
[11 rows x 1 columns]
In [15]: df.append(Series(99,[10],dtype='i4').to_frame('a')).dtypes
Out[15]:
a int32
dtype: object
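Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0. On those versions the same workaround can be written with `pd.concat`. The following is a minimal sketch under that assumption; `new_row` and `result` are illustrative names, and like `append` it returns a new object rather than modifying `df` in place.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(10, dtype=np.int32)})

# Build the new row as a one-row frame with the same dtype and the desired index label.
new_row = pd.Series(99, index=[10], dtype="int32").to_frame("a")

# Concatenating two int32 columns needs no NaN placeholder, so int32 is kept.
result = pd.concat([df, new_row])
print(result.dtypes)
```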
An issue for the bug/enhancement to do this automagically: https://github.com/pydata/pandas/issues/6485