Pandas: unable to change column data type

Question

I was following the advice here to change the column data type of a pandas DataFrame. However, it does not seem to work if I reference the columns by index number instead of by column name. Is there a way to do this correctly?

In [49]: df.iloc[:, 4:].astype(int)
Out[49]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5074 entries, 0 to 5073
Data columns (total 3 columns):
5    5074  non-null values
6    5074  non-null values
7    5074  non-null values
dtypes: int64(3) 

In [50]: df.iloc[:, 4:] = df.iloc[:, 4:].astype(int)

In [51]: df
Out[51]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5074 entries, 0 to 5073
Data columns (total 7 columns):
1    5074  non-null values
2    5074  non-null values
3    5074  non-null values
4    5074  non-null values
5    5074  non-null values
6    5074  non-null values
7    5074  non-null values
dtypes: object(7) 

In [52]: 

Recommended answer

Do this

In [49]: df = DataFrame([['1','2','3','.4',5,6.,'foo']],columns=list('ABCDEFG'))

In [50]: df
Out[50]: 
   A  B  C   D  E  F    G
0  1  2  3  .4  5  6  foo

In [51]: df.dtypes
Out[51]: 
A     object
B     object
C     object
D     object
E      int64
F    float64
G     object
dtype: object

You need to assign the columns one by one

In [52]: for k, v in df.iloc[:,0:4].convert_objects(convert_numeric=True).iteritems():
   ....:     df[k] = v
   ....:     

In [53]: df.dtypes
Out[53]: 
A      int64
B      int64
C      int64
D    float64
E      int64
F    float64
G     object
dtype: object
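The transcript above comes from an old pandas release. As far as I know, both `convert_objects` and `iteritems` have since been removed from pandas, so here is a minimal sketch of the same column-by-column assignment using `pd.to_numeric`. It assumes the same sample frame as above and is an adaptation, not the answerer's original code.

```python
import pandas as pd

# Same sample frame as in the answer above.
df = pd.DataFrame([['1', '2', '3', '.4', 5, 6., 'foo']], columns=list('ABCDEFG'))

# Pick the first four columns by position, then assign each one back by label.
# Assigning a whole column by label replaces it, so the dtype is free to change.
for name in df.columns[0:4]:
    df[name] = pd.to_numeric(df[name])

print(df.dtypes)
```

The positions are only used to look up the labels; the assignment itself goes through `df[name]`, which is the part that lets the dtype change.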

convert_objects usually does the right thing, so the easiest approach is this

In [54]: df = DataFrame([['1','2','3','.4',5,6.,'foo']],columns=list('ABCDEFG'))

In [55]: df.convert_objects(convert_numeric=True).dtypes
Out[55]: 
A      int64
B      int64
C      int64
D    float64
E      int64
F    float64
G     object
dtype: object
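For pandas versions that no longer ship `convert_objects`, a rough whole-frame equivalent can be sketched with `pd.to_numeric`. The helper name `to_numeric_where_possible` is hypothetical, and the choice to leave a column untouched when it does not parse is an assumption made to match the output shown above (column G stays non-numeric).

```python
import pandas as pd


def to_numeric_where_possible(frame):
    """Return a copy in which every column that parses cleanly is numeric."""
    out = frame.copy()
    for name in out.columns:
        try:
            out[name] = pd.to_numeric(out[name])
        except (ValueError, TypeError):
            # The column holds values such as 'foo' that are not numbers;
            # leave it exactly as it was.
            pass
    return out


df = pd.DataFrame([['1', '2', '3', '.4', 5, 6., 'foo']], columns=list('ABCDEFG'))
converted = to_numeric_where_possible(df)
print(converted.dtypes)
```

The helper works on a copy, so the original frame keeps its dtypes and the caller decides whether to rebind `df`.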

Assigning via df.iloc[:,4:] with a Series on the right-hand side copies the data, changing the type as needed, so I think this should work in theory. I suspect it is hitting a very obscure bug that prevents the object dtype from changing to a real (meaning int/float) dtype. It should probably raise for now.

The issue is tracked here: https://github.com/pydata/pandas/issues/4312
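One way to sidestep positional assignment altogether is to use the positions only to look up column labels and then cast through `DataFrame.astype` with a label-to-dtype mapping. The sketch below uses a made-up stand-in for the question's frame (seven object columns labelled 1 to 7); the data is an assumption, since the original frame is not shown.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: seven columns labelled 1..7,
# all holding strings, with the last three holding digits.
df = pd.DataFrame([['a', 'b', 'c', 'd', '1', '2', '3']] * 3, columns=range(1, 8))

# Translate the positions (4 onward) into column labels.
target = df.columns[4:]

# Cast only those columns and rebind df to the converted result.
df = df.astype({name: int for name in target})

print(df.dtypes)
```

Because `astype` returns a new frame, nothing is written into the existing object-dtype columns, which is the step the question's `df.iloc[:, 4:] = ...` assignment got stuck on.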
