pandas dtype conversion from object to string
Question
I have a CSV file in which a few columns are numbers and a few are strings. When I check myDF.dtypes, it shows all the string columns as object.
Someone asked a related question before about why this is done. Is it possible to recast the dtype from object to string?
Also, in general, is there an easy way to recast the dtype from int64 and float64 to int32 and float32, and so save on the size of the data (in memory / on disk)?
Recommended answer
All strings are represented as variable-length values (which is what the object dtype holds). You can do series.astype('S32') if you want, but it will be recast if you then store it in a DataFrame or do much with it. This is for simplicity.
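As a rough illustration of the difference between variable-length and fixed-length strings, the sketch below contrasts an object column with a plain NumPy fixed-width array. The sample values are hypothetical, and the width 'S8' is chosen (instead of the 'S32' above) only to make truncation visible. The cast to object at the start is an assumption added so that the example behaves the same regardless of which string dtype a given pandas version infers by default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for a string column read from the CSV.
names = pd.Series(["alice", "bob", "a much longer name"]).astype(object)

print(names.dtype)  # object

# Each element is an ordinary Python str, so the lengths can differ freely.
element_types = {type(v) for v in names}
lengths = [len(v) for v in names]

# A NumPy 'S8' array reserves exactly 8 bytes per element and
# silently truncates anything longer than that.
fixed = np.array(names.tolist(), dtype="S8")
print(fixed.itemsize)  # 8
print(fixed[2])        # b'a much l'
```

The object column keeps every value intact at the cost of storing Python objects, while the fixed-width array trades flexibility for a predictable per-element size.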
Certain serialization formats, e.g. HDFStore, do store strings as fixed-length strings on disk, though.
You can call series.astype('int32') if you would like, and it will be stored as the new type.
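A minimal sketch of the numeric downcast, assuming a small made-up frame (the column names count and value are hypothetical). It uses DataFrame.astype with a per-column mapping and DataFrame.memory_usage to compare the footprint before and after.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one int64 column and one float64 column, 1000 rows each.
df = pd.DataFrame({
    "count": np.arange(1000, dtype="int64"),
    "value": np.linspace(0.0, 1.0, 1000),  # float64 by default
})

# Bytes used by the column data alone (index excluded).
before = df.memory_usage(index=False).sum()

# Recast each column to its 32-bit counterpart.
small = df.astype({"count": "int32", "value": "float32"})
after = small.memory_usage(index=False).sum()

print(before, after)  # 16000 8000
```

Halving the width halves the storage for these columns, but int32 cannot hold values beyond roughly ±2.1 billion and float32 keeps only about 7 significant decimal digits, so it is worth checking the value range before downcasting.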