pandas dtype conversion from object to string

Question

I have a csv file in which a few columns are numbers and a few are strings. When I check myDF.dtypes, all the string columns are shown as object.

  1. A related question was asked before about why this is done. Is it possible to recast the dtype from object to string?

  2. Also, in general, is there an easy way to recast the dtype from int64 and float64 to int32 and float32 and save on the size of the data (in memory / on disk)?

Recommended answer

All strings are represented as variable-length values, which is what the object dtype holds. You can call series.astype('S32') if you want a fixed-length type, but the result will be recast to object if you then store it in a DataFrame or do much with it. This is for simplicity.
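As a minimal sketch of the point above, the snippet below builds a small Series of strings with an explicit object dtype and checks that each element is an ordinary variable-length Python str. The sample values are made up for illustration. The last step uses the dedicated "string" extension dtype, which is not part of the original answer; it is an assumption that your pandas version is 1.0 or later, where that dtype is available.

```python
import pandas as pd

# Hypothetical sample data; dtype=object mirrors what read_csv
# traditionally reports for text columns.
s = pd.Series(["apple", "kiwi", "fig"], dtype=object)

print(s.dtype)           # the column-level dtype
print(type(s.iloc[0]))   # each element is a plain Python str

# The strings are variable-length: nothing pads them to a common width.
lengths = s.map(len).tolist()
print(lengths)

# Assumption: pandas >= 1.0. "string" is an extension dtype added after
# the original answer was written.
s_str = s.astype("string")
print(s_str.dtype)
```

Whether the "string" dtype helps depends on the pandas version and the operations applied afterwards, so treat it as an option to test rather than a guaranteed improvement.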

Certain serialization formats, e.g. HDFStore, do store strings as fixed-length strings on disk, though.

You can call series.astype('int32') if you would like, and the data will be stored as the new type.
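A short sketch of the downcast described above, comparing memory use before and after. The column names and sample data are hypothetical. It assumes the values fit in the narrower types: int32 cannot hold values outside roughly ±2.1 billion, and float32 keeps only about 7 significant decimal digits.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the default 64-bit numeric dtypes.
df = pd.DataFrame({
    "count": np.arange(1000, dtype="int64"),
    "value": np.linspace(0.0, 1.0, 1000, dtype="float64"),
})

# Bytes used by the column data only (index excluded).
before = df.memory_usage(index=False).sum()

# astype accepts a dict mapping column names to target dtypes
# and returns a new DataFrame; df itself is unchanged.
small = df.astype({"count": "int32", "value": "float32"})
after = small.memory_usage(index=False).sum()

print(small.dtypes)
print(before, after)
```

Halving the element width halves the in-memory size of these columns. The saving on disk depends on the file format: a text format such as csv does not record dtypes, so the downcast mainly matters for binary formats.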
