pandas string data types

Question

I want to specify data types for pandas read_csv. Here is a quick example: the first call works, but the second, which adds a dtype argument, does not. Why doesn't the second one work?

import io
import pandas as pd

csv = """foo,1234567,a,1 
foo,2345678,b,3 
bar,3456789,b,5 
"""

df = pd.read_csv(io.StringIO(csv),
        names=["fb", "num", "loc", "x"])

print(df)

df = pd.read_csv(io.StringIO(csv),
        names=["fb", "num", "loc", "x"], 
        dtype=["|S3", "np.int64", "|S1", "np.int8"])

print(df)

I've updated the question to make it much simpler and, hopefully, clearer, following BrenBarn's suggestion. My real dataset is much larger, but I'd like to use this method to specify types for all my data on import.

Recommended answer

As Jeff indicated, my syntax was bad. The names and types have to be zipped into a dict that maps each column name to its type, rather than passed as a list. The code below works, but note that you can't specify a string width as a dtype (such as "|S3"); string columns can only be declared as object.

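import io

import numpy as np  # needed for np.int64 and np.int8 below
import pandas as pd

csv = """foo,1234567,a,1
foo,2345678,b,3
bar,3456789,b,5
"""

# dtype must be a dict mapping column name -> type, not a list.
# String columns are declared as object; fixed widths are not accepted.
df = pd.read_csv(io.StringIO(csv),
        names=["fb", "num", "ab", "x"],
        dtype={"fb": object, "num": np.int64, "ab": object, "x": np.int8})
print(df)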
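
Since the question mentions generating types for a much larger dataset, here is a minimal sketch of the "zipped" approach described above. It assumes the column names and types are kept in two parallel lists; the variable names names, types and dtype_map are illustrative and do not come from the original post.

```python
import io

import numpy as np
import pandas as pd

csv = """foo,1234567,a,1
foo,2345678,b,3
bar,3456789,b,5
"""

# Parallel lists: one type per column name (hypothetical names for illustration).
names = ["fb", "num", "ab", "x"]
types = [object, np.int64, object, np.int8]

# Pair each column name with its type to build the mapping read_csv expects.
dtype_map = dict(zip(names, types))

df = pd.read_csv(io.StringIO(csv), names=names, dtype=dtype_map)
print(df.dtypes)
```

Keeping the names and types in lists makes it easy to build them programmatically for a wide file, while read_csv still receives the dict form it needs.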
