Pandas DataFrame export to_csv change dtype of columns

Problem description

Hopefully a simple request.

I'm finding that when I build a DataFrame, set the column datatypes, and then export it to CSV, numerical strings end up converted to integers.

For example, a value might be "0000" and the CSV ends up with the value 0. But I need it to retain the number of characters in the string and save the value in the CSV as "0000".

Anyone know of a way to retain the string rather than the converted datatype?

Setting the datatype after import doesn't solve the issue (before anyone tells me I can set it on/after import): when converting the integer back to a string, you also have to restore the leading 0s on every import, which is not optimal.

Hoping I'm overlooking something simple.

(EDIT) Oh, and my export line is just a simple export, so it may be that I'm just not aware of the argument that needs to be provided.

df.to_csv("Test.csv", index=False)

Recommended answer

Assuming that df['your_column'] is the column you want to preserve, you can use the dtype argument in read_csv():

df = pd.read_csv('temp.csv', dtype={'your_column': str})

If that's not working, are you sure your columns contain strings to begin with? Because here's the behavior I see:

>>> import pandas as pd
>>> df1 = pd.DataFrame({'a': ['0000', '0000', '0100']})
>>> df1
      a
0  0000
1  0000
2  0100
>>> df1.to_csv('temp.csv', index=False)
>>> df2 = pd.read_csv('temp.csv', dtype={'a': str})
>>> df2
      a
0  0000
1  0000
2  0100
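
To see where the leading zeros actually get lost, it can help to look at the raw CSV text and compare a default read with a read that passes dtype. The following is a minimal sketch under the assumption that the column really holds strings; it writes to an in-memory io.StringIO buffer rather than temp.csv, and the variable names are only illustrative.

```python
import io

import pandas as pd

df1 = pd.DataFrame({"a": ["0000", "0000", "0100"]})

# Write to an in-memory buffer instead of a file so the raw CSV text can be inspected.
buffer = io.StringIO()
df1.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Default read: pandas infers an integer column, so the leading zeros are dropped.
inferred = pd.read_csv(io.StringIO(csv_text))

# Read with an explicit dtype: the values stay as strings.
preserved = pd.read_csv(io.StringIO(csv_text), dtype={"a": str})

print(inferred["a"].tolist())   # [0, 0, 100]
print(preserved["a"].tolist())  # ['0000', '0000', '0100']
```

If the printed CSV text already shows 0000, the export is fine and the conversion is happening in whatever reads the file afterwards.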

Maybe your problem isn't on export or import, but on creation.

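df = pd.DataFrame({'a': [0000, 0000, 0100]})

Unquoted values like these are integers, not strings, so the leading zeros cannot be stored: the intent here is a dataframe with the values 0, 0, 100. (Python 3 actually rejects a literal such as 0100 with a SyntaxError, and Python 2 reads it as octal.) If you want them to be strings, you need to create them as strings.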
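
As a rough illustration of the difference at creation time, the sketch below builds one column from integers and one from quoted strings. The last step, padding numeric data back to a fixed width with str.zfill, is an addition not taken from the answer, and the width of 4 is an assumption based on the "0000" example in the question.

```python
import pandas as pd

# Integers carry no leading zeros, so this column holds 0, 0, 100.
as_ints = pd.DataFrame({"a": [0, 0, 100]})

# Quoted values are stored as strings and keep every character.
as_strings = pd.DataFrame({"a": ["0000", "0000", "0100"]})

print(as_ints["a"].tolist())     # [0, 0, 100]
print(as_strings["a"].tolist())  # ['0000', '0000', '0100']

# If the data is already numeric, convert to strings and pad to a fixed
# width (4 is assumed here) before exporting.
padded = as_ints["a"].astype(str).str.zfill(4)
print(padded.tolist())  # ['0000', '0000', '0100']
```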
